# Charles L. Odoroff Memorial Lecture

In honor of Dr. Charles L. Odoroff, the founding Director of the Division (now Department) of Biostatistics at the University of Rochester, the Department hosts an annual lecture by a distinguished statistician. This lecture series is supported by funds contributed by family and friends of Dr. Odoroff after his untimely death in 1987 at age 49.

## 2023 Charles L. Odoroff Memorial Lecture

**Ed George, PhD**

Universal Furniture Professor Emeritus of Statistics and Data Science

The Wharton School of the University of Pennsylvania

**Bayesian Mortality Rate Estimation: Calibration and Standardization for Public Reporting**

Bayesian models are increasingly fit to large administrative data sets and then used to make individualized recommendations. In particular, Medicare’s Hospital Compare (MHC) webpage provides information to patients about specific hospital mortality rates for a heart attack or Acute Myocardial Infarction (AMI). MHC’s recommendations have been based on a random-effects logit model with a random hospital indicator and patient risk factors. Except for the largest hospitals, these recommendations or predictions are not individually checkable against data, because data from smaller hospitals are too limited. Before individualized Bayesian recommendations, people derived general advice from empirical studies of many hospitals, e.g., prefer hospitals of type 1 to type 2 because the observed mortality rate is lower at type 1 hospitals. Here we calibrate these Bayesian recommendation systems by checking, out of sample, whether their predictions aggregate to give correct general advice derived from another sample. This process of calibrating individualized predictions against general empirical advice leads to substantial revisions in the MHC model for AMI mortality, revisions that hierarchically incorporate information about hospital volume, nursing staff, medical residents, and the hospital’s ability to perform cardiovascular procedures. And for the ultimate purpose of meaningful public reporting, predicted mortality rates must then be standardized to adjust for patient-mix variation across hospitals. Such standardization can be accomplished with counterfactual mortality predictions for any patient at any hospital. It is seen that indirect standardization, as currently used by MHC, fails to adequately control for differences in patient risk factors and systematically underestimates mortality rates at the low volume hospitals. 
As a viable alternative, we propose a full population direct standardization which yields correctly calibrated mortality rates devoid of patient-mix variation. (This is joint work with Veronika Rockova, Paul Rosenbaum, Ville Satopaa and Jeffrey Silber.)

**Thursday, April 27, 2023 at 3:30 p.m.
Helen Wood Hall Room 1W-501**

## Previous Lectures

### 2022 Charles L. Odoroff Memorial Lecture

**Susan Murphy, PhD**

Mallinckrodt Professor of Statistics and of Computer Science,

Radcliffe Alumnae Professor at the Radcliffe Institute, Harvard University

**Assessing Personalization in Digital Health**

Reinforcement Learning provides an attractive suite of online learning methods for personalizing interventions in Digital Behavioral Health. However, after a reinforcement learning algorithm has been run in a clinical study, how do we assess whether personalization occurred? We might find users for whom it appears that the algorithm has indeed learned in which contexts the user is more responsive to a particular intervention. But could this have happened completely by chance? We discuss some first approaches to addressing these questions.

Thursday, May 19, 2022

### 2021 Charles L. Odoroff Memorial Lecture

Delayed due to COVID restrictions

### 2020 Charles L. Odoroff Memorial Lecture

Delayed due to COVID restrictions

### 2019 Charles L. Odoroff Memorial Lecture

**Sharon-Lise Normand, PhD**

S. James Adelstein Professor of Health Care Policy

Department of Health Care Policy, Harvard Medical School

Department of Biostatistics, Harvard T.H. Chan School of Public Health

*Some Inferential Tools for Health Policy & Outcomes Research*

Health policy and outcomes research involves assessing access, effectiveness, quality and value of health care delivered in routine circumstances of everyday practice. Identifying what works for whom and when in usual care settings often relies on observational data. This talk will describe some common statistical issues encountered in outcomes research including the analysis of multiple related outcomes, clustered data, and risk adjustment for causal inference. Statistical approaches to these problems will be discussed. Drug reformulation policies, medical device safety, and disparities in healthcare will be used to illustrate the issues addressed. *(Research presented was funded, in part, by R01-GM111339 from the National Institute of General Medical Sciences and R01-MH106682 from the National Institute of Mental Health.)*

Thursday, May 9, 2019

### 2018 Charles L. Odoroff Memorial Lecture

**Per Kragh Andersen, PhD**

Professor

Department of Biostatistics

University of Copenhagen

*Multi-State Models in Medical Research*

In longitudinal studies of patients, a number of disease states can often be identified. Thus, patients who have undergone bone marrow transplantation may experience graft-versus-host disease, experience a relapse, or die in remission; patients with affective disorders may be in psychiatric hospital, out of hospital, or have died; and patients with diabetes may experience diabetic nephropathy, die from cardiovascular causes, or die from other causes. A suitable mathematical framework for modeling data from such longitudinal studies is that of multi-state models. In such models, the basic parameters are the intensities of transition between the states from which other ('marginal') parameters of interest – such as state occupation probabilities, average time spent in a given state, and expected number of recurrent events – may, in principle, be derived. We will briefly review classical methods for analyzing transition intensities, including the Cox regression model and other hazard models. However, we will also discuss methods by which such marginal parameters may be directly targeted, i.e. without going via the intensities. In particular, we will discuss how marginal parameters may be analyzed using pseudo-observations. The methods will be illustrated via examples from hematology, psychiatry, and other medical fields.

Thursday, May 10, 2018

### 2017 Charles L. Odoroff Memorial Lecture

**Colin B. Begg, PhD**

Professor and Chairman

Department of Epidemiology & Biostatistics

Memorial Sloan Kettering Cancer Center

**Distinguishing Second Primary Cancers from Metastases: Statistical Challenges in Testing Clonal Relatedness of Tumors**

The pathological diagnosis of a new tumor in a patient with an existing or previous cancer can be challenging for pathologists in certain clinical settings. Modern techniques for molecular analysis of the two tumor specimens to detect common somatic changes offer the promise of more accurate classification of the second tumor as either a metastasis or a second independent primary. In the past few years our group has examined how to construct statistical tests for clonal relatedness in this setting using the somatic profiles of the tumors. This work has evolved as the genetic technologies available have evolved. The topic provides a snapshot of the challenges of trying to develop valid statistical techniques in a real world setting where there are many threats to the validity of the methods.

Thursday, April 27, 2017

### 2016 Charles L. Odoroff Memorial Lecture

**Roderick Little, PhD**

Richard D. Remington Distinguished University Professor

Department of Biostatistics

University of Michigan

**Treatment Discontinuation and Missing Data in Clinical Trials**

I briefly review recommendations of a recent National Research Council study on the treatment of missing data in clinical trials. I then discuss two aspects of the analysis of clinical trials when participants prematurely discontinue treatments. First, I formulate treatment discontinuation and “analysis dropout” as different missing data problems: analysis dropout concerns missing data on outcomes, and treatment discontinuation concerns missing data for a covariate defining principal strata for completion under alternative treatments. Second, the choice of estimand is an important factor in considering when follow-up measures after discontinuation are needed to obtain valid measures of treatment effects. Follow-up measures are recommended for the standard choice of intention to treat (ITT) estimand, the average effect of randomization to treatment, but this measure has the disadvantage that it includes the effects of any treatments taken after the assigned treatment is discontinued. I discuss alternative estimands for the ITT population, and argue that one in particular, an on-treatment summary of the effect of treatment prior to discontinuation, should receive more consideration. Ideas are motivated and illustrated by a reanalysis of a past study of inhaled insulin treatments for diabetes, sponsored by Eli Lilly.

Thursday, May 5, 2016

### 2015 Charles L. Odoroff Memorial Lecture

**Marie Davidian, PhD**

William Neal Reynolds Professor

Department of Statistics

North Carolina State University

*The Right Treatment for the Right Patient (at the Right Time): Personalized Medicine and Dynamic Treatment Regimes*

With the advent of the 'omics era, achieving the goal of "personalizing" treatment to the patient based on his/her genetic/genomic as well as physiological, demographic, and other characteristics and past history has become more promising than ever. One perspective on personalized medicine involves identifying subgroups of patients sharing certain characteristics who are likely to benefit from a specific treatment or to whom a new treatment may be targeted ("the right patient"). Another is based on formalizing how clinicians make treatment decisions in practice, where the goal is to identify the most beneficial treatment to administer to a patient from among the available options given his/her characteristics ("the right treatment"). In chronic diseases and disorders like cancer or substance abuse, a series of treatment decisions must be made, and the objective is to determine the "best" treatment option for a patient at each decision given all information accrued on the patient to that point, including responses to previous treatments, so as to lead to the most beneficial long term outcome. This sequential decision-making introduces many complications; for example, treatments chosen early on may affect how well treatments given later will work ("at the right time"). The development of optimal, evidence-based personalized treatment strategies (dynamic treatment regimes) can be formulated as a fascinating statistical problem. I will provide an overview of challenges involved and of study designs and methodological developments that can be used in the quest for personalized medicine.

Thursday, April 30, 2015

### 2014 Charles L. Odoroff Memorial Lecture

**Thomas A. Louis, PhD**

Professor, Department of Biostatistics

Johns Hopkins Bloomberg School of Public Health

*Research at Census, with Links to Biostatistics*

In order to meet the challenges of efficiently obtaining valid demographic, economic, and activity-based information, making it available to the public, and protecting confidentiality, research at the U.S. Census Bureau and other federal statistical agencies, indeed survey research more generally, burgeons. Many research goals and methods are similar to those addressed by and used in Biostatistics or Informatics. To set the scene, I briefly describe the Census Research & Methodology directorate, list major issues and approaches, then provide details on a small subset. Candidate topics include adaptive design (dynamic survey modes, R-factors in the National Survey of College Graduates, timing of mailing hard copy based on K-M curves, challenges of learning from experience), stopping rules, randomized experiments (the effect of interviewer training in the National Crime Victimization Survey), record matching, prediction (of response propensity, of occupancy, of the “fitness for use” of administrative records), imputation, Bayesian methods (design-consistent analyses, post-processed {confidence} intervals, benchmarking), small area/spatio-temporal analysis (estimation of poverty rates, estimating omissions in the master address file), development and use of paradata (in the National Health Interview Survey), double-robustness, dynamic data posting (“OnTheMap” Local Origin-Destination Employment Statistics), disclosure avoidance/limitation, Big Data (opportunities and challenges), micro-simulation (benefits of research in designing the 2020 Census), and IT infrastructure (the Multi-mode Operational Control System). I close with a call for increased collaboration among statistical agencies and academe, building on the NSF-Census Bureau Research Network. Visit (http://www.census.gov/research/) for some background.

Thursday, May 1, 2014

### 2013 Charles L. Odoroff Memorial Lecture

**Jack Kalbfleisch, PhD**

Professor Emeritus of Statistics and Biostatistics

The University of Michigan School of Public Health

**Randomization and re-randomization in clinical trials**

Randomization was a key contribution of Sir Ronald Fisher to the conduct of scientific investigations and statistical methods. Along with the protective aspects of randomization, Fisher also noted that the distribution induced by randomization can form the basis of inference. Indeed, in some instances, the randomization test and related procedures seem to be the only tools available for inference. Several authors have noted the advisability of rerandomizing if, in a particular instance, the observed randomization leads to an unacceptable degree of imbalance in important factors between and among the treatment groups. Morgan and Rubin (2012, Annals of Statistics) provide an excellent discussion and some interesting results. This talk begins with some discussion of randomization and then considers problems arising in the design of relatively small cluster randomized trials, which have been widely used in recent years for evaluation of health-care strategies. The balance match weighted (BMW) design, introduced in Xu and Kalbfleisch (2010, Biometrics), applies propensity score matching ideas to choose a design through a rerandomization approach with the general aim of minimizing the mean squared error of the treatment effect estimator. The methods are evaluated by simulation. Extensions of the methods to multiple armed trials are also considered and simply implemented numerical methods are proposed to achieve good matching algorithms that achieve near optimum results. Analysis issues are also discussed. Standard parametric and nonparametric methods are often inappropriate for analysis of designs involving rerandomization, though the distribution generated by the rerandomization approach provides a general framework for analysis. With the matching approach of the BMW design, the use of analysis models that respect the matching is also investigated.

This is based on joint work with Dr. Zhenzhen Xu.

Thursday, April 25, 2013

### 2012 Charles L. Odoroff Memorial Lecture

**Nancy L. Geller, PhD**

Director, Office of Biostatistics Research

National Heart, Lung, and Blood Institute

**Has the time come to give up blinding in clinical trials?**

Should all trials be double blinded, that is, should treatment allocation be concealed from both the subjects and those administering the treatment? In the late 1980's and early 1990's trialists advocated strongly for double blinding of clinical trials, yet in the past 15 years, we have seen more and more clinical trials that are unblinded. While it is relatively easy to make a placebo controlled trial of a medication given orally double blinded, reasons for not blinding include that in some situations it is too difficult (or expensive) to blind, in some situations, it may be unethical to blind and in other situations, it is impossible to blind. Complex interventions may make blinding especially difficult. Comparative effectiveness studies also encourage unblinded trials because “blinding is not done in the real world.” We give several examples of recent trials which have not been blinded and examine the consequences.

Thursday, May 3, 2012

### 2011 Charles L. Odoroff Memorial Lecture

**Amita K. Manatunga, Ph.D.**

Professor

Rollins School of Public Health

Emory University

**A Framework for the Assessment of Disease Screening Instruments in Mental Health Studies**

A fundamental objective in biomedical research is to establish valid measurements of the clinical disease of interest. Measures of agreement have been widely used for validating a new instrument by assessing similarity of measurements with an established instrument.

Although the foundation of agreement methodology has been mostly laid out, many important statistical issues have not yet been resolved. In this presentation, I will present our recent work on the following two problems: (1) how to extend the classical framework of agreement to evaluate the capability of interpreting a continuous measurement in an ordinal scale; (2) how to subdivide a continuous scale into ordered categories when there is high correspondence between two scales.

To address the first question, we propose a new concept, called “broad sense agreement”, which characterizes the correspondence between a continuous scale and an ordinal scale. We present a natural measure for broad sense agreement. Nonparametric estimation and inference procedures are developed for the proposed measure along with theoretical justifications. To address the second question, we develop a new approach for determination of cut-points in a continuous scale according to an established categorical scale by adopting the idea of optimizing the agreement between the discretized continuous scale and the categorical scale. We also discuss analytic and empirical advantages of our method. Finally, we apply these methods to a mental health study to illustrate their practical utility.

Thursday, May 26, 2011

### 2010 Charles L. Odoroff Memorial Lecture

**Raymond J. Carroll, Ph.D.**

Distinguished Professor of Statistics, Nutrition and Toxicology

Texas A&M University

**Robust Powerful Methods for Understanding Gene-Environment Interactions**

We consider population-based case-control studies of gene-environment interactions using prospective logistic regression models. Data sets like this arise when studying pathways based on haplotypes as well as in multistage genome wide association studies (GWAS). In a typical case-control study, logistic regression is used and there is little power for detecting interactions. However, in many cases it is reasonable to assume that, for example, genotype and environment are independent in the population, possibly conditional on factors to account for population stratification. In such a case, we have developed an extremely statistically powerful semiparametric approach for this problem, showing that it leads to much more efficient estimates of gene-environment interaction parameters and the gene main effect than the standard approach: decreases of standard errors for the former are often by factors of 50% and more. The issue of course that arises is the very assumption of conditional independence, because if that assumption is violated, biases result so that one can announce gene-environment interactions or gene effects even though they do not exist. We will describe a simple, computationally fast approach for gaining robustness without losing statistical power, one based on the idea of Empirical Bayes methodology. Examples from colorectal adenoma studies of the NAT2 gene and prostate cancer in the VDR pathway are described to illustrate the approaches.

Friday, April 16, 2010

### 2009 Charles L. Odoroff Memorial Lecture

**Terry M. Therneau, Ph.D.**

Professor of Biostatistics

Mayo Clinic

*Random Effects Models and Survival Data*

The literature on random effects models for survival data, also known as frailty models, has burgeoned in the last few years. With multiple choices for the distribution, interrelations, and computation of the random effects it has been fruitful soil for theoretical forays. This talk will focus on the practical uses of the models: what software is readily available, what types of problems can we apply it to, and most importantly, what does the approach add to our final clinical or biological understanding.

Thursday, April 2, 2009

### 2008 Charles L. Odoroff Memorial Lecture

**Marvin Zelen, Ph.D.**

Lemuel Shattuck Research Professor of Statistical Science

Department of Biostatistics

Harvard School of Public Health

*Early Detection of Disease and Stochastic Models*

The early detection of disease presents opportunities for using existing technologies to significantly improve patient benefit. The possibility of diagnosing a chronic disease early, while it is asymptomatic, may result in diagnosing the disease in an earlier stage leading to better prognosis. Many cancers, diabetes, tuberculosis, cardiovascular disease, HIV related diseases, etc. may have better prognosis when combined with an effective treatment. However, gathering scientific evidence to demonstrate benefit has proved to be difficult. Clinical trials have been arduous to carry out, because of the need to have large numbers of subjects, long follow-up periods and problems of non-compliance. Implementing public health early detection programs has proved to be costly and not based on analytic considerations. Many of these difficulties are a result of not understanding the early disease detection process and the disease natural histories. One way to approach these problems is to model the early detection process. This talk will discuss stochastic models for the early detection of disease. Breast cancer will be used to illustrate some of the ideas. The talk will discuss breast cancer randomized trials, stage shift and benefit, scheduling of examinations, the issue of screening younger and older women and the probability of overdiagnosis of disease.

Tuesday, April 29, 2008

### 2007 Charles L. Odoroff Memorial Lecture

**Mark van der Laan, Ph.D.**

University of California, Berkeley

*Targeted Maximum Likelihood Learning of Scientific Questions*

The main point of this presentation is that choice of a statistical model and method must be based on careful consideration of the scientific question of interest in order to provide robust tests of the null hypothesis and to minimize bias in the parameter estimate. For this purpose we developed a new generally applicable targeted maximum likelihood estimation methodology.

As an example, I will distinguish between scientific questions concerned with prediction of an outcome based on a set of input variables versus scientific questions in which the goal is to estimate the variable importance or causal effect of one particular variable/treatment. I will show the limitations of fitting regression models for the purpose of learning about a causal effect or variable importance, and present the alternative targeted maximum likelihood approach. Both observational studies and randomized trials will be used to illustrate the advantages of the targeted approach. I will present results from data analyses in which the targeted approach is used to 1) analyze the importance of each of a set of HIV mutations for protease inhibitor resistance and 2) estimate the causal effect of interventions to improve adherence to antiretroviral drugs.

The differences between prediction and causal effect estimation are further highlighted by the additional assumptions needed for the estimation of the causal effect of an intervention in an observational study. Beyond the familiar "no unmeasured confounding" assumption, causal effect estimation also requires an experimental treatment assignment assumption, violation of which can cause severe bias and increased variance in a causal effect estimate. To address this problem, I will show that estimation of the causal effect of a "realistic" intervention (similar to the parameter one estimates in an intention-to-treat analysis) provides an important generalization which can always be fully identified from the data. Targeted estimators of this realistic parameter are also available.

Finally, I will discuss the advantages of applying targeted estimation in the context of a randomized trial. Like standard approaches, the targeted approach relies only on the randomization assumption. However, the targeted approach yields an improved estimate of the causal effect of a treatment in a randomized trial relative to the commonly used marginal estimate of the treatment effect.

Thursday, September 20, 2007

### 2006 Charles L. Odoroff Memorial Lecture

**Butch Tsiatis, Ph.D.**

Department of Statistics

North Carolina State University

*Estimating Mean Response as a Function of Treatment Duration in an Observational Study, Where Duration May be Informatively Censored*

In a recent clinical trial "ESPRIT" of patients with coronary heart disease who were scheduled to undergo percutaneous coronary intervention (PCI), patients randomized to receive Integrilin therapy had significantly better outcomes than patients randomized to placebo. The protocol recommended that Integrilin be given as a continuous infusion for 18–24 hours. There was debate among the clinicians on the optimal infusion duration in this 18–24-hour range, and we were asked to study this question statistically. Two issues complicated this analysis: (i) The choice of treatment duration was left to the discretion of the physician and (ii) treatment duration would have to be terminated (censored) if the patient experienced serious complications during the infusion period. To formalize the question, "What is the optimal infusion duration?" in terms of a statistical model, we developed a framework where the problem was cast using ideas developed for adaptive treatment strategies in causal inference. The problem is defined through parameters of the distribution of (unobserved) potential outcomes. We then show how, under some reasonable assumptions, these parameters could be estimated. The methods are illustrated using the data from the ESPRIT trial.

Thursday, May 11, 2006

### 2005 Charles L. Odoroff Memorial Lecture

**Louise Ryan, Ph.D.**

Professor of Biostatistics

Harvard School of Public Health

*Prenatal Methylmercury Exposure and Childhood IQ*

Controversy continues regarding the impact of chronic methylmercury exposures on childhood development. Adverse effects are difficult to quantify at low doses, and conflicting results have been obtained from several well-designed epidemiological studies, one in the Faroe Islands, one in the Seychelles and an older small study in New Zealand. We describe the use of hierarchical modeling techniques to combine data on several endpoints from these three studies. We find convincing evidence of an effect of methylmercury exposure on full scale IQ in children aged 6 to 9 years.

Wednesday, March 16, 2005

### 2004 Charles L. Odoroff Memorial Lecture

**Bruce Levin, Ph.D.**

Professor and Chair, Department of Biostatistics

Columbia University Mailman School of Public Health

*A Generalization of the Levin-Robbins Procedure for Binomial Subset Selection and Recruitment Problems*

We introduce a family of sequential selection and recruitment procedures for the subset identification problem in binomial populations. We demonstrate the general validity of a simple formula providing a lower bound for the probability of correct identification in a version of the family without sequential elimination or recruitment. A similar theorem is conjectured to hold for the more efficient version which employs sequential elimination or recruitment.

Thursday, April 15, 2004

### 2003 Charles L. Odoroff Memorial Lecture

**Danyu Lin, Dennis Gillings Professor**

University of North Carolina at Chapel Hill

*Selection and Assessment of Regression Models*

Residuals are informative about the adequacy of regression models. Conventional residual analysis based on the plots of individual residuals is highly subjective, while most numerical goodness-of-fit tests provide little information about the nature of model misspecification. In this talk, we present objective and informative strategies for model selection and assessment based on the cumulative sums of residuals over certain coordinates (e.g., covariates or fitted values) or some related aggregates of residuals (e.g., moving averages and kernel smoothers). The distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be easily generated by computer simulation. Each observed residual pattern can then be compared, both graphically and numerically, with a number of realizations from the null distribution. Such comparisons enable one to assess objectively whether a specific aspect of the model (e.g., the functional form of a covariate, the link function or the proportional hazards assumption) has been correctly specified. They also provide helpful hints on how to obtain an appropriate model. We apply this approach to a wide variety of statistical models and data structures, and provide illustrations with several clinical and epidemiologic studies. The methods presented in this talk will be featured in the next release of SAS.

Wednesday, May 7, 2003

### 2002 Charles L. Odoroff Memorial Lecture

**Professor Mark Espeland**

Wake Forest University

*Heterogeneity of Response*

It is to be expected that responses to treatments, measurement techniques, and diseases may vary among individuals. Characterizing the underlying distributions of heterogeneous responses is often difficult in that it requires statistically removing the confounding influences of measurement error from the observed response distributions. This is an important problem, however, to the extent that there have been calls that drug approval be based on the nature of underlying response distributions, rather than merely average treatment effects. Some general approaches to the problem of estimating response distributions will be discussed. Three examples, from separate studies of postmenopausal hormone therapy, hypertension control, and carotid atherosclerosis, will be used as case studies. In each, multivariate hierarchical measurement error models were fitted with varying success to estimate characteristics of underlying response distributions. The goal of describing heterogeneous responses presents major challenges for study designs. Improved methodology and increased data sharing are needed.

Thursday, April 4, 2002

### 2001 Charles L. Odoroff Memorial Lecture

**Professor Scott Zeger**

Department of Biostatistics, Johns Hopkins University

This talk will describe "SQUARE", a novel method for estimating the difference in means between two skewed distributions given one relatively smaller and one larger sample. This problem arises in assessing the medical costs of smoking.

We will give an overview of the statistical problem of determining smoking attributable expenditures which includes estimating the average cost of services for persons with smoking-caused diseases relative to otherwise similar persons without disease. We then introduce an estimator of this difference that relies on estimation of the ratio of the quantile functions for the two distributions. When the degrees of freedom to estimate this function are small, our estimator approximates the log-normal model MLE; when they are large, it converges to the difference in sample means. We illustrate SQUARE with an analysis of the difference in medical costs between persons who suffer lung cancer and chronic obstructive pulmonary disease and otherwise similar people who do not.

Thursday, April 12, 2001

### 2000 Charles L. Odoroff Memorial Lecture

**Professor Norman Breslow**

Department of Biostatistics, University of Washington

*Descriptive Statistics and the Genetics of Wilms Tumor: Lessons From the National Wilms Tumor Study Group*

Wilms tumor is an embryonal tumor of the kidney that affects approximately one child per 10,000 before the age of 15 years. It was almost universally fatal a century ago, but cure rates today approach 90% due largely to the use of modern combination chemotherapy. The National Wilms Tumor Study Group (NWTSG), founded in 1969, has been in the forefront of the recent advances.

Wilms tumor has served as a model of complexity for understanding the genetic origins of cancer. It was initially believed to follow the two-hit mutational model proposed by Knudson (1972) on the basis of his statistical analysis of data for retinoblastoma, another childhood tumor involving a paired organ. Simple descriptive analyses of the large NWTSG database, however, suggested that the genetics of Wilms tumor were more complex. This suggestion was subsequently confirmed by laboratory studies demonstrating a role for genomic imprinting in the etiology of Wilms tumor, the limited number of cases associated with mutations in the first (and to date only) Wilms tumor gene to be cloned (WT1), and the linkage of familial cases to at least two other distinct loci. This talk will present several examples of how simple descriptive statistical analyses can challenge prevailing genetic models and suggest new avenues for laboratory investigation.

Thursday, March 23, 2000

### 1999 Charles L. Odoroff Memorial Lecture

**Professor David L. DeMets**

University of Wisconsin

*Surrogate End Points in Clinical Trials: Are We Being Misled?*

Phase 3 clinical trials, which evaluate the effect that new interventions have on the clinical outcomes of particular relevance to the patient (such as death, loss of vision, or other major symptomatic event), often require many participants to be followed for a long time. There has recently been great interest in using surrogate end points, such as tumor shrinkage or changes in cholesterol level, blood pressure, CD4 cell count, or other laboratory measures, to reduce the cost and duration of clinical trials. In theory, for a surrogate end point to be an effective substitute for the clinical outcome, effects of the intervention on the surrogate must reliably predict the overall effect on the clinical outcome. In practice, this requirement frequently fails. Among several explanations for this failure is the possibility that the disease process could affect the clinical outcome through several causal pathways that are not mediated through the surrogate, with the intervention's effect on these pathways differing from its effect on the surrogate. Even more likely, the intervention might also affect the clinical outcome by unintended, unanticipated, and unrecognized mechanisms of action that operate independently of the disease process. Examples from several disease areas illustrate how surrogate end points have been misleading about the actual effects that treatments have on the health of patients. Surrogate end points can be useful in phase 2 screening trials for identifying whether a new intervention is biologically active and for guiding decisions about whether the intervention is promising enough to justify a large definitive trial with clinically meaningful outcomes. In definitive phase 3 trials, except for rare circumstances in which the validity of the surrogate end point has already been rigorously established, the primary end point should be the true clinical outcome.

Thursday, April 8, 1999

### 1998 Charles L. Odoroff Memorial Lecture

**Professor Alan Agresti**

University of Florida

*Small Sample Analysis of Categorical Data: Recent Advances and Continuing Controversies*

The development of methods for "exact" small-sample analyses has been a major advance of the past decade in contingency table analysis. Exact methods guarantee that the size of a test is no greater than some prespecified level and that the coverage probability for a confidence interval is at least the nominal level. A variety of exact methods now exist, both of a conditional and unconditional nature. The great variability in results that can occur with different methods reflects complications due to discreteness. As discreteness increases, exact tests and confidence intervals tend to become overly conservative. We illustrate these issues by studying interval estimation of two basic parameters -- the proportion and the odds ratio. In each case, even for small samples one can argue that "large-sample" solutions are superior to exact ones for many purposes. There will always be an important niche for exact methods, but issues discussed here suggest that statisticians should perhaps reconsider how to evaluate inference procedures.

Thursday, April 16, 1998

### 1997 Charles L. Odoroff Memorial Lecture

**Professor John C. Gower**

The Open University

*Algebra, Geometry, and Statistical Graphics*

Thursday, April 24, 1997

### 1996 Charles L. Odoroff Memorial Lecture

**Dr. Mitchell Gail**

Biostatistics Branch, National Cancer Institute

*Statistics in Action*

This talk describes two of the most important developments in medical investigation: the adoption of randomization in clinical trials, and the use of statistical ideas, including the case-control design, to establish and investigate the association between smoking and lung cancer. I shall discuss these two developments and two statisticians who contributed enormously to their realization, Austin Bradford Hill and Jerome Cornfield.

Thursday, April 18, 1996

### 1995 Charles L. Odoroff Memorial Lecture

**Professor Nan Laird**

Department of Biostatistics, Harvard School of Public Health

*Handling Dropouts in Longitudinal Clinical Trials: Alternative Strategies for Intention-To-Treat Analyses*

Tuesday, April 25, 1995

### 1994 Charles L. Odoroff Memorial Lecture

**Professor Ross Prentice**

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center

*A Low Fat Eating Pattern for the Prevention of Cancer and Other Diseases*

Tuesday, April 26, 1994

### 1993 Charles L. Odoroff Memorial Lecture

**Professor Joseph Fleiss**

Columbia University

*Clinical Trials Are Not Always Judged On Scientific Grounds: The Example of the EC/IC Bypass Study*

Wednesday, March 3, 1993

### 1992 Charles L. Odoroff Memorial Lecture

**Professor Bradley Efron**

Department of Statistics, Stanford University

*Bootstrap Confidence Intervals*

Thursday, March 19, 1992

### 1991 Charles L. Odoroff Memorial Lecture

**Sir David Cox, F.R.S.**

Warden of Nuffield College, Oxford University

*Causality: A Review with Statistical and Epidemiological Implications*

A review of various definitions of causality, with special reference to epidemiological applications. Implications for empirical statistical analysis and for clinical trials.

Thursday, May 23, 1991

### 1990 Charles L. Odoroff Memorial Lecture

**Professor Frederick Mosteller**

School of Public Health, Harvard University

*Probabilistic Expressions as Quantified by Medical Professionals and Science Writers*

For 20 different studies, we tabulate numerical averages of opinions on quantitative meanings of 52 qualitative probabilistic expressions. Populations with differing occupations, mainly students, physicians, other medical workers, and science writers contributed. In spite of the variety of populations, format of question, instructions, and context, the variation of the averages for most of the expressions was modest.

The paper also reviews studies that show stability of meanings over 20 years, mild effects of translation into other languages, context, small order effects, and effects of scale for reporting on extreme values.

Wednesday, March 28, 1990

### 1989 Charles L. Odoroff Memorial Lecture

**Professor Paul Meier**

Department of Statistics, University of Chicago, Chicago, Illinois

*Trials of a Wonder Drug: Does Aspirin Prevent Heart Attacks?*

In the 1960s it was reported that patients who suffered heart attacks were seldom regular takers of aspirin. From that time forward many studies have been devoted to the question of whether there may be a possible therapeutic benefit of aspirin in preventing myocardial infarction. Although the benefit of aspirin prophylactically (i.e., in preventing a new infarction) remains unclear, it has at least been demonstrated that aspirin is beneficial therapeutically (i.e., in acute post-infarct survival). This example well illustrates the strengths and weaknesses of different styles of clinical evaluation, a topic central to the life and work of Charles Odoroff.

Wednesday, March 8, 1989