PhD in Statistics (BCB concentration)
The Bioinformatics and Computational Biology (BCB) concentration is designed to educate the next generation of biostatisticians to address critical scientific and public health questions. In particular, it equips them with the skills to develop and use quantitative and computational methods and tools to manage, analyze, and integrate massive amounts of complex biomedical data.
Students learn core statistical methods and obtain training in data analysis methodologies and computational skills and techniques necessary for handling “Big Data” in the biomedical and public health sciences. In addition to this training in core methods, the program also places great emphasis on cross-training to prepare students to work as part of interdisciplinary teams that require expertise in statistical data science: 1) training students with quantitative/computational science backgrounds to enhance their understanding of biological questions and biological interpretation; and 2) training students with biomedical science backgrounds to proficiently use bioinformatics and computational methods and tools to address scientific questions.
Formal course and examination requirements for students in the BCB concentration are essentially the same as those for students in the traditional program, with the main differences being in some required and elective courses related to bioinformatics and computational biology.
Beginning students should expect to spend all of their first year, most of their second year, and some of their third year taking formal courses. The balance of time is spent on reading and research. Students entering with advanced training in statistics, bioinformatics, or computational biology may transfer credits at the discretion of their advisor and in accordance with University policy.
In general, the PhD program requires a minimum of four years of study, with five years of study being more common (see Timeline for Degree Completion). Prior to completion of the PhD, most students have some publications underway, including some work related to their dissertation research, possibly other methodological work done in collaboration with other members of the faculty, and often some applied papers with scientific researchers in other fields.
Entering PhD students should have a strong background in mathematics, including three semesters of calculus (through multivariable calculus), a course in linear and/or matrix algebra, and a year of probability and mathematical statistics. Basic courses in computer science and/or biology are also required. A course in real analysis is encouraged; a course in statistical methods is also recommended.
Normally, doctoral students are initially considered MA candidates; this non-thesis degree can be completed in three semesters or, in some cases, in one calendar year. PhD studies consist of additional specialized courses, seminars, and supervised research leading to a dissertation. There is no foreign language requirement. Computer expertise is developed in the program.
Students are expected to spend a minimum of 40 months and a maximum of 66 months, not necessarily continuously, engaged in one or more of the following activities that enhance their education and skill sets as statisticians: teaching assistantships, research assistantships, participation in the statistical consulting rotation, and summer internships.
All MA/PhD students take a comprehensive (basic) examination at the beginning of the second year. PhD students take another written (advanced) examination at the beginning of the third year. Both examinations will cover material in the areas of probability, inference, data analysis, and bioinformatics and computational biology.
After beginning research on a dissertation topic, PhD students take an oral qualifying examination, which consists largely of presenting a thesis proposal to a faculty committee (the student's Thesis Committee). Upon completion of the dissertation, doctoral candidates present their work at a public lecture, followed by an oral defense of the dissertation before the Thesis Committee.
Typical Program of Study
Year 1: Fall
- Probability Theory (4 credits)
- Statistical Inference I (4 credits)
- Biostatistical Methods I (4 credits)
- Introduction to Statistical Computing (4 credits)
Year 1: Spring
- Statistical Inference II (4 credits)
- Biostatistical Methods II (4 credits)
- Bayesian Inference (4 credits)
- Linear Models (4 credits)
Year 2: Fall
- High Dimensional Data Analysis (4 credits)
- Generalized Linear Models (4 credits)
- Advanced Bayesian Inference (4 credits) or Causal Inference (4 credits)
- Ethics in Research (1 credit)
- Seminar in Statistical Literature (1 credit)
- Supervised Teaching (2 credits)
Year 2: Spring
- Analysis of Longitudinal and Dependent Data (4 credits) or Survival Analysis (4 credits)
- Genomic Data Analysis (4 credits) or Introduction to Quantitative Biology (3 credits)
- Seminar in Statistical Literature (1 credit)
- Reading Course(s) at the PhD Level
Year 3+: Mostly reading and research, with some 400-level and 500-level courses.
- BST 487 Seminar in Statistical Literature (1 credit) is offered every semester. PhD students who enter the program after 2019 are required to register for at least four semesters. This course (1) provides students with experience in organizing, preparing, and delivering oral presentations, (2) introduces students to the process of searching the statistical literature, (3) enables students to acquire knowledge of a focused area of statistical research, and (4) introduces students to the research interests of members of the faculty.
- All PhD students are required to have at least four credits of supervised teaching and/or supervised consulting (BST 590, BST 592).
- Advanced topics courses listed as BST 511, 512, 550, or 570, for varying numbers of credits, are offered depending on interests of students and instructors. Recent examples include:
  - Introduction to Spatial Data Analysis
  - Missing Data
  - Functional Data Analysis
  - Statistical Analysis of Cell Mixtures
  - Smoothing Methods
  - ROC Curve Analysis
  - The Bootstrap, the Jackknife, and Resampling Methods
  - Model Selection and Validation
  - Semiparametric Inference