Data Science

According to Forbes Magazine, “Data Scientist” is among the 10 highest paid job categories for 2015, with estimated job growth of 15% for the sector. Moreover, Glassdoor reports that not only is data scientist among the top 25 highest paying in-demand jobs, but those who work in the field also report the best working conditions and the most job satisfaction of anyone in any job. The NIH has developed a series of initiatives, the “Big Data to Knowledge (BD2K)” programs, to leverage the power of big data and informatics to understand biology and medicine. Given this landscape, training a highly educated workforce to meet the demand for data scientists in biomedicine appears to be crucial and will provide trainees with myriad post-graduate options and opportunities. This pathway can be explored as a career or a skills pathway.

The Data Science Pathway holds three tracks (Genomics, Informatics and Statistics) that provide foundational skills and experiences for trainees interested in these paths. It is directed by Dr. Helene McMurray, head of the Bioinformatics Consulting and Education Service of Edward G. Miner Library and Assistant Professor of Biomedical Genetics. Through teaching and consulting, she provides others with the tools and information necessary to apply data science approaches to the scientific problems of interest to each student or laboratory group. Her knowledge of these techniques comes from her experience as a researcher, studying the biology of cancer cells to identify novel points of vulnerability. Dr. McMurray has helped develop several formal courses for graduate students related to biomedical data science, and runs a series of workshops on bioinformatics, biostatistics and data science through Miner Library.

Genomics Track: This track will offer trainees foundational training in genetics and genomics through formal courses, workshops and hands-on experiences that will enable understanding of and experience in careers in genomics. 

Informatics Track: This track will offer trainees foundational training in computer programming and data management through formal courses, workshops and hands-on experiences that will enable understanding of and experience in careers in informatics.

Statistics Track: This track will offer trainees foundational training in biostatistics and mathematics through formal courses, workshops and hands-on experiences that will enable understanding of and experience in careers in computational biology.

Explore the Data Science Pathway

Learn more about the day-to-day of Data Science work through informational interviews with local data scientists, such as the members of the Bioinformatics and Data Visualization Service, Center for Integrated Research Computing, Department of Biostatistics and Computational Biology or other members of the Rochester Informatics network. 

INFORMAL OPPORTUNITIES FOR LEARNING AND NETWORKING

Attend one of the ongoing interest group meetings to learn more and network locally, including the Transcriptomics and Integrative Genomics Research Interest Group (TIGR) meetings, Biostatistics and Computational Biology seminar series, CTSI Seminar series or the CIRC Symposium series.

Take a local Bioinformatics Workshop offered through Miner Library.

Learn a programming language through Code School or Codecademy.

Take a certificate program, such as those offered by Johns Hopkins University or Stanford University.

Apply for the Data Science Incubator program.

ATTEND A COURSE ON THE TOPIC – OR EVEN A LECTURE OR TWO.

Investigate the MS program in Data Science

Consider a Graduate Certificate of Advanced Study in Biomedical Data Science. 

Genomics: Intro to Quantitative Biology (IND419) with Hucky Land, PhD

This is a graduate-level survey course that introduces concepts for the analysis of high volume biological data in the context of important current biological questions. No previous computational experience is required. At the end of this course, students should have a deeper understanding of the computational tools involved in the analysis of high volume biological data, focusing on web-based resources but also introducing core approaches in bioinformatics. Because this is an advanced-level course, we will emphasize critical thinking and reading of the primary literature to understand original experiments, rather than abstract facts and memorization. Students’ knowledge, understanding and ability to formulate new ideas will be evaluated through homework and discussions.

Informatics: Intro to Biomedical Informatics (PM494) with Tim Dye, PhD

This course serves as an introduction to biomedical informatics, as applied in research and in clinical practice. It will provide a study of the nature of biomedical information and its capture, collection, storage, and use. Of particular interest in this course are the electronic medical record (EMR), its use for research and its impact on health care delivery; the Internet and mobile computing; custom Health Care Information Systems and their development, selection and implementation; the importance of computing or informatics specialists in medicine and research and the various roles they can play; and privacy, confidentiality and information security, including health care regulatory and accreditation issues and the Health Insurance Portability and Accountability Act (HIPAA). The course will also introduce students to concepts of Biorepositories, Big Data, Data Science, and Health Care Analytics, particularly from the perspective of the informaticist responsible for managing data sources.

Statistics: Applied Statistics for the Biomedical Sciences (BST467) with Xueya Cai, PhD

This is an introductory-level biostatistics course designed for PhD students in the biomedical sciences. The course will cover probability and probability distributions, sampling distributions, statistical inference from small and large samples, analysis of categorical data, analysis of variance, correlation, and simple linear and non-linear regression analysis. All analytical topics will be illustrated using examples from the biomedical sciences. Audits are not permitted in this class.