Diving into Data Science
News Article By Emily Weber
Data science is quickly emerging as a field that many scientists at the University of Rochester are working in and exploring. Dr. Aslihan Petenkaya, a postdoctoral fellow in the Biostatistics and Computational Biology Department, is a current fellow in the intensive seven-week postdoctoral Insight Data Science Fellows Program. She became interested in data science toward the end of her PhD in Hucky Land’s laboratory, where she began studying whether different metabolic characteristics of cancer cells could predict how they respond to certain enzymatic inhibitors. “Predicting if a cancer cell falls into a certain class requires application of supervised learning algorithms,” said Dr. Petenkaya. “I started studying different algorithms and learned more about how data science shapes and directs our lives in very visible ways and how widely applicable the tools used in data science are.”
Dr. Petenkaya learned about the Insight Data Science Fellows Program from a friend and filled out the online application. The program strives to train PhDs from academia in data science technology and partners with companies such as Facebook, LinkedIn, NBC and Spotify. After a Skype interview, in which she was asked to demo a small project, Dr. Petenkaya was accepted. To help make the internship possible, URBEST supported Dr. Petenkaya financially with housing assistance.
During the first couple of weeks of the Insight program, fellows are asked to work on either a consulting project or a personal project. A consulting project involves solving a problem using a data set provided to them. Those who choose a personal project collect their own data and build a predictive model and web application to answer their own data science question. Fellows also have the opportunity to pitch their ideas to companies and publish their applications online.
Dr. Petenkaya appreciated the opportunities and knowledge she gained from her time at Insight. “I really enjoyed extracting features from the data,” she said. “During my project, I worked with Twitter data, which is mostly text. I used natural language processing techniques to look at the linguistic style between depressed and happy people. Feature extraction is one of the steps in data analysis which allows you to be very creative with the data.”
Beyond expanding their data science skills, Insight fellows also benefit from networking with other scientists. “A big portion of the fellows are physics and engineering PhDs. Talking to them, I learned more than data science. Also, the program puts you in touch with all the fellows who went through this program so far, which amounts to a priceless network,” said Dr. Petenkaya.
Biomedical Data Science was included as one of the three new pathways for the URBEST program this year. Dr. Helene McMurray, head of the Bioinformatics Consulting and Education Service at the Edward G. Miner Library and Assistant Professor of Biomedical Genetics, directs this pathway. “Computational approaches give us the best chance of considering all of the data that we can now generate to build the most comprehensive models of biological function ever possible,” said Dr. McMurray. “From a more practical standpoint, data science is important because it's everywhere these days and the field is still growing.”
Molly Jaynes, a graduate in Translational Biomedical Science, is one of the URBEST trainees exploring the new Data Science Pathway. Her long-term goal is to work in institutional research, analyzing data that informs a university's decisions. “My research involves analyzing data and I really enjoy the abstract way of thinking about it,” said Jaynes. “It’s a thrill to find information in messy data that you had a hunch was there.”
For those interested in exploring data science, numerous opportunities exist. Edward G. Miner Library hosts The Informatics and Genomics Research (TIGR) meetings, where experts from a variety of fields discuss relevant research and literature about novel techniques and approaches to big data problems. Students are also invited to attend the weekly seminar series hosted by the Clinical and Translational Science Institute (CTSI). Invited speakers include those who specialize in fields such as biostatistics and informatics.
Additionally, students new to the field can take formal classes, such as those introducing Quantitative Biology or Biomedical Informatics, or participate in bioinformatics workshops offered by the Miner Library. Dr. McMurray also recommends seeking out individual researchers around the University of Rochester to get involved in a data science-related project. She says, “For those who want to throw themselves in the deep end of the lake, there's no substitute for working on a problem or project with a mentor or on their own. Lots of groups on campus and in the outside world use high-throughput and quantitative approaches to study myriad biological problems, and everyone seems to need more bioinformatics expertise these days.”
Tracey Baas