A Summer Adventure in San Francisco as an Insight Data Science Fellow
News Article by Binshuang Li, PhD Candidate in Biology
Data science had not been on my radar until about two years ago, when a program director at Insight Data Science visited UR. During the seminar, I immediately realized it could be a great fit for me. At that time, I had already started an online master's degree in computer science while pursuing my PhD in Biology. I knew that an industry job would be a better fit for me, yet I was not entirely convinced that I wanted to be a programmer. After the seminar, I did more research and found that data science is a fast-growing, in-demand field that has attracted a lot of people from academia. The Insight Data Science Fellows Program listed three fundamental skills necessary for a data scientist: big-picture problem solving, strong quantitative abilities, and experience with statistical analysis. The Insight program promised to supply the missing piece: experience with industry data tools and techniques. I decided that I would apply to this program once I was in the final stage of my PhD. In November 2017, I participated in the URBEST program and was glad to find a new Data Science training pathway. URBEST Director Tracey Baas and URBEST Data Science Co-Leader Helene McMurray were really helpful and introduced me to one of UR’s Insight Data Science alumnae, Dr. Aslihan Ambeskovic, who is the other URBEST Data Science Co-Leader. Aslihan kindly shared her experience and gave me some great advice about applying to the program.
I originally thought the application would be a breeze. Yet the reality was a totally different story. Later I found out how competitive this program is! I first applied for early admission to the general data-science program in Silicon Valley for the 2018 summer session. There are several different programs, multiple locations and three program sessions per year. However, I quickly got rejected. I was a bit surprised because I thought I would at least get to the interview stage. Luckily, I did not give up completely. After doing more research, I reached out to Insight Data Science asking if I could still apply to other programs for the summer session, before the normal application deadline. After getting their confirmation, I further polished my application package. I participated in more data projects (Kaggle competitions make you stand out!), updated my GitHub repository, and had my roommate go over my essay with me. I decided to choose the health data-science program this time, thinking it might be a better fit for my background. I also applied for the Data Incubator, which is a similar program. Eventually, I was invited to interview for both programs.
The interview with Insight was pretty straightforward. First, the program director introduced himself and asked me to do the same. Next, he asked me to do a project demo to showcase my data analytics skills. I had participated in a Data Analytics Competition with two friends just before the interview, so I presented that project. He had expected me to talk about my PhD thesis work, as many candidates probably do. Clearly, the program director had seen a lot of data projects, and he quickly understood my project and followed up with several questions. After the demo, there was some time for Q&A. Overall, the interview process was really pleasant and more like a conversation. Communication skills are critical for a data scientist, and I think that’s part of what they were looking for at the interview. The Data Incubator interview process was longer: candidates needed to finish a data/coding challenge problem set before Data Incubator scientists would set up a Skype interview with them. I withdrew my application after I got the offer from Insight.
Before leaving for the program
I was grateful that I had such an amazing PI, Jennifer Brisson, who is so supportive of my career choice. I understood that my absence could impact the research in the lab, so we carefully worked out the plan for experiments during my absence. Also, I convened with my thesis committee members to report the progress of my work and received their approval for my plan. Additionally, I was lucky enough to get support from the URBEST program to cover my housing while in San Francisco. It is definitely NOT a cheap city to live in!
The Insight program
Finally, I arrived at the Insight office in San Francisco. It looked really start-up-y. They had recently moved to a new, bigger office. Our working area was made up of three giant tables with plenty of chairs, so fellows could move around and work in groups. They provided free snacks and drinks, plus one free lunch onsite every week. The other fellows were PhD candidates like me or postdocs, and there was even an assistant professor in statistics from University of California Irvine in our cohort. Their backgrounds ranged from math, physics, chemistry, and biomedical engineering to microbiology, neuroscience, and genetics. It was then that I learned that the admission rate to the program is about 3%, and I realized how strong every Insight Fellow has to be.
The pace of the program is really fast. For the first four weeks, we would work on a demo project, and for the next three weeks, we would present the project to multiple hiring companies. In the first week, we needed to identify a project and prepare a four-minute demo for Friday. The demo is really important because it is a great opportunity to present ourselves in person. I think this is what makes Insight different from recruiting agencies. For our work, we could choose between two types of projects: a self-identified project or a consulting project identified by a company. Each type had its own pros and cons. One has more control over a self-identified project, but it also needs more work because it has to be interesting and relevant for an employer, and it needs to be doable in a four-week time frame. The consulting projects have fewer issues with relevance because they come directly from an industry request. However, there is less flexibility in finding a solution, and one needs to be confident of handling the technical challenge involved. I did see a fellow initially choose a consulting project, grapple with the results, and switch to a self-identified project in the middle of the program. I could only imagine how stressful that was! I personally chose a consulting project with a Natural Language Processing (NLP) problem. I thought having some industry experience would be invaluable for me, and it did indeed turn out to be really useful for my future interviews. In addition, I had never worked on an NLP project before, but I knew the area was really popular in industry, so I wanted to gain some experience in it.
We also had companies visiting us at Insight. If it was a small startup company, the CEO, CTO, or other C-suite people would meet with us. Sometimes, if the company had previously hired Insight fellows, the alumni would come along to join the conversations. These visits were a great opportunity for us to have face-to-face conversations about their companies, what it was like to work there, and what they were looking for in job candidates. The program director even prepared some sample questions for us, in case we ran out of ideas about what to ask! Besides company visits, we had several Insight alumni visits. These alumni had very similar backgrounds to ours and are now working at different companies as data scientists. It was really helpful to hear their stories about how they transitioned from academia into industry.
Starting the second week, we had one-on-one advising sessions with Insight alumni. They would identify their specialties [e.g. NLP, time-series, machine learning/deep learning (ML/DL), etc.], and we could pick the most relevant specialties that would be helpful to our demo project. The alumni I interacted with gave me several great ideas that I would have never thought of myself, due to my limited experience with NLP. After the fourth week, the Insight alumni started to do mock interviews with us, and I was really grateful for these alumni mentors. It was like having private tutors throughout the whole process.
Altogether, it was a really stressful yet really productive first four weeks. During that time, we were working on a demo project, had company visits, had alumni visits, participated in alumni mentoring, and had routine check-ins with the program directors the whole time. We also participated in some workshops that focused on certain topics, such as VC funding for startups and how to best prepare demos. We even had Insight Data Science Program founder Jake Klamka come to visit us and share his story of creating this program. I was amazed by how much I learned and accomplished in only four weeks. I built an NLP tool to automate quantitative information retrieval from clinical trials. It is really time-consuming for doctors to read through the selection criteria of clinical trials each time to find a match for a new patient. With the tool, all quantitative information from the text (e.g. body mass index, blood pressure, age, etc.) can be extracted into a database, and the search can then be expedited significantly (seconds vs. hours). For my work, I specifically launched an interactive website to host the tool and built a processed database of clinical trials for type II diabetes. This would have been impossible for me to finish by myself in such a short time frame.
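To give a rough feel for the kind of extraction described above, here is a minimal sketch in Python. It is not the tool built during the program, whose implementation is not described in this article. It assumes that a handful of regular expressions is enough for a few well-behaved phrases, and the measure names, patterns, sample text, and helper functions (`extract_criteria`, `patient_matches`) are all hypothetical.

```python
import operator
import re

# Hypothetical measures and the phrases assumed to introduce them.
MEASURES = {
    "age": r"age",
    "bmi": r"bmi|body mass index",
    "systolic_bp": r"systolic blood pressure",
}

# Hypothetical comparator phrases, normalized to a symbol.
COMPARATORS = {
    ">=": ">=", "≥": ">=", "at least": ">=",
    "<=": "<=", "≤": "<=", "at most": "<=",
    ">": ">", "greater than": ">",
    "<": "<", "less than": "<",
}

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def _alternation(options):
    # Try longer phrases first so that ">=" is matched before ">".
    ordered = sorted(options, key=len, reverse=True)
    return "|".join(re.escape(option) for option in ordered)


def extract_criteria(text):
    """Return (measure, comparator, value) tuples found in free text."""
    comparator_alt = _alternation(COMPARATORS)
    results = []
    for name, measure_pattern in MEASURES.items():
        pattern = re.compile(
            rf"\b(?:{measure_pattern})\b\s*(?:of\s+|is\s+)?"
            rf"({comparator_alt})\s*(\d+(?:\.\d+)?)",
            re.IGNORECASE,
        )
        for match in pattern.finditer(text):
            comparator = COMPARATORS[match.group(1).lower()]
            results.append((name, comparator, float(match.group(2))))
    return results


def patient_matches(criteria, patient):
    """Check a patient's values against every criterion that applies to them."""
    return all(
        OPS[comparator](patient[name], value)
        for name, comparator, value in criteria
        if name in patient
    )


sample = "Inclusion: age >= 18 years; BMI less than 40 kg/m2."
criteria = extract_criteria(sample)
print(criteria)
print(patient_matches(criteria, {"age": 52, "bmi": 31.5}))
```

Once criteria are stored as structured tuples like these, checking a new patient becomes a quick comparison rather than a manual read-through, which is the general reason a search over extracted values can be so much faster than reading the text. Real eligibility criteria are far messier than this sample, so a practical tool would need much more than a few patterns.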
In the final three weeks, we started to visit various companies that had agreed to hire fellows from the Insight program. Most of the companies had visited us already. Each of us got to pick our five most desirable companies to do a demo with. New companies were also gradually added to the list because they could not visit us in the first four weeks. Since my specific Insight Data Science Program focused on the health data domain, the companies were mostly biotech, health care, and insurance-related companies. Examples included a startup company that focused on sequencing the microbiome from the soil, a company that focused on streamlining the CRISPR process into a specialized machine, and a medical device company that invented a digital pill that could measure medication effectiveness. We also got invited to submit our resumes directly to non-health-related companies like Facebook and Twitter.
By that time, we had already shortened our demo to approximately four minutes. It's quite challenging to summarize an in-depth project, make it relatable to a general audience, and pitch your ideas. In fact, the process is quite similar to UR’s three-minute thesis competition, and I highly recommend PhD students participate. It was also extremely valuable for us to bring our demos to the companies and experience their worksites. Seeing what the working environment was like and how people interacted with each other was very important because they could be our future colleagues! During the demos, we all learned what industry people really care about: what they like and what they don’t like. For instance, I got a question about how fast my tool could provide answers. I answered about 3 seconds, thinking that was pretty fast. However, the person asking the question considered 3 seconds to be slow, but thought that speed might be acceptable for an MVP (Minimum Viable Product).
After week seven, the program slowly wound down, but the interview process was just starting; the program actually suggests you stay eight more weeks for the interviews. We started to get callbacks from companies that we had done demos with, making the final weeks both exciting and nerve-wracking. In retrospect, having a good interview strategy is critical. I was lucky enough to get several callbacks simultaneously; however, I did not pace myself well enough to spend sufficient time preparing for each interview. I did not do well on my initial two interviews because of a lack of time and preparation. Interviews for data scientists are more challenging than those for software engineers, in the sense that there are so many aspects to cover. Coding, stats, databases (SQL), machine learning, business acumen, and behavioral questions can all be tested. Also, anything mentioned on your resume is fair game. It can feel exhausting at times. Insight did prepare us for these grueling interviews through various approaches in the program. Aside from the alumni mentoring, the fellows also helped each other out. One key consideration for the composition of each cohort is the complementary skills of all the fellows. Because the fellows had such diverse backgrounds, we could draw on all that diversity and knowledge. For example, the assistant professor in statistics gave us a lecture on survival analysis. A postdoctoral fellow from math gave us a full derivation for principal component analysis (PCA). I was able to share my knowledge of Big Data tools from my online master's degree in computer science. We also did mock interviews with each other all the time, and all the fellows were exceptionally nice and friendly!
At week fifteen, about half of the fellows had already received one or more job offers. The other half (as of this writing) have also gradually received offers (including me!). The Insight program also provided nice tips about how to negotiate salary and compensation with employers and will support fellows when they go back on the market again; the network the program provides is invaluable. Since the founding of the Insight Data Science Program in 2012, thousands of Insight alumni have gone on to work at numerous companies and in various fields. All of these connections can help make you more visible in the job market. For instance, a referral can get your resume to the desk of a hiring manager or recruiter, which is a much more effective strategy than submitting applications online by yourself.
To sum up, I’m glad I had the opportunity to participate in the Data Science Fellows Program. It was a great experience. I gained a lot of knowledge and experience that is not offered in an academic setting. I am glad I got to meet new friends from such diverse backgrounds. I’m also grateful to the URBEST program for providing support. I’ll be happy to answer any questions about the program for any URBEST trainee or UR student. You can reach me at firstname.lastname@example.org or through https://www.linkedin.com/in/binshuangli/
Tracey Baas