By: Muhammad Mamdani
One of the biggest challenges in the data science world is recruiting highly qualified, competent data scientists. Some reports suggest the demand for data scientists is far outpacing supply. We believe this supply problem may be further constrained depending on the level of training required and the needs of the organization. At the very basic level, however, we define a competent data scientist as an individual who is sufficiently trained in the efficient management and analysis of large amounts of data. This involves the ability to efficiently clean, link, normalize, extract and structure data, as well as analyze data using advanced methods relevant to the position.
At LKS-CHART, we have three streams of data scientists:
- computer science with a focus on machine learning
- operations science with a focus on simulation modeling and optimization, and
- statistical science
Consequently, a competent data scientist in our operations science stream would have a deep understanding of data management as well as simulation modeling and optimization methods.
In our experience with job postings for data scientist positions in Toronto, Canada, we often receive a considerable number of applications (e.g. 30-60 per position), but the vast majority of applicants have only casual training in data science. For example, most résumés we review list 1-3 online data science courses (e.g. Coursera, DataCamp). While this may be sufficient for some groups, our work requires data scientists with a truly deep understanding of data science. Further, some applicants may have trained in data science programs of questionable quality and may not have a solid grasp of data science principles. Additionally, we include soft skills in our definition of competence.
So how do we evaluate an applicant for their depth of understanding of data science and their ‘social fit’?
Our hiring process for data scientists in healthcare leverages principles from an excellent guide written by Jeremy Stanley at Sailthru. We very briefly outline our five components below.
Finding Candidates: Job Description and Shoulder-Tapping
A first step in identifying suitable data science candidates is a clear articulation of what’s needed. For example, data scientists have varying skillsets and interests in particular analytical techniques and foci such as neural networks, natural language processing, vision learning, simulation modeling, and optimization. Further, the roles and responsibilities may vary depending on the needs of the employer. For example, an academic unit may be more focused on methods development whereas a hospital setting may be more focused on application.
Job descriptions can be posted on numerous recruiting sites such as www.monster.com and www.workopolis.com. There are also numerous boutique recruitment firms that specialize in recruiting data scientists, such as www.haasandriley.com and www.butchworks.com. Another strategy is to disseminate job postings through relevant university distribution lists (e.g. computer science department). However, we have been most successful through good old-fashioned shoulder-tapping. Often colleagues you can trust will point you in the right direction to identify highly qualified data scientists.
We typically receive 30-60 resumes for every Data Scientist position we post. Unfortunately, the vast majority (typically 70-80%) do not appear to have the depth of understanding we require. Many applicants list an interest in data science and a limited number of online courses they may have taken, rather than a legitimate background in statistics and/or computer science. Rather than offering all candidates the opportunity to take a technical test, we often limit this phase of the interview process to candidates with some formal grounding in statistics and/or computer science or considerable demonstrated experience in working with data and analytics.
The technical interview consists of a data science problem the candidate must solve in a given amount of time. Typically, we invite approximately 10 candidates from the applicant pool to take this test and then assess their performance. We offer fixed blocks of time in which we email the candidate the test, which identifies the dataset to be used and the parameters of the test. The candidate must use one of several programming languages (e.g. Python) and reply via email with their code and results within a set timeframe. In our experience, 2-3 of the 10 invited candidates will successfully complete this technical test.
Candidates who successfully complete the technical test are invited for a one-hour interview to ‘meet the team’. Dr. Arthur Slutsky at our research centre is often quoted as saying ‘recruit for talent, hire for fit’. Consequently, our in-person interviews usually involve 5-7 people with a key focus on assessing social fit. We do, however, also assess the candidate’s ‘on-the-spot’ thinking through a case study during the interview process. We feel it’s important to keep the interview ‘light and friendly’ so the candidate is at ease and we can see their ‘real’ personality as much as possible. While we often find 1-2 suitable candidates at the end of the in-person interview stage, we sometimes need to re-post the position if there are no suitable candidates. It can be a long and tiring process!
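To put the figures quoted above side by side, here is a rough back-of-the-envelope sketch in Python. It only reuses the ranges mentioned in the text (30-60 applications, 70-80% lacking the required depth, about 10 invited to the test, 2-3 passing, 1-2 suitable after the interview). The function names and the way the ranges are combined are our own illustrative assumptions, not a formal model of the process.

```python
# Back-of-the-envelope hiring funnel using only the ranges quoted in the text.
# Function names and the pairing of low/high values are illustrative assumptions.

def screen_survivors(applications: int, fail_pct: int) -> float:
    """Resumes expected to pass the initial screen, given the share (in %) that fail."""
    return applications * (100 - fail_pct) / 100

def overall_yield(suitable: int, applications: int) -> float:
    """Fraction of all applicants judged suitable at the end of the process."""
    return suitable / applications

# Fewest survivors: smallest pool with the highest failure share; most: the reverse.
fewest_screened = screen_survivors(30, 80)
most_screened = screen_survivors(60, 70)

# Technical test: 2-3 of roughly 10 invited candidates pass.
test_pass_low, test_pass_high = 2 / 10, 3 / 10

# End-to-end: 1-2 suitable candidates out of 30-60 applications.
worst_yield = overall_yield(1, 60)
best_yield = overall_yield(2, 30)

print(f"Pass resume screen: {fewest_screened:.0f} to {most_screened:.0f} candidates")
print(f"Technical test pass rate: {test_pass_low:.0%} to {test_pass_high:.0%}")
print(f"Overall yield: {worst_yield:.1%} to {best_yield:.1%}")
```

Under these assumptions, the resume screen leaves a pool that is broadly consistent with inviting about 10 candidates to the technical test, and only a few percent of applicants reach the end of the process.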
Common challenges with assessing social fit during the in-person interview include the very short length of time we spend with the candidate and the sometimes pressured environment in which the interview takes place. For these reasons, we ask all candidates whom we invite for an in-person interview to take the Jung personality test, which is both easy to administer and easy to grade.
The personality type of the candidate should ideally align with the group dynamics of the existing team and organizational culture. In our experience, there have been occasions where technically competent data scientists were simply not good social fits with our existing team. Consequently, they were not offered a position on our team.
Some Helpful Resources
We encourage you to check out some of the helpful resources we’ve included on our website at http://www.chartdatascience.ca, including:
- Data Scientist job description (sample)
- Technical test for LKS-CHART candidates (sample)
- LKS-CHART Interview Guide (sample)
- Jung Personality Test
- Jung Personality Test Interpretation Guide
Recruiting data scientists is among the most important processes for any data analytics group and should be done thoughtfully and diligently. It can be a long and painful process, but it is well worth the considerable effort in the end!