When I started learning about data science and considering it as a career, there was a diagram that I came across regularly, and still come across today, in articles and even textbooks aimed at introducing the world to the “sexiest job of the 21st century.” First created by Drew Conway, it illustrates the three broad skill groups you need to be a data scientist.
Data science is a new career for the age of Big Data (whatever that means this week), but you can see that it’s at the intersection of qualities many people have been developing for years. As a graduate of the Science to Data Science (S2DS) summer school, I know people who have come to data science from a wide variety of backgrounds and found a new niche for themselves.
However, I believe there’s something missing from this picture: communication, a vital skill that comes in many forms and needs constant practice and adaptation to the situation at hand.
This isn’t just a “soft” or “secondary” skill that’s nice to have. It’s a must-have for good data scientists.
As I mentioned, many data scientists are coming into commercial jobs directly from academic positions or after short intensive courses. You go from a situation where you are surrounded by peers who are also experts in your field, or who you can safely assume have enough background to keep up with you, to one where you might be the only expert in the room and are expected to explain complex topics to people with little or no scientific background.
Not only that, but in an academic setting you can reliably predict the types of information people need or the questions they’ll ask: rationale, methods, results, conclusions, and so on. It’s a strict, linear way of thinking. As a new data scientist, or even a more experienced one, how are you supposed to predict what those strange creatures in sales or marketing might want to know? Even more importantly, how do you interact with external clients, whose logic and thought processes may not match your own? How do you manage up, across, and down?
My answer to this is not to predict or guess what people will want to know but to try to adapt my communication style to fit the person I’m talking to — and to listen to them and their needs. There is no point confusing someone with a long, detailed answer when all they want is a yes or no, or one number.
For example, over the last year at Qriously we’ve been developing our Audience Segments product, and from the beginning it’s been a big challenge in communication. Once we had our mathematical framework, how we communicated these ideas to the dev team who have built the computing capabilities was very different from how we talked with the sales team who must go out and sell it. Listening to their questions helped me understand their needs and focus our discussions as well as define the right metrics to measure our performance.
To be open and transparent with clients, we can’t just explain the potential of our products; we need to explain the possible pitfalls as well. We have started running “under-the-hood” sessions with some of our clients, where they visit our offices and we talk about some of the more technical aspects of what we do. These are informal sessions, though, so people don’t want a math lecture or a discussion of coding practices. In this case, the challenge is finding the right balance between formality, detail, and enjoyment.
If any of this communication had failed, our product would never have got off the ground. It has all made me appreciate how vital communication is for a data scientist. I can learn about as many algorithms or cool new tools as I want, but if I can’t explain to anyone why I might want to use them, then it’s a complete waste of my time and theirs.
Considering this, I propose a slight modification to Drew Conway’s original diagram. The original three skill sets still stand, but now we include a fourth skill that is critical in order to be a successful data scientist.