Prominent data scientists Hilary Mason and DJ Patil know a good bit about the trendy practice that combines statistics, programming, and communication. In an ebook out this week, the two spell out some of the common elements of the most helpful technologies for actually doing the work of a data scientist.
For those seeking to expand their capabilities and add “data scientist” to their LinkedIn profiles, the ebook, Data Driven: Creating a Data Culture from O’Reilly Media, is worth a look. It should also be useful for entrepreneurs building technology aimed at data scientists.
In their book, Mason and Patil say the best data science tools are:
- Powerful. Specifically, they ought to be programming languages, not just dashboards.
- Easy to use. That means people should have no trouble learning the ropes. And educational materials should be widely available, too.
- Supportive of teamwork. They should make it easy for several people to analyze data together. The ability to reproduce results is key here.
- Backed by a community. Lots of people should already have rallied behind the tool and be actively using it. That’s certainly the case with some open-source projects.
So what might meet these standards? Well, the R and Python programming languages come to mind, for starters.
Aside from the commentary about tools, Patil and Mason’s new book also discusses the best way to organize data science teams within organizations, as well as important questions for data scientists to ask.
The authors know what they are talking about. Patil, formerly of LinkedIn and now at Salesforce, is credited with co-coining the very term “data scientist.” Mason, for her part, spent nearly four years as chief scientist at link-shortening company Bitly and was a part-time data scientist in residence at Accel Partners.