OK, so you want to become a data scientist. You probably don’t want to just change your job title. You probably want some new skills.
A new startup called Data Origami might be able to help you. In recent weeks, it has launched an initial series of screencasts on data science techniques using the Python programming language and associated libraries. More screencasts are on the way, and in the future, the focus could expand to data tools outside the Python realm. But however the site evolves, what will stay the same is the goal of helping more people rely less on gut-instinct decision-making.
“I think the most important thing is understanding uncertainty,” Cam Davidson-Pilon, Data Origami’s founder, said in an interview (a video interview, naturally) with VentureBeat. “It’s not believing what your eyes see. Too often, people make inferences, not even doing statistics, but just, like, mental inferences, that are wildly off.”
And that could change for developers willing to spend some time with Data Origami’s screencasts.
As data science becomes trendier and more interesting to companies big and small, nonprofits, and venture capital firms, demand for data scientists is going up. Educational options are proliferating, too.
You could go back to school. For a shorter span of time, you could join a months-long training program in a city like Berlin, New York, or San Francisco. Or you could take classes online through a massive open online course site like Coursera.
But before you can use those digital gadgets, you need to know the basics. And Data Origami can help there, as an alternative or a supplement to existing options.
Davidson-Pilon has crafted screencasts on topics including Bayesian A/B testing and Python libraries like Patsy. Lessons on Bayesian models and other subjects are on the way, with two new ones coming each month, he said.
Davidson-Pilon launched Data Origami late last month and started promoting the site on online forums. Since then, plenty of people have started paying for the $9-a-month subscription, which gives access to all screencasts and the applicable code and data.
In the future, Data Origami could branch out and start offering video lessons in the widely used SQL query language for data analysis or even the open-source Spark engine and tools for processing lots of different kinds of data.
And perhaps Data Origami will broaden and add a cloud-based service for going through the lessons, said Davidson-Pilon. He was the main author of the 2013 e-book “Probabilistic Programming and Bayesian Methods for Hackers” and is the creator of Lifelines, a Python library for survival analysis.
For now, people can simply follow along with the screencasts by downloading corresponding IPython notebooks that contain the lessons.
And while people are signing up for Data Origami, Davidson-Pilon doesn’t have immediate plans to leave his current job as the product analysis team lead at commerce company Shopify. That work has been interesting, but “right now I have essentially one client,” he said. Things could get more exciting, he said, as he starts teaching others some of the smarts he’s collected on his own.