Data scientists are the rare alchemists of the data-deluged digital age.
They extract value out of the massive quantities of meaningless data that we generate every day, and competition for their skills is fierce. Job postings for data scientists ballooned by more than 15,000 percent between 2011 and 2012, and entry-level salaries start at $110,000 to $120,000.
Rather than choosing the comfort and security of a well-paid position at a large company, some of these coveted technicians choose to forge something of their own. A number of the most successful data startups have data scientists at their helm.
VentureBeat talked with a series of data scientists who went on to become founders and CEOs to learn more about how their background influences their leadership, product, and business strategies — and what it takes to succeed in an increasingly competitive, data-driven world.
All of these data scientists turned CEOs put a heavy emphasis on data in their own businesses. They make data a core part of their strategy, operations, and decision-making process. Ultimately, data is only as valuable as what you do with it, and as self-described “data nerds,” these CEOs have an edge.
Shashi Upadhyay has a Ph.D. in physics from Cornell. He analyzed humongous datasets as part of his doctoral research. After graduation, many of his classmates and colleagues took jobs on Wall Street, where they were well compensated for their knowledge of math and statistics.
Upadhyay accepted a position with McKinsey & Company (a global management consulting firm), where he spent six and a half years advising on sales and marketing problems before founding Lattice Engines.
“As the world has gone digital, data volume has exploded, and retailers tend to have humongous amounts of data,” Upadhyay said in an interview. “I realized there was an opportunity to connect the dots between my experiences. It is hard for most companies to put together a data-science team and compete in the war for talent, and I thought instead they would look for automated solutions.”
Lattice Engines bills itself as “big data for big sales.” Its platform analyzes data and delivers real-time reports with specific data to sales representatives, who can use the information to generate leads and close deals. The engine uses predictive analytics to help salespeople anticipate their customers’ behavior.
Upadhyay said that founders and execs of data companies must be the “masters” of three domains: They must have unique subject matter expertise, an understanding of machine learning, and the capability to build systems that can scale. And it’s rare to find a leader with a grasp of all three.
“If you spend all your life analyzing data, like I have, certain things become muscle memory,” he said. “You know what is important to the end users, what the problems are, what is doable, and how long something will take. What differentiates us is we have all three of those pieces.”
Upadhyay’s data science background is also important for recruiting data scientists to the Lattice team and ensuring they are productive and happy employees.
“Other companies make the mistake of bringing in data scientists and treating them like developers, but they are not the same,” he said. “Data scientists care about having an impact on the business, but companies systematically underinvest in training them in the domain and forming a linkage with other parts of the business.”
Thomas Thurston is something of a renaissance man when it comes to data science. He has an MBA and a law degree and is a member of Harvard Business School’s Forum for Growth and Innovation.
Thurston did a stint at Intel Capital, serves as the chief technology officer and fund manager of the Ironstone Group (a venture firm that uses data science to make investments), and is the founder and CEO of Growth Science, which uses data to predict whether businesses will survive or fail.
“I think of data science as a way of thinking about the world in terms of hypotheses, testing, confidence, and error margins,” Thurston told VentureBeat. “A background in data science tends to help CEOs ask better questions and get better feedback, because it brings conversations down to a level of reality and practicality. Facts, data, and probabilities can have a way of removing the ego, politics, and hand-waving from a conversation.”
Thurston said that he favors hard data over more intuitive considerations. Like Upadhyay, he is a proponent of challenging all ideas until they are backed up with data and not taking anything for granted.
“It’s not that I don’t value intuition or more ‘soft’ inputs – sometimes they’re so important they can override everything else,” he said. “It’s just all too often in data science that you see intuition, anecdote, and feeling get turned on its head by actual data. Everyone thought the world was flat. It looked that way. It felt that way. It was intuitive. It was also dead wrong. I find it corrective to try to keep this in mind. Like it or not, I can be wrong at any moment, so I must be willing to adapt.”
Thurston said his favorite “data science moments” are when he learns something that flies in the face of conventional wisdom, and that these can become significant commercial advantages.
However, data science and the advantages it brings have yet to make their way into more mainstream businesses. Both Thurston and Upadhyay expect that the evolution of this field will involve making data science more accessible to smaller, less tech-savvy businesses.
Gurjeet Singh likes to spend his free time discovering and analyzing datasets and building multilegged robots. But as a founder and the CEO of Ayasdi, he finds much of his time goes toward using big data to address complex global socio-economic issues.
“Data needs to be the centerpiece of every company’s strategy – it is the single largest, general, competitive differentiator,” he said. “Data scientists turn data into knowledge. They are in high demand, but we need more data scientists than ever before and are producing fewer. To fix this, we need to amplify data scientists to do more with less and empower more people to become data scientists.”
Singh has a Ph.D. in computational mathematics from Stanford. That’s where he met Ayasdi cofounder Gunnar Carlsson, who was researching how to use “topological data analysis” to solve social and economic problems.
Ayasdi arose out of a decade of DARPA-funded research. It has clients as diverse as oil and gas companies, which use data to optimize their drilling approach, and medical researchers, who use it to identify the genetic predispositions to many diseases.
“Ayasdi builds products that help our customers discover and monetize knowledge from data,” he told VentureBeat. “As such, being a data scientist myself helps in product direction because I can feel the end user’s pain. Also we use our own software to analyze our operations and make better decisions. In terms of leadership style, being a data scientist definitely forces us to measure everything and to make decisions based on data.”
Data has treated Brad Peters well. Before founding business intelligence startup Birst, he led analytics at Siebel Systems, which Oracle acquired for $5.8 billion in 2005. Peters saw an opportunity to build better business analytics software from the ground up and founded Birst later that year.
Birst is a “business intelligence” platform that brings together data from multiple sources. It presents data in reports, charts, and dashboards so people without tech backgrounds can make better use of analytical tools.
“There’s an overemphasis on the data scientist — the pure data scientist who knows lots of statistics, learning technologies, and algorithms has become a mascot,” Peters said. “But the more techy a data scientist is, the less connected they are with the business side of things.”
Peters echoed what Singh said about the role data should play, and is playing, in business strategy. He said that data can play a role at all stages of a business, and part of what Birst does is present a cohesive picture of the entire business, even “non-techy” areas.
He said businesses that don’t use data are dying because they only see “insights” once they are in the “rearview mirror.” Like his fellow data scientist/CEOs, he likes to work on personal data science projects in his spare time to better understand his customers and keep his skills sharp.
Not all the data scientist-led startups are building big data applications, products, and services. Sebastian Thrun, who led the integration of big data into robotics, is the founder of ed-tech startup Udacity.
“Bringing data science to the field of education is extremely exciting, because education as a field has enjoyed very little of it in the past,” Thrun told VentureBeat. “Professors rarely measure the effectiveness of their teaching at a fine-grained level, beyond student evaluations. Through Udacity, this is now possible.”
Thrun has had an illustrious career. He has a Ph.D. in computer science and statistics and started Carnegie Mellon’s master’s program in knowledge discovery, which bridges the statistics and computer science departments. This program evolved into America’s first Ph.D. program in machine learning.
He went on to become a professor at Stanford, served as director of the Stanford Artificial Intelligence Lab, and then worked at Google, where he founded Google X.
He’s now bringing this expertise to bear in an effort to improve higher education and adapt it to the modern era.
Udacity offers free online, project-based courses taught by professors and industry experts. The company’s mission is to make education more accessible and affordable for students and to encourage lifelong active learning. Technology and data science are a core part of the mission.
“We do A/B tests in which we randomly assign students to one version of our classes, and others to another,” Thrun said. “Within a day we can measure the difference. Such immediate data and feedback never really existed in traditional classrooms, and it is not in the DNA of many professors. Udacity is big data changing higher education.”
Udacity also offers data-science courses designed by experts from Cloudera, Facebook, and MongoDB, and it plans to offer more in the coming months. Thrun said that data science is a critical part of any business and that courses like these can help managers and engineers fill the need for those skills.
Like the other CEOs we interviewed, Thrun said that his background means Udacity places a heavy emphasis on data for its day-to-day operations.
“I’m very open to input and additional data from anyone at the company — even qualitative data is data,” Thrun said. “I believe there are some aspects where pure data will not provide the perfect answer and intuition is more important, especially when it comes to innovation. The advantage lies with those who can combine creativity and intuition with the data at hand.”
It’s not [just] size that matters
Sexy or not, data is now seen as a powerful force that can predict the future. It can provide unprecedented insight into what formerly relied on gut feelings and luck, and the data-obsessed mindset is spreading far beyond the business world.
Health trackers and apps like Fitbit, MapMyFitness, and MyNetDiary give people data on their physical activity and calorie intake to help them achieve fitness goals and lose weight. Home devices and apps like the Nest “smart” thermostat use data to keep energy costs down. DigitalGlobe collects data to create a “heat map” of conflict in Africa. Data can even help people find their soulmate and predict sports and election outcomes.
However, as all the CEOs were careful to say, data alone is not enough. Human inspiration and input still have an important role, especially when it comes to innovation. It’s striking the right balance that is key.