REDWOOD CITY, Calif. — Data scientists are hard to come by as it is. Now industry luminaries are raising the bar and talking up the need to find creative data scientists.
During a panel at VentureBeat’s 2013 DataBeat/Data Science Summit event yesterday, Peter Skomoroch, former principal data scientist at LinkedIn, encouraged companies to collect more data.
He gets excited when companies go beyond just monitoring data and build it into workflows, he said, so that “data begets more data.” Skomoroch envisions a world not too far in the future where balance sheets will track companies’ data assets.
But he and other panelists don’t just want more data to analyze. They discussed the importance of creativity as a key trait to look for in people who work with the data. That means relying on proven algorithms might not always cut it.
On the data science competition site Kaggle, some people who do well tend to “spend all their time being creative” as they comb through and pull ideas out of the data they’re given, said Jeremy Howard, Kaggle’s former president and now a data science faculty member at Singularity University.
Howard likes to just dive into data and start getting hunches about it, without knowing about the industry the data comes from and other context that others would find valuable.
“That way, there’s no blinders,” he said. It might come across as a contrarian view, but Howard thinks his approach is one reason he did well in Kaggle competitions.
Data scientists should be willing to get intimate with the data that’s available to them, said Monica Rogati, vice president of data at Jawbone.
“People are not really as informed about what’s going on at the individual data point level, and you really need to get your hands dirty and look at those individual data points,” she said. “There’s really no substitute for that.”
Rogati said each applicant for data science jobs at Jawbone gets three hours to make sense of mixed-up company data sets. The test can reveal if candidates possess “applied skills,” she said, not just statistical know-how.
But sometimes data scientists don’t have all the answers, and being open to ideas from others can be useful. A good example came from Pete Warden, founder and chief executive of Jetpac, a company with a new application that can detect smiles in pictures and determine how happy people are in a given place. Warden mentioned a surprising insight that people following standard operating procedure probably wouldn’t have discovered. Detroit might well have been among the top 10 cities for smiling “because of the meth,” Warden said.
“We actually spoke to a local TV reporter on that, and she said it’s because of the meth,” he said, prompting laughter in the crowd.
Surely Warden can go back to the office and run some tests to see whether there’s a correlation worth talking about.
Rogati said she thinks it’s interesting that sometimes in academic work, unusual hypotheses result from mere accidents. It could be that accidents have the power to give companies brilliant insights, too. So data scientists would be wise to appreciate accidents and stay open to ideas they hadn’t initially considered.
Data scientists also ought to have the drive to find out how things work. Skomoroch recalled an exercise he did in a chemistry class in grade school involving a box through which you could push marbles. Figuring out what was inside, and what was making a marble come out of a box where it might not be expected to, is the sort of skill that can get overlooked when talking about desirable traits for data scientists, Skomoroch said.
Finally, data scientists shouldn’t forget about the power of their eyes. “My favorite algorithm is the vision, because it’s just so powerful, and, believe it or not, it’s underestimated,” Rogati said.
It sounds like these data scientists are convinced that data and the technology to process and store it are really just a foundation. As more companies clear space and shell out for data scientists, competitive advantages might just stem from the people, not the machines.