“Big data” isn’t just an enterprise trend. It’s a technological innovation that is already making a difference in your life.
Police are mixing crime data and sociological information to anticipate incidences of crime. A small cadre of scientists in Silicon Valley is harnessing genetics data to detect early signs of disease. For business owners and harried IT executives, it’s easy to feel overwhelmed by the flood of so-called big data options on the market. That is, if you buy into the trend at all.
“Big Data” is one of the main themes of CloudBeat 2012, VentureBeat’s upcoming conference highlighting real cases of revolutionary cloud adoption. We’ll explore organizations’ current issues with huge datasets as well as the many solutions that vendors provide to these problems. Confirmed participants include Metamarkets, Cloudera, and Qubole. CloudBeat happens November 28-29 in Redwood City, Calif. Register today!
We believe it’s time to cut through the hype and show you some cool companies that use big data to further research in the fields of healthcare, law, government, and education. To assemble our inaugural list of 10 standout companies, we spoke with investors, analysts, and experts.
We narrowed it down to some ground-breaking favorites who helped define the field. You’ll notice that the companies cover a lot of ground: we anointed a leader in every category, such as data science, business intelligence, data visualization and analytics. We also threw a few impressive newcomers into the mix.
Meet our inaugural list of big data companies that are exploiting data in ways that you wouldn’t expect.
Above: Michael Driscoll, Metamarkets’ CEO.
It’s been one heck of a year for Metamarkets, a startup that analyzes things like tweets, payments, and check-ins for online publishers and web companies to better answer questions like: “Why are customers canceling their memberships?” or “How are users moving through the site?” In the spring, investors clamored to pour money into the San Francisco-based company, which currently has about $23 million in the bank. Aside from being freshly minted with venture funding, here’s why Metamarkets makes our shortlist.
It took guts to turn down an offer from Twitter: Michael Driscoll, who was Metamarkets’ chief technology officer at the time, wouldn’t back down from his vision to build a bigger company.
It has some cool customers: The technology is used by digital media companies like the Financial Times and AOL, which need better insights into the value of their ad inventory.
“At the customer level, we’ve learned to paint our product in more emotional and less technological terms. Ultimately, big data and its technologies are not the story, it’s about helping people do their job better,” says Driscoll, who is now Metamarkets’ chief executive.
It has one straight-talkin’ CEO: Driscoll told me that we need to reexamine all the hoopla around Hadoop, the open-source computing software that is leveraged by companies like Twitter to store terabytes of tweets per day: “It’s a service, not a solution,” he said. Driscoll told me he looks up to Aaron Levie, the 20-something CEO of cloud storage startup Box. He told me we need more founders who are willing to bring a natural-born salesman’s brand of personality and smarts to enterprise.
The technology is designed to be easy to use: Metamarkets provides a packaged Software-as-a-Service application so customers can get up and running in a few days. It scales as customer data volumes grow.
Bonus: Don’t miss Driscoll’s appearance at VentureBeat’s upcoming CloudBeat conference.
Above: Christian Chabot, CEO of Tableau
Tableau is a data visualization startup that appears on nearly every list of hot big data startups, but it hasn’t raised a funding round in four years. It is one of the small cadre of companies that are responsible for putting data visualization on the map. Here’s why Tableau made our shortlist.
It doesn’t need your VC cash: The startup’s last funding round in 2008 was for $10 million. It hasn’t taken a dime since then, and it’s on the road to profitability.
It’s changing the way we consume digital media: Tableau is a favorite for media folk experimenting with new ways to use data in their stories. Tableau offers a tool that lets anyone create gorgeous charts and maps for free, no programming expertise required. Reporters at the Wall Street Journal and the Huffington Post use it.
It’s putting data in the hands of people who need it: Big data can be useful for everybody in many ways; it’s not just a money-making tool for businesses. In the hands of a nonprofit like government watchdog California Common Sense (a current customer), Tableau’s software is analyzing volumes of political data to expose fraud and inconsistencies so voters can make informed, data-driven decisions before they vote.
Eat your heart out, Google: “Our mission, to help people everywhere see and understand data, isn’t all that different from Google’s,” Christian Chabot, the company’s cofounder and chief executive, told VentureBeat.
Hadoop for the Enterprise/Cloudera
Above: Mike Olson, Cloudera’s CEO.
Size does matter. When it comes to storing and processing gargantuan volumes of data, NoSQL can play well. HBase is Hadoop’s NoSQL database, and it is the most talked-about in big data circles. (Fun fact: Hadoop is an open-source project that was invented by Doug Cutting and is named after his son’s stuffed toy elephant.) Hadoop is still reaching maturity, but a company called Cloudera has emerged to lead the enterprise push, basing its product on Hadoop technology. Here’s why Cloudera made our list:
It has a rock-star founder: Jeff Hammerbacher, the company’s chief scientist, regularly appears in lists of the most promising young entrepreneurs. As a 23-year-old math genius, Hammerbacher arrived at Facebook when the company was still in its infancy. At the social network, he used Hadoop to reveal insights about user behavior. He would later start Cloudera to bring the benefits of this technology to the enterprise. Early advisers at Cloudera include the founders of the Hadoop project, Doug Cutting and Mike Cafarella.
It is using big data to improve your health: Expect Cloudera to drive more innovation in health care, an industry that could vastly benefit from data-driven insights. Hammerbacher has taken an active interest and is an adviser to health-care accelerator Rock Health. In addition, Mike Olson, Cloudera’s CEO, told me that one of his favorite customers is Explorys Medical, which collects vast volumes of patient information in a database and reveals new insights about treatment, quality of care, and which medical tests are most effective. “This is a serious big data problem with lots of information in great variety, prescriptions, images, doctor’s notes, literature about diseases, and so on,” Olson said. “You could share that information among all the doctors to drive quality of care.” Read more about how genome entrepreneurs are using your data to extend the average person’s lifespan.
It is loaded: Did I mention that this Bay Area-based startup has $76 million?
Upcoming: Cloudera’s Dr. Amr Awadallah is appearing at CloudBeat, VentureBeat’s cloud conference.
Big Data Analytics/ParAccel
Above: ParAccel’s CEO, Chuck Berger
In data circles, ParAccel has a reputation as the lone wolf, although observers are convinced that its time on the singles circuit won’t be for long. For now, it’s pushing innovation at a startup’s pace. It burst on the scene last year after putting its analytic database into the hands of U.S. law enforcement agencies. Using data from ankle cuffs and other sources, it tracked the behavior of 15,000 ex-cons — and alerted officials about a potential crime. Here’s why ParAccel made our list:
It’s proud to be scrappy: While its major competitors (Vertica, Netezza and so on) have been snapped up by the likes of HP and IBM, ParAccel says it intends to remain independent. According to the company, the value of being small is that you can offer flexible pricing on a deal-by-deal basis.
It’s a “pre-crime” pioneer: Last year, ParAccel partnered with SecureAlert to calibrate data from the ankle cuffs of recently released criminal offenders. Using ParAccel’s analytics tool, SecureAlert was able to identify patterns of suspicious behavior and alert the authorities. It’s highly reminiscent of the film Minority Report’s precogs: In fact, the company told Bloomberg that this is real-life “pre-crime detection.”
ParAccel disputes the Hadoop myth: ParAccel’s CEO Chuck Berger said that too many startups are working with a flawed logic: “big data = unstructured data = Hadoop.” He explained, “There is no question that there is an explosion in unstructured and semi-structured data, but the growth in structured data is exponential as well.”
For Berger, Larry Ellison is a hero: Berger told me that he’s been fortunate enough to work with Steve Jobs at Apple, but he’s equally inspired by Ellison (Oracle’s CEO) because of his ability to crack the “complex database” problem. Berger said of Ellison: “He created a great company from the ground up, as many of the principles we now accept as ‘table stakes’ were being developed and successfully competed against giants like IBM from the start.”
Above: Jeff Boehm, VP, Global Product Marketing
QlikTech is the provider of QlikView, a self-service business intelligence tool that you can use in a wide range of fields, such as scientific research or art. It was founded in Sweden in the late ’90s, survived the dotcom bust, and held its initial public offering in 2010. Today, the company has 26,000 customers and is estimated to be worth over $2 billion. What’s relevant and cool about QlikView today?
It’s working with Google to put big data software into the hands of developers: With Google’s recent launch of Google BigQuery, developers can gain insights from massive amounts of data without any hardware or software. To help developers with the analytics portion, QlikTech stepped up to develop a dashboard that visualizes and crunches millions of rows of data.
Too many startups are focused on the “if you build it, they will come” approach: “The trouble is that working with big data requires a specialized skill set and level of technological sophistication that ordinary business users don’t have,” Jeff Boehm, a vice president at QlikTech, told me. The company’s mission is to help nontechnical types use data in their ordinary working lives.
Its CEO is the second Swede to ever “push the button” on a NASDAQ IPO: CEO Lars Björk told Wired that it was one of those moments where he had to pinch himself. “This little company from Sweden was now all over billboards in Times Square, a pretty phenomenal experience.” He also took the opportunity to advise young founders to stay humble: “Don’t be a Mr. Know it all.”
Kaggle wants to “make data science a sport” and hosts competitions that challenge the world’s best researchers and statisticians. The idea is that the brightest minds can work together and/or compete against each other to produce complex algorithms and sophisticated solutions that use “big data” to further human understanding. The incentive? “Fame, fortune, or fun.” Here’s why Kaggle made our list.
Its findings will blow your mind: While most enterprise companies are afraid to move beyond their jargon-filled safe haven, Kaggle is not afraid to push buttons. Sponsored by the Hewlett Foundation, the essay-scoring competition revealed that an algorithm is no less reliable at scoring essays than the average human grader. This caught the attention of a reporter from the New York Times, who wrote a highly critical response.
The competitions are proving that algorithms are smarter than you ever imagined: In another competition hosted on Kaggle (sponsored by the Online Privacy Foundation), researchers found that you can use Twitter to detect signs of psychopathy. Most recently, the company helped Merck predict the toxicity of chemical compounds based on their molecular structure.
Its founders are whip-smart but personable: Twenty-eight-year-old founder and CEO Anthony Goldbloom (pictured) can always be counted on for a dose of refreshing honesty. Unlike most founders, he admits that raising funding “was one of the hardest things I’ve ever done,” and says that it’s still early days for big data. The greatest challenge, he explained, is “the market’s level of sophistication in thinking about data despite all the buzz.”
Above: GoodData CEO, Roman Stanek
From its early days, GoodData made the promise to its customers that it would help them extract dollars from their data. The company is interesting as it markets to business users (sales, marketing, business development) and bypasses IT executives. It was a bold move, one that appears to be paying off — the company recently raised $25 million led by Andreessen Horowitz for its third funding round.
It’s got crazy momentum: The company has more than 6,000 customers, which include Groupon, Zendesk, and Mint.com. In the second quarter of this year, bookings grew by 280 percent on a year-over-year basis. Last year, CEO Roman Stanek told me the company grew its revenues by a mammoth 600 percent.
It deliberately circumvents IT: “The only way that innovation gets into the enterprise is through business users,” said Stanek in a phone interview with VentureBeat. The cloud-based analytics service brings operational dashboards, metrics and performance reports, data storage, analytics, and collaboration tools to business users.
It’s tough: In 2007, the startup tried to raise money the day Lehman Brothers collapsed. “I like to think that tough times breed tough companies,” Stanek said.
Above: Josh McFarland, TellApart’s CEO
This company made our list not just because it bought a shipping container from Oakland and commissioned Apex, a famous graffiti artist, to scrawl “big data” across it (see the top image in this story). TellApart is the brainchild of ex-Googlers Josh McFarland and Mark Ayzenshtat. They are working with e-commerce companies to help them increase revenue by analyzing and targeting potential buyers based on their browsing behavior. It has $17.75 million in venture capital funding.
TellApart is making real revenue for its customers: “We build hardcore systems that tackle big data challenges for the likes of Nordstrom and Bed Bath & Beyond, and for that, we’re rewarded handsomely,” McFarland, the company’s CEO, told VentureBeat. It is also working with smaller or niche online retailers like eBags and Bellacor.
TellApart wants us to embrace big data: TellApart says that human-driven business rules are no longer sufficient. “Creating self-training algorithmic systems is the only way we’ll make use of it [data],” said McFarland.
It’s making those annoying Facebook ads more relevant: TellApart found that shoppers were 100 times more likely to click through a Facebook ad and buy a product from Shoebuy (one of the company’s clients) if they’ve looked at a similar item recently. On the Facebook Exchange, TellApart can match the exact pair of Vans you were interested in to the ad you see on Facebook. For Shoebuy, 12 percent of users will click through — which has dramatically increased the company’s bottom line.
The founders left with Google boss Larry Page’s blessing: McFarland told me he pitched an early version of the idea to Page, his former boss, during a flight. He said, “That’s a good idea. Google probably could not pull it off.”
Social Media Data/DataSift
Above: Rob Bailey, DataSift’s CEO.
The explosion of tweets and “likes” on social networking sites like Facebook and Twitter is valuable for company brands. DataSift, a cloud company, has $15 million in venture capital funding to do the heavy lifting; its technology can gulp down and analyze data across the social web. This helps companies get a handle on public opinion about breaking news and respond with targeted marketing messages.
It has a coveted licensing agreement with Twitter: Plenty of social media monitoring companies are listening to conversations on Twitter and gathering insights. With Twitter on board, DataSift is one of the few that can analyze historical tweets. Hypothetically, the company could compare how the public responded to news and events during the 2008 Summer Olympics with this summer’s games held in London.
Its founders pulled the plug on TweetMeme: I have high hopes for DataSift if it can live up to the success of TweetMeme, the site that was developed by DataSift’s technical lead, Nick Halstead. But it took nerve to shut down such a popular social media tool. According to DataSift’s founders, they pulled the plug because DataSift is already showing so much promise and has amassed over 10,000 users. TweetMeme was a favorite with developers, who mourned the loss when the news broke last month.
CEO Rob Bailey has this to say about the big data mega-trend: “I call bullshit on the term ‘big data.’ First of all, the big data problem has been around since before there were computers. Too many companies focus on managing big volumes of big data without focusing on end solutions and providing answers for customers,” Bailey said.
Above: Chris Neumann, cofounder of DataHero
DataHero has raised $1 million in funding for its mission to “democratize data” by making it easy for anyone to understand and visualize it. Lots of companies use this term, but few have found a way to take the expert out of the equation. Founder Chris Neumann is one of a half-dozen early employees at Aster Data who left to start something new. The team has found ways to stand out: They showed initiative in raising seed funding, and we applaud their effort to hire engineers.
Lots of talent: Neumann was an early engineer at Aster Data Systems. He teamed up with chief product officer Jeff Zabel, formerly of BMW, where he was responsible for designing user interfaces for incorporating third-party apps like Pandora.
The founders will go to extreme lengths to impress potential investors: Neumann told me that during an introductory Skype call with the Foundry Group’s Brad Feld, they received a data file. To test the speed of the algorithm, Feld asked them to create a visualization of his fitness data on the fly. Shortly after, Feld cut the founders their first check.
It is taking on Excel: Good news for junior-level analysts everywhere: Spending hours hunched over spreadsheets may one day be a thing of the past. “This may sound strange, but the competitor we’re most focused on is actually Microsoft Excel,” said Neumann, the company’s founder. According to Neumann, it’s still the “go-to product,” but it’s not equipped to make meaningful sense of the data.
It aims to bring design to the enterprise: According to Neumann, his cofounder “lives and breathes user-centric design” — rare for an enterprise-focused product. Zabel is an alumnus of the famed Stanford Design School – the “d.school” – and firmly believes that “technology must enable the design and the design must drive the technology.”