Companies that make investments in big data see significant positive returns, but it takes them a while to figure it out. Moreover, many companies are getting tripped up by hype about needing to use Hadoop when they should really be using simpler technology.
Those are some of the views expressed by experts on big data at the Interop IT trade show last week in New York.
Leveraging data analytics gives huge advantages to companies, these experts say, citing industry reports. Hadoop is one of the more popular technologies that let large organizations run queries to glean intelligence about what their data says about their business or operations. It boasts strong performance and is less expensive than legacy data analytics technologies because it can be distributed across commodity servers.
Netflix, Amazon, and LinkedIn, for example, implemented such big data efforts early and have done well, according to a report by Tata Consulting. But these companies were all particularly dependent on web data analytics for their business models, and they started years ago. Many other companies are struggling with big data because they have multiple data sources and business processes, and it’s harder to fit them all into a single data infrastructure.
Some of the nation’s largest retailers, for example, have sought to store and run all of their data on Hadoop, but they came to realize they didn’t have as much actionable data as they wanted. These companies don’t like talking about their projects publicly, but their problems are being whispered about in data analyst circles.
“The hype of using one technology got ahead of reality of organizations’ ability to drive change,” according to Paul Ross, the VP of product at Alteryx, who spoke on a big data panel at Interop that I moderated.
Hadoop can feel static for some businesses needing to crunch real-time data, because it relies on batch processing. It also requires training in how to use MapReduce, the programming model behind Hadoop. Few companies have access to the data scientists needed to do such tasks properly. While large companies like IBM, Oracle, and SAP are working hard to provide sophisticated data infrastructure for companies, their offerings often require retraining the workforce and long IT projects to integrate data from various sources. “They don’t have time for this,” Ross said, referring to the majority of companies that can’t wait for results.
Ross’ company, Alteryx, helps business leaders make decisions based on their data no matter where it is stored, in Hadoop or elsewhere. Alteryx, he said, is part of a wave of new vendors, including Cloudera, MongoDB, and Tableau, that are enabling companies to move more quickly by helping them draw data from their existing business processes, whether it concerns user demographics, the website, the supply chain, or other ERP processes. These vendors then enable companies to combine that data with new sources of insight, like big data, and analyze it before viewing the results as visualizations.
For example, a marketing manager can use Alteryx to assess whether her social media initiatives have affected the company’s website sales efforts. That marketer can’t wait for her company’s chief information officer to finish implementing a large data infrastructure project of the kind run by Oracle or SAP, Ross said.
David Parker, SVP of Big Data at SAP, who also spoke on the panel, didn’t contradict Ross’ central point. However, he responded that SAP is also working with customers to pull data from their disparate sources, and he agreed that Hadoop alone won’t suffice. Only 13 percent of all companies that consider using Hadoop actually go on to use it in production, Parker said.
Datameer, meanwhile, is working with businesses to help them use Hadoop more simply, enabling them to do big data analytics without needing any training in Hadoop MapReduce. It lets them build on Excel-like spreadsheets and perform sophisticated data mining in a WYSIWYG format, without any coding. It requires end-user training of just one day.
Hype around specific technology is preventing companies from making the right decisions, the experts agreed. Signs abound that Hadoop is getting more than its share of hype, for example. Hadoop attracts about 90 percent of all mentions in social media conversations about big data, while other technologies get only about 10 percent, according to Steve Francia, the chief developer advocate for MongoDB, who also spoke on the Interop panel.
But among actual decision makers, the engineers who use big data technology at their companies, mentions on social media are more evenly split: only 45 percent are about Hadoop, while 45 percent are about MongoDB’s NoSQL technology and 10 percent are about other technologies, including NoSQL competitors like Cassandra, Francia said.
MongoDB’s NoSQL technology has been widely adopted by developers because of its ease of use. Last week, the company raised $150 million from investors who said they think NoSQL is poised for significant growth. In one swoop, the investment gave MongoDB more funding than the two commercial leaders of Hadoop, Cloudera and Hortonworks, have raised combined. It’s one sign that the market hype around Hadoop may already be correcting itself.