Intel moves into 'big data' software with Apache Hadoop distribution

Hoping to get behind the "big data" trend in the enterprise, Intel said today that it would launch open source data center software of its own in an effort to speed processing for big data applications.

Big data is a buzzword in the enterprise these days, with the potential to transform business models and shake up industries such as health care. But it involves a huge amount of data processing. Running an app against a terabyte of data on an Apache Hadoop distribution can sometimes take four hours, said Boyd Davis, vice president of Intel's architecture group.

"We're in an era of generating huge amounts of data," Davis said. "But the key is what we get out of that."

So Intel is launching its own Apache Hadoop distribution that is optimized for its hardware. Combine Intel's latest Xeon server processors, solid-state memory, and Ethernet running at 10 gigabits a second, and the four-hour processing job can be reduced to just 7 minutes, Davis said. Intel's Apache Hadoop distribution will be commercially available, so its competitiveness will be tested in the market against rival solutions, Davis said.
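To put those figures in perspective, here is a rough back-of-the-envelope sketch of the implied speedup and data throughput. It uses only the numbers Davis cited; the assumption that a terabyte means 10**12 bytes, and the function names, are ours for illustration and do not come from Intel.

```python
# Figures quoted in the article. Treating 1 TB as 10**12 bytes is an assumption.
DATA_BYTES = 10**12
BASELINE_SECONDS = 4 * 60 * 60   # four hours
OPTIMIZED_SECONDS = 7 * 60       # seven minutes


def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the second run is than the first."""
    return before_s / after_s


def throughput_mb_per_s(data_bytes: float, seconds: float) -> float:
    """Return average throughput in megabytes (10**6 bytes) per second."""
    return data_bytes / seconds / 1e6


factor = speedup(BASELINE_SECONDS, OPTIMIZED_SECONDS)
before = throughput_mb_per_s(DATA_BYTES, BASELINE_SECONDS)
after = throughput_mb_per_s(DATA_BYTES, OPTIMIZED_SECONDS)

print(f"speedup: about {factor:.0f}x")
print(f"throughput before: about {before:.0f} MB/s")
print(f"throughput after: about {after:.0f} MB/s")
```

By this arithmetic the claimed improvement is roughly a 34-fold speedup, averaged across the whole cluster rather than any single machine.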

"We want to accelerate Hadoop," he said.

Intel wants to provide data for analytics purposes, and it wants to provide a framework to connect and manage devices within an entire corporation in a scalable manner. And Intel will bake Hadoop directly into its server chips so it can improve security via encryption.

Davis said that the Intel Hadoop distribution should be simple for IT managers because it is configured to take the guesswork out of the process. Intel has a couple of dozen partners to help deliver its software into the market, including Cisco, Dell, and SAP. Intel is also investing in smaller big data companies, including MongoDB and Guavus Analytics, to build new analytics on Apache Hadoop.
