IBM is making Spark available as a cloud service on its Bluemix cloud platform. The company is releasing its SystemML machine-learning software under an open-source license for the Spark community. IBM will open a Spark Technology Center in San Francisco. And IBM said it will teach Spark to more than 1 million data scientists and data engineers and direct more than 3,500 researchers and developers to work on projects involving Spark.
Legacy tech vendors have been slow to embrace Spark — which many see as a successor to Hadoop open-source big data software — but that’s just one reason why IBM’s set of moves today is significant.
Data has been an area of interest for IBM, and the company has moved to productize Hadoop, but Spark until now has not been a priority. IBM in recent years has bet large amounts of money on areas like the Internet of Things, software-defined storage, and Watson. Now big data is once again a focus, even if there’s no dollar amount at the top of today’s press release.
IBM’s efforts represent a potential competitive threat to San Francisco startup Databricks, which claims to have committed more than 75 percent of the code added to Spark in the past year. The main commercial product from venture-backed Databricks is a cloud service that runs on top of the Amazon Web Services public cloud. IBM bringing out Spark on Bluemix equates to a direct attack on Databricks. And if IBM can get its people committing a sizable amount of code to Spark, that, too, could challenge Databricks.
But perhaps the biggest impact here is the coming increase in adoption of Spark in general. Big Blue providing Spark could help the project look suitable for big businesses and not just for startups.
“In the enterprise, I’m seeing almost no Spark adoption,” Nick Heudecker, a Gartner analyst, told VentureBeat in an interview last month. Going forward, that should change.