While plenty of companies have reservations about floating their most private data up to the cloud, public cloud providers such as Amazon Web Services have been introducing new big-data tools one after another to entice companies to quickly store and process more of their data jewels externally.
The latest example of cloud providers trying to catch the attention of businesses with new big data products comes from Amazon cloud competitor GoGrid, which today announced a tier of service called Raw Disk Cloud Servers. They allow companies to store huge collections of data right next to the computer chips that process the data.
Raw Disk Cloud Servers can pair well with Hadoop, an increasingly popular ecosystem of tools for storing and analyzing lots of different kinds of data. The Hadoop file system can replicate a piece of data on three different disk drives to ensure the data will be kept available even if one of the disks goes down, and GoGrid can ensure that big clumps of data can all sit on three of these new volumes. Each disk can handle 4 terabytes.
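To make the replication math concrete, here is a minimal sketch of how much unique data fits on a set of these volumes once every piece is stored three times. The 4-terabyte disk size and the three-copy figure come from the description above. The function name and the assumption that replicas spread evenly across disks are illustrative, not GoGrid's or Hadoop's actual accounting.

```python
DISK_TB = 4       # per-disk capacity cited in the article
REPLICATION = 3   # number of copies Hadoop keeps, as described in the article


def usable_capacity_tb(num_disks, disk_tb=DISK_TB, replication=REPLICATION):
    """Return the TB of unique data that fits when each piece is stored `replication` times.

    Hypothetical helper: assumes replicas spread evenly and ignores file system overhead.
    """
    if num_disks < replication:
        raise ValueError("need at least one disk per replica")
    raw_tb = num_disks * disk_tb
    return raw_tb / replication


if __name__ == "__main__":
    for disks in (3, 6, 12):
        print(disks, "disks hold", usable_capacity_tb(disks), "TB of unique data")
```

Under these assumptions, three 4-terabyte disks hold 12 terabytes raw but only 4 terabytes of unique data, which is the cost of surviving a disk failure.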
Prices for the service begin at 60 cents an hour or $2,628 per year.
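For a rough sense of how the two starting prices compare, the sketch below annualizes the hourly rate and finds the number of hours at which the yearly price becomes the cheaper choice. It assumes both figures apply to the same server configuration and that a year has 8,760 hours; neither assumption is confirmed by GoGrid, and the function names are illustrative.

```python
HOURLY_CENTS = 60           # 60 cents an hour, from the article
ANNUAL_CENTS = 262_800      # $2,628 per year, from the article
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years (assumption)


def annualized_hourly_cents():
    """Cost in cents of running at the hourly rate for a full year."""
    return HOURLY_CENTS * HOURS_PER_YEAR


def break_even_hours():
    """Hours of hourly-rate use that cost as much as the yearly price."""
    return ANNUAL_CENTS // HOURLY_CENTS


if __name__ == "__main__":
    print("Hourly rate for a full year: $%.2f" % (annualized_hourly_cents() / 100))
    print("Yearly price:                $%.2f" % (ANNUAL_CENTS / 100))
    print("Break-even at %d hours" % break_even_hours())
```

Working in whole cents avoids floating-point rounding. On these assumptions, a server running around the clock at the hourly rate would cost $5,256 a year, twice the yearly price, so the yearly option pays off after 4,380 hours, or about half a year of continuous use.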
GoGrid previously offered dedicated servers, as well as slices of servers backed by fast solid-state drives (SSDs), but this option is better suited to Hadoop’s needs. “We needed to ensure customers could continue to tune their applications as needed. This is why we created the Raw Disk Cloud Server,” Kole Hicks, GoGrid’s senior director of products, told VentureBeat by email.
In November, GoGrid announced an option to quickly and easily set up Hadoop and the HBase nonrelational database, relying on a partnership with Cloudera, a company with an open-source distribution of Hadoop. That option is still in early-adopter status. Once it becomes available for anyone to try, it and the new Raw Disk Cloud Servers could make for a compelling pairing for companies wanting to move more of their analytics data to a public cloud.
The thing is, GoGrid, which launched in 2008, holds only a small fraction of the public cloud market. The company claims more than 15,000 customers, including Condé Nast Digital and game developer Harmonix. But larger public clouds already offer Hadoop-in-the-cloud options. Amazon, which runs the largest public cloud, has been widening its big data portfolio with a commercially supported version of Cloudera’s Hadoop distribution and support for Impala, a tool to quickly query data sitting in Hadoop.
Cloudera has also recently brought Hadoop to other public clouds, such as IBM’s SoftLayer, Verizon Terremark, and CenturyLink’s Savvis.
Public cloud providers can’t necessarily reap huge revenues from these options, though, for a couple of reasons. First, some companies believe Hadoop is not reliable or secure enough, even though it’s been around since 2006. And second, some companies are reluctant to use public clouds in the first place, preferring to keep using their own, internal data center infrastructure. Revelations last year about the National Security Agency’s ability to look at certain data sitting in clouds have not helped the cloud cause.
So it’s a waiting game for Hadoop-in-the-cloud to take off. That should happen eventually. At least now GoGrid looks more prepared to handle big deployments.