
This startup wants to make your cloud-based data science work a little easier

Image Credit: jDevaun/Flickr

Data scientists generally like using open-source tools. But setting them all up every time can get tiresome.

A startup called Yhat understands. Yhat has pre-loaded lots of libraries for the popular R and Python languages into software that data scientists can rent by the hour atop Amazon Web Services’ public cloud.

“The Ubuntu-based application stack ships with hundreds of well-known packages for R and Python, RStudio Server and IPython Notebook, and a dead-simple package and environment management system,” Colin Ristig, Yhat’s associate product manager, wrote in a blog post this morning announcing the release of the software, dubbed Sciencebox.

The tool fits in with a class of applications emerging to help data scientists, who are often highly valued and well paid by companies big and small, but still few in number in comparison with, say, salespeople.

Domino Data Lab and Sense have been developing tools that help multiple data scientists work together on data and models. And with the launch of cloud-based software from Mode Analytics, data analysts now have a way to see what their colleagues have already done, which can save time.

Yhat’s software is intended less as a collaboration tool and more as a way to speedily build predictive models and embed them into existing applications. With Sciencebox, that process should be less burdensome, since data scientists can spend less time on setup before running complex computations in Amazon’s cloud.

With Sciencebox, data scientists can quickly start using Ggplot, Matplotlib, NumPy, and lots of other libraries.

Not including the cost of Amazon’s cloud infrastructure, Sciencebox prices start at 4.4 cents an hour.
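To put that rate in perspective, here is a rough back-of-the-envelope sketch in Python. The 4.4-cent figure is the starting price quoted above. The function name and the example infrastructure rate are hypothetical and used only for illustration, since actual Amazon charges depend on the instance type chosen.

```python
# Sciencebox's starting software fee, as quoted in the article.
# This excludes Amazon's own infrastructure charges.
SCIENCEBOX_RATE_USD_PER_HOUR = 0.044


def estimate_cost(hours, infra_rate_usd_per_hour=0.0):
    """Return (software_fee, infrastructure_fee, total) in US dollars.

    infra_rate_usd_per_hour is a hypothetical placeholder for whatever
    Amazon charges for the underlying instance; it is not a real price.
    """
    if hours < 0 or infra_rate_usd_per_hour < 0:
        raise ValueError("hours and rates must be non-negative")
    software_fee = hours * SCIENCEBOX_RATE_USD_PER_HOUR
    infrastructure_fee = hours * infra_rate_usd_per_hour
    total = software_fee + infrastructure_fee
    return round(software_fee, 2), round(infrastructure_fee, 2), round(total, 2)


if __name__ == "__main__":
    # A 40-hour week at the starting rate, with an assumed $0.10/hour instance.
    software, infra, total = estimate_cost(40, infra_rate_usd_per_hour=0.10)
    print(f"Sciencebox: ${software:.2f}, infrastructure: ${infra:.2f}, total: ${total:.2f}")
```

At the starting rate, a 40-hour week of use works out to $1.76 in Sciencebox fees before any Amazon charges are added.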

New York-based Yhat has taken on seed funding from Boldstart Ventures, Contour Venture Partners, and RRE Ventures, Brooklyn reported in December.
