VentureBeat’s data science event, DataBeat, showcased great examples of how data science is raising the expectations for faster and better insights.
This year we devoted as much stage time as possible to use cases and provocative panel discussions, so we didn’t host one of the Innovation Showdowns that we usually include in our events, where our jury selects the best startup pitches out of a handful of finalists. However, we still wanted to tip our hat to a few companies that we found were doing a great job of powering data analysis in new and potentially disruptive ways.
Among the many lessons of the conference, one that stood out was the way informed human users and powerful machines can combine to achieve competitive results. Sorry, Skynet fans: This revolution will be partly driven by powerful computers, but only a few specific applications can afford to rely solely on algorithms. A much larger area of focus consists of facilitating the human decision-making part, once the data has been qualified.
We could see two distinct approaches emerging in the applications we received.
The first approach looks at how businesses capture, store, and process data, then works at making the corresponding tools faster, leaner, and stronger, to accommodate the ever-growing amounts of available data and cut the time needed to act on it. In this approach, you measure outcomes in terms of technical performance and speed. We call this approach infrastructure augmentation.
The second approach looks at how to democratize these tools by letting more business users benefit from packaged intelligence solutions, and by leveraging human skills where people still outperform computers. The techniques in this category range from picking suggestions after algorithms have identified correlations in the data, to making the insights more visual for easier discovery, to enabling collaboration on the business data. In this approach, the outcome is measured in business performance and speed. This is what we call intelligence augmentation.
Here are some of the most promising companies we found in each of these two categories.
As the information available to a business grows, so too does the index used to query that data store. ParStream developed a parallel index that lets queries run on structured and unstructured data alike, without the need for decompression, cutting the time and storage needed to mine even the biggest heaps of data.
Another company working on the index challenge, Tokutek, offers a distribution of MongoDB that saves up to 90 percent of the disk space needed to store the data, and delivers faster write throughput along with concurrency control.
While focused on the retail and mobile advertising verticals, Placed provides an interesting example of how technical know-how ends up providing cleaner, sounder data. By matching mobile ad impressions with visits to physical stores through anonymized hashed device IDs, Placed is closing the gap in mobile conversion attribution.
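For readers curious what matching on hashed device IDs can look like in practice, here is a minimal Python sketch. It is not Placed's actual method: the salt, the function names, the day-number timestamps, and the seven-day attribution window are all assumptions made for illustration.

```python
import hashlib

# Hypothetical salt; a real system would keep this secret and manage it carefully.
SALT = "example-salt"


def hash_device_id(device_id):
    """Replace a raw device ID with a salted SHA-256 digest."""
    return hashlib.sha256((SALT + device_id).encode("utf-8")).hexdigest()


def attributed_devices(impressions, visits, window_days=7):
    """Return the hashed IDs that visited a store within `window_days`
    after seeing an ad.

    `impressions` and `visits` are lists of (hashed_id, day_number) pairs.
    """
    # Collect every day on which each hashed ID was shown an ad.
    shown_days = {}
    for hashed_id, day in impressions:
        shown_days.setdefault(hashed_id, []).append(day)

    matched = set()
    for hashed_id, visit_day in visits:
        days = shown_days.get(hashed_id, [])
        # A visit counts if it falls inside the window after any impression.
        if any(0 <= visit_day - shown <= window_days for shown in days):
            matched.add(hashed_id)
    return matched


# Made-up sample data: three devices saw an ad, three devices visited a store.
impressions = [
    (hash_device_id("device-a"), 1),
    (hash_device_id("device-b"), 2),
    (hash_device_id("device-c"), 3),
]
visits = [
    (hash_device_id("device-a"), 4),   # inside the window
    (hash_device_id("device-b"), 20),  # too late to attribute
    (hash_device_id("device-d"), 5),   # never saw the ad
]

matched = attributed_devices(impressions, visits)
exposed = {hashed_id for hashed_id, _ in impressions}
conversion_rate = len(matched) / len(exposed)
```

The point of the sketch is that both data sets are joined on the digest alone, so the raw device ID never needs to appear in the attribution step.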
An increasing number of businesses rely on subscription models to sell their products and services. This means these companies not only have to worry about customer acquisition, they also need to limit churn. Gainsight equips “customer success” teams with easy-to-use dashboards that display indicators and alerts of potential churn based on both human-entered confidence grades and patterns mined from CRM data.
When the database is large enough that it surely holds the answer, all it takes is being able to ask the right question. DataRPM uses natural language processing to bypass queries and give non-technical users access to data. It also makes suggestions based on correlations found in the data.
Ayasdi started as a DARPA-funded research project. It takes an interesting approach: It translates data into topological objects to provide analysts with 3D models where insights “pop” on the screen as areas of interest. The solution is used in industries as diverse as healthcare, oil & gas, and finance.