Today, DataStax announced that it is acquiring privately held AI vendor Kaskada, which develops a feature engineering platform that can help organizations use data for AI applications.
Of course, effective machine learning (ML) and artificial intelligence (AI) must begin with good data, typically stored in a database for querying. Event streaming data sources are another foundation of effective ML and AI, enabling real-time data to stream from any number of different locations.
Database and real-time streaming vendor DataStax has been building out its data platform since 2010, and is a leading contributor to the open-source Apache Cassandra database. In 2021, DataStax acquired Apache Pulsar vendor Kesque and launched a streaming data service. Demand for both database and event streaming services has helped DataStax to grow, with the company announcing a $115 million round of funding in June 2022.
The next phase of the company’s growth will be fueled, in part, by the growing demand for AI and ML, powered by a real-time data platform.
“Machine learning is transformative to businesses, and it has to be something that you leverage daily in your business processes and in your applications,” Chet Kapoor, CEO of DataStax, told VentureBeat. “We think that we can make it possible for all types of customers to overlay AI pipelines to make it part of their business apps and business processes.”
AI is about more than just unstructured data
A good deal of the hype around modern AI is related to use cases that involve unstructured data. However, while it’s true that generative AI tools for text and images tend to work with unstructured data, that’s not the case for all AI workloads.
Ed Anuff, chief product officer at DataStax, explained to VentureBeat that package delivery, logistics, ride sharing, video streaming and other use cases rely on structured data and AI to work effectively. In those areas, organizations are tracking event-based data as interactions occur, or as locations change, all in a tabular, structured data format.
“The reality is that the majority of applications that we interact with where ML is actually being used to make our interactions more productive, on a daily basis, are the structured data use cases,” Anuff said.
Structured data is what the Apache Cassandra database works with. Companies such as Uber and Netflix use Cassandra to help power operations. Taking structured data that’s already stored in Cassandra and using it to train AI models is where the process of feature engineering comes in.
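To make the idea of feature engineering concrete, here is a minimal sketch in plain Python. It is not Kaskada's language or a Cassandra API; the `RideEvent` type, the `rider_features` function and the sample data are all hypothetical, chosen only to show how raw, structured events can be summarized into values a model could consume.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RideEvent:
    """One hypothetical structured event: a completed ride."""
    rider_id: str
    occurred_at: datetime
    fare: float


def rider_features(events, rider_id, as_of, window=timedelta(days=7)):
    """Summarize one rider's events in the window that ends at `as_of`."""
    start = as_of - window
    recent = [
        e for e in events
        if e.rider_id == rider_id and start < e.occurred_at <= as_of
    ]
    ride_count = len(recent)
    total_fare = sum(e.fare for e in recent)
    return {
        "ride_count": ride_count,
        "avg_fare": total_fare / ride_count if ride_count else 0.0,
    }


# Made-up sample events, for illustration only.
events = [
    RideEvent("r1", datetime(2023, 1, 1, 9), 12.0),
    RideEvent("r1", datetime(2023, 1, 5, 18), 20.0),
    RideEvent("r1", datetime(2022, 12, 1, 8), 50.0),  # outside the window
    RideEvent("r2", datetime(2023, 1, 4, 7), 8.0),    # different rider
]

features = rider_features(events, "r1", as_of=datetime(2023, 1, 6))
print(features)
```

Passing an explicit `as_of` time means the same function can compute features as they would have looked at any past moment, which matters when the events keep arriving in real time.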
What Kaskada brings to DataStax and the Apache Cassandra database
Kaskada has developed feature engineering technology that DataStax expects will be an ideal fit with its real-time data platform.
Anuff said that Kaskada has built a concise description language that enables a data engineer to simply describe what is needed from a dataset in order to feed an AI model. He added that the Kaskada technology is able to operate at the high throughput that’s necessary for real-time applications.
DataStax’s aim is to fit into an ML workflow, providing the data foundation and feature engineering that can be used to power inference engines for AI. Anuff emphasized that the flow of data is bi-directional, such that predictions and outcomes from AI inference can then be loaded back into Cassandra, where the result can be served to application users.
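The bi-directional flow Anuff describes can be sketched roughly as follows. This is an illustration under stated assumptions, not DataStax's implementation: two in-memory dictionaries stand in for database tables, and `predict_eta_minutes` is a made-up formula standing in for a trained model.

```python
def predict_eta_minutes(features):
    """Toy stand-in for a trained model: a fixed linear formula."""
    return 5.0 + 2.0 * features["distance_km"] + 3.0 * features["traffic_level"]


# In-memory dicts standing in for two database tables keyed by trip id.
# A real deployment would read from and write to the database instead.
feature_table = {"trip-1": {"distance_km": 4.0, "traffic_level": 2.0}}
prediction_table = {}


def score_and_store(trip_id):
    """Read features, run inference, and write the prediction back."""
    features = feature_table[trip_id]                 # data -> model
    eta = predict_eta_minutes(features)               # inference
    prediction_table[trip_id] = {"eta_minutes": eta}  # model -> data
    return eta


eta = score_and_store("trip-1")
print(eta, prediction_table["trip-1"])
```

Once the prediction is stored next to the operational data, an application can serve it with the same kind of lookup it already uses for everything else.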
For Kapoor, the overall goal is to enable a real-time data stack that allows organizations to use operational data to help improve business outcomes.
“Our customers have a disproportionately high amount of real-time data and we are giving them an opportunity to leverage it so that they can create excellent experiences for their customers,” Kapoor said.