AI startup CentML, which optimizes machine learning models to run faster and at lower compute cost, emerged from stealth today. The Toronto-based company aims to help address the worldwide shortage of GPUs needed for training and inference of generative AI models.

According to the company, access to compute is one of the biggest obstacles to AI development, and the scarcity is only going to increase as inference workloads accelerate. By extending the yield of the current AI chip supply and legacy inventory without affecting accuracy, CentML says it can increase access to compute in what it calls a “broken” marketplace for GPUs.

Hard for smaller companies to access GPUs

CentML raised a $3.5 million seed round in 2022 led by AI-focused Radical Ventures. Cofounder and CEO Gennady Pekhimenko, a leading systems architect, told VentureBeat in an interview that when he saw the trajectory of the size of large language models, it was clear that whoever owned the hardware and the software stack on top of it would have a dominant position.

“It was very transparent what was happening,” he said, adding with a laugh that even he put his money into Nvidia, which controls about 80% of the GPU market. But Nvidia, he explained, always wants to sell its most expensive chips, like the latest A100 and H100 GPUs, and that has made it hard for smaller companies to get access. Yet Nvidia has other, less expensive chips that are poorly utilized: “We build software that optimizes those models efficiently on all the GPUs available, not just on the most expensive available in the cloud,” he said. “We’re essentially serving a larger part of the market.”

As the cost of inference grows “exponentially” (models like ChatGPT cost millions of dollars to run), CentML uses a powerful open-source compiler to automatically tune optimizations to work best for a company’s specific inference pipeline and hardware.

Competitor OctoML, Pekhimenko said, also uses compiler technology to automatically maximize model performance, but its technology is older. “Their solution is not competitive in the cloud. We knew what the deficiencies were and built a new technology that doesn’t have those deficiencies,” he said. “So we have the benefit of coming second.”

Race to access AI chips has become like Game of Thrones

David Katz, partner at Radical Ventures, says the battle to get access to AI chips has become like Game of Thrones — though less gory. “There’s this insatiable appetite for compute that’s required in order to run these models and large models,” he told VentureBeat, adding that Radical invested in CentML last year.

CentML’s offering, he said, creates “a little bit more efficiency” in the market. In addition, it demonstrates that complex, billion-plus-parameter models can also run on legacy hardware.

“So you don’t need the same volume of GPUs or you don’t need the A100s necessarily,” he said. “From that perspective, it is essentially increasing the capacity or the supply of chips in the market.”
