
Yesterday’s release of Meta’s LLaMA 2, under a commercial license, was undoubtedly an open-source AI mic drop. But startup Together, known for creating the RedPajama dataset in April, which replicated the LLaMA dataset, had its own big news over the past couple of days: It has released a new full-stack platform and cloud service for developers at startups and enterprises to build open-source AI — which, in turn, serves as a challenge to OpenAI when it comes to targeting developers.

The company, which already supports more than 50 of the top open-source AI models, will also support LLaMA 2.

Founded last year by Vipul Ved Prakash, Ce Zhang, Chris Ré and Percy Liang, Together says it is “on a mission to make AI models more open and accessible in a market where Big Tech players are currently leading innovation.” The Menlo Park, California-based startup announced in May that it had raised $20 million in a seed funding round to build open-source generative AI and a cloud platform.

“There is a clear debate between open-source and closed systems, and now there is an open-source ecosystem that is getting stronger,” Prakash told VentureBeat, explaining that the company is increasingly seeing enterprises move towards open source because of a desire for data privacy. And now, “there’s more adoption of open-source models because open-source models are getting stronger.”



New API and compute cloud services for leading open-source AI models

Last Friday, the company launched the Together API and Together Compute, cloud services to train, fine-tune and run the world’s leading open-source AI models. Together API is powered by “an efficient distributed training system to fine-tune large AI models, offering optimized, private API endpoints for low-latency inference.” For AI/ML research groups who want to pre-train models on their own datasets, Together Compute offers clusters of high-end GPUs paired with Together’s distributed training stack.

The result is far greater cost efficiency, Prakash said. “It’s $4 an hour for an A100 GPU on AWS. We have created a technology where we can host instances of a model for a user. For example, hosting a RedPajama 7 billion-parameter model on an A100 on our platform is 12 cents an hour.”

Together can do that, he explained, because of something the Wall Street Journal reported on last month: a huge supply of used GPU chips left in the wake of changes in cryptocurrency mining.

“Tens of millions of GPUs became available after the Ethereum network — home to the second-biggest crypto, behind Bitcoin — removed the need for these chips by ending the practice of mining for new coins and the intensive computation it required,” the article said, and about 20% of those chips can be repurposed to train AI models. Together has leased thousands of these GPUs to help power its new cloud services.
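
To put the quoted prices in perspective, here is a minimal back-of-the-envelope sketch using only the two hourly figures Prakash cites ($4 for an A100 on AWS, 12 cents for a hosted RedPajama 7B instance on Together). The 730-hour month and the function names are illustrative assumptions, and the comparison is rough: a dedicated GPU rental and a hosted model instance are not identical products.

```python
# Hourly prices in USD, as quoted by Prakash in the article.
AWS_A100_PER_HOUR = 4.00
TOGETHER_HOSTED_7B_PER_HOUR = 0.12

# Assumption for illustration: an average month of about 730 hours.
HOURS_PER_MONTH = 730


def cost_ratio(baseline: float, alternative: float) -> float:
    """Return how many times more expensive the baseline is."""
    return baseline / alternative


def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the cost of running continuously for the given hours."""
    return hourly_rate * hours


ratio = cost_ratio(AWS_A100_PER_HOUR, TOGETHER_HOSTED_7B_PER_HOUR)
aws_month = monthly_cost(AWS_A100_PER_HOUR)
together_month = monthly_cost(TOGETHER_HOSTED_7B_PER_HOUR)

print(f"Price ratio: about {ratio:.1f}x")
print(f"AWS A100, one month: ${aws_month:,.2f}")
print(f"Together hosted 7B, one month: ${together_month:,.2f}")
```

On these figures the quoted AWS price is roughly 33 times the quoted Together price, or about $2,920 versus about $88 for a month of continuous use.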

There will be parallel closed and open ecosystems

While Together is certainly challenging OpenAI and other closed, proprietary model companies, particularly in the enterprise space, Prakash said he believes there will be parallel closed and open ecosystems.

“My personal feeling is that the closed model companies will eventually get more app-centric,” he said, pointing to Character AI and its efforts in consumer-focused chatbots. “They do that really well and their modeling efforts are sort of getting more and more focused in that direction.”

Similar to other fields, from operating systems to databases, open-source AI will be a more broadly applicable set of technologies, he explained. “I do think it will become difficult for closed models to charge a premium given that there are open solutions that exist and are now good for many problems.”

New chief scientist and Snorkel AI partnership

In addition to the platform news, Together announced this week that it has hired a new chief scientist, Tri Dao, who recently earned a Ph.D. in computer science from Stanford and is an incoming assistant professor at Princeton University. Dao is best known for his breakthrough FlashAttention research to improve LLM training and inference, which is now widely used in Transformer-based models. FlashAttention-2 is now available; it speeds up training and fine-tuning of LLMs by up to 4 times and achieves 72% model FLOPs utilization for training on Nvidia A100s.
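
For readers unfamiliar with the metric, model FLOPs utilization (MFU) is the share of a chip's theoretical peak throughput that training actually achieves. The sketch below converts the 72% figure into an approximate throughput. The 312 teraFLOPs-per-second peak is an assumption taken from Nvidia's published A100 specification for dense 16-bit tensor operations, not from this article, and the function names are illustrative.

```python
# Assumption: A100 peak dense FP16/BF16 tensor throughput, per Nvidia's
# published specification, in teraFLOPs per second. Not stated in the article.
A100_PEAK_TFLOPS = 312.0

# Model FLOPs utilization reported for FlashAttention-2 in the article.
REPORTED_MFU = 0.72


def achieved_tflops(mfu: float, peak_tflops: float) -> float:
    """Convert a utilization fraction into achieved teraFLOPs per second."""
    return mfu * peak_tflops


def utilization(achieved: float, peak_tflops: float) -> float:
    """Compute MFU as achieved throughput divided by peak throughput."""
    return achieved / peak_tflops


throughput = achieved_tflops(REPORTED_MFU, A100_PEAK_TFLOPS)
print(f"Approximate achieved throughput: {throughput:.0f} TFLOPs/s")
```

Under that assumed peak, 72% utilization works out to roughly 225 teraFLOPs per second per GPU.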

In addition, this week Together announced a partnership with Snorkel AI to enable organizations to build custom LLMs on their data in their secure environments. The end-to-end AI development solution spans data development, model training, fine-tuning and deployment.
