Synthesia, a company leveraging AI to generate videos of avatars, today announced that it raised $50 million in a series B round, bringing its total raised to $66.5 million. Kleiner Perkins led the round with participation from GV, FirstMark Capital, LDV Capital, Seedcamp, MCC Ventures, and individual investors. CEO Victor Riparbelli says the funds will be put toward supporting growth and advancing Synthesia’s technology.
As pandemic restrictions make conventional filming tricky and risky, the benefits of AI-generated video have been magnified. According to Dogtown Media, under normal circumstances, an education campaign might require as many as 20 different scripts to address a business’ worldwide workforce, with each video costing tens of thousands of dollars.
Synthesia says its technology can reduce the expense to as low as $30.
“Synthesia is focused on reducing the friction of video creation and making it possible for anyone to create professional-looking videos in minutes, directly from their browser,” Riparbelli told VentureBeat via email. “Synthesia’s first commercial product, Synthesia Studio, launched in public beta in the summer of 2020. It is now used by thousands of companies, including several Fortune 500 companies.”
Generating synthetic videos
Like rivals Soul Machines, Brud, Wave, Samsung-backed STAR Labs, and others, Synthesia employs a combination of AI techniques to create visual chatbots, product demonstrations, and sales videos for clients without actors, film crews, studios, or cameras. Founded in 2017 by Riparbelli, Steffen Tjerrild, and computer vision researchers Matthias Niessner and Lourdes Agapito, Synthesia claims to have generated more than six million videos for over 4,000 clients — including SAP and Accenture — in the last year alone.
Synthesia customers choose from a gallery of in-house, AI-generated presenters or create their own by recording about 5 to 40 minutes’ worth of voice clips. After typing or pasting in a video script, Synthesia generates a video “in minutes” with custom backgrounds and an avatar that mimics a person’s facial movements and how they pronounce different phonemes, the units of speech distinguishing one word from another.
Synthesia says that client CraftWW used its platform to ideate an advertising campaign for JustEat in the Australian market featuring an AI-manipulated Snoop Dogg. The company also worked with director Ridley Scott’s production studio to create a film for the nonprofit Malaria Must Die, which translated David Beckham’s voice into over nine languages. And it partnered with Reuters to develop a prototype for automated video sport reports.
Synthesia recently made generally available a product that personalizes videos to specific customer segments. Aptly called Personalize, it can translate videos featuring actors or staff members into over 40 languages. Wired reports that more than 35 partners at EY, formerly Ernst & Young, have used Personalize to create what they call “artificial reality identity,” or ARIs — client presentations and emails with synthetic video clips starring virtual body doubles of themselves.
“Our core use case today [is] learning and development and internal communications videos, where the front-facing AI avatars work really well,” Riparbelli said. “The new investment will partly go to expand our core AI platform, which will allow more use cases.”
Some experts have expressed concern that tools like Synthesia’s could be used to create deepfakes, or AI-generated videos that take a person in an existing video and replace them with someone else’s likeness. The fear is that these fakes might be used to do things like sway opinion during an election or implicate a person in a crime.
Recently, a group of fraudsters made off with $35 million after using forged email messages and deepfake audio to convince an employee of a United Arab Emirates company that a director requested the money. And just last month, Japanese police arrested a man for using deepfake technology to effectively unblur censored pornographic videos.
“[Deepfake technology is] becoming cheaper and more accessible every day … Audiographic evidence must be viewed with greater skepticism and must meet higher standards,” researchers in a new study on deepfakes commissioned by the European Parliament’s Technology Assessment Committee wrote. “[Individuals and institutions] will need to develop new skills and methods to construct a trustworthy picture of reality as they will inevitably be confronted with deceptive information.”
For its part, Synthesia, which has more than 60 employees, says it has posted ethics rules online and vets its customers and their scripts. It also requires formal consent from a person before it will synthesize their appearance and refuses to touch political content.
“Our main competition is text — boring PDFs that people don’t read. Synthesia is driving a paradigm shift in how we create video content,” Riparbelli continued. “Looking into the next decade of Synthesia, we’re building for a future where you can create Hollywood-grade video on a laptop. On our way there, we’ll be solving some of the hardest and most fundamental problems in AI and computer vision. With the new funds, we’ll invest even deeper in advancing our core AI research to accelerate this vision. In parallel, we will also slowly open up some of our research to the world and begin actively contributing to the broader research community.”