Natalie Monbiot, head of strategy at synthetic media company Hour One, dislikes the word “deepfakes.”
“Deepfake implies unauthorized use of synthetic media and generative artificial intelligence — we are authorized from the get-go,” she told VentureBeat.
She described the Tel Aviv- and New York-based Hour One as an AI company that has also “built a legal and ethical framework for how to engage with real people to generate their likeness in digital form.”
Authorized versus unauthorized. It’s an important delineation in an era when deepfakes, or synthetic media in which a person in an existing image or video is replaced with someone else’s likeness, have gotten a boatload of bad press. That is not surprising, given deepfakes’ longstanding connection to revenge porn and fake news. The term “deepfake” can be traced to a Reddit user named “deepfakes” who, in 2017, along with others in the community, shared videos, many involving celebrity faces swapped onto the bodies of actresses in pornographic videos.
And deepfake threats are looming, according to a recent research paper from Eric Horvitz, Microsoft’s chief scientific officer. These include interactive deepfakes, which offer the illusion of talking to a real person, and compositional deepfakes, in which bad actors create many deepfakes to compile a “synthetic history.”
Most recently, news about celebrity deepfakes has proliferated. There’s the Wall Street Journal coverage of Tom Cruise, Elon Musk and Leonardo DiCaprio deepfakes appearing unauthorized in ads, as well as rumors about Bruce Willis signing away the rights to his deepfake likeness (not true).
The business side of the deepfake debate
But there is another side to the deepfake debate, say several vendors that specialize in synthetic media technology. What about authorized deepfakes used for business video production?
Most use cases for deepfake videos, they claim, are fully authorized. They may be in enterprise business settings — for employee training, education and ecommerce, for example. Or they may be created by users such as celebrities and company leaders who want to take advantage of synthetic media to “outsource” to a virtual twin.
The idea, in these cases, is to use synthetic media — in the form of virtual humans — to tackle the expensive, complex and unscalable challenges of traditional video production, especially at a time when the hunger for video content seems insatiable. Hour One, for example, claims to have made 100,000 videos over the past three and a half years, with customers including language-learning leader Berlitz and media companies such as NBC Universal and DreamWorks.
At a moment when generative AI has become part of the mainstream cultural zeitgeist, the future looks bright for enterprise use cases of deepfakes. Forrester recently released its top 2023 AI predictions, one of which is that 10% of Fortune 500 enterprises will generate content with AI tools. The report mentioned startups such as Hour One and Synthesia which “are using AI to accelerate video content generation.”
“That sounded very bullish … probably even to me,” said Monbiot. “But as the technology matures and massive players are getting into this space, we’re seeing disruption.”
The business side is a “hugely under-appreciated” part of the deepfakes debate, insists Victor Riparbelli, CEO of London-based Synthesia, which describes itself as an “AI video creation company.” Founded in 2017, it has more than 15,000 customers and a team of 135, and is “growing in double-digits every month.” Among its clients are fast-food giants including McDonald’s, research company Teleperformance and global advertising holding company WPP.
“It’s very interesting how the lens has been very narrow on all the bad things you could do with this technology,” Riparbelli said. “I think what we’ve seen is just more and more interest in this and more and more use cases.”
A living video that you can always edit
It’s difficult to access quality content and most businesses don’t have the skills to enable high-grade content creation, said Monbiot.
“Most businesses don’t have people that have any skills that enable content creation, especially high-grade content creation featuring actual talent, and they also don’t have the ability to edit videos or have these kinds of resources in-house,” she explained. Hour One is a no-code platform, so even users with no prior skills in creating content can select from a range of virtual humans or become one themselves.
Berlitz, one of Hour One’s first enterprise clients, needed to digitally transform after 150 years of offering classroom learning. “To keep the instructor in the content, they do live videoconferencing, but that doesn’t really scale,” Monbiot said. “Even if they had all the production resources in the world, the cost and the investment and the management of all of those files is just insane.” She added that with AI, the content can be continually updated and refreshed. Berlitz now has over 20,000 videos in different languages created with Hour One.
Meanwhile, Synthesia said its AI is trained on real actors. It offers the actors’ images and voices as virtual characters clients can choose from to create training, learning, compliance and marketing videos. The actors are paid per video that’s generated with them.
For enterprise clients, this becomes a “living video” that they can always go back to and edit, Riparbelli explained.
“I think we actually work for almost all the biggest fast-food chains in the world by now,” he said. “They need to train hundreds of thousands of people every single year, on everything … how to stay safe at work, how to deal with a customer complaint, how to operate the deep fryer.”
Before, he said, a company might record a few videos, but they would be very high-level and evergreen. All other training would likely be via PowerPoint slides or PDFs. “That isn’t a great way of training, especially not the younger generation,” he said. Now, these companies create video content that replaces not the original video shoots, but the text-based materials.
Authorization agreements are key
Hour One guides users through the process to get the highest-quality video capture in front of a green screen. The base footage becomes the training data for the AI.
“We basically create a digital twin of that person — for example, a CEO,” said Monbiot. “The CEO would sign an agreement allowing us to take the footage and create a virtual twin.” Another portion of the agreement would specify who is authorized to create content with the virtual twin.
“We want people to have a very positive, comfortable, pleasant experience with our virtual human content,” she said. “If people feel a little confused or uneasy, that creates distrust, and that’s very antithetical to why we do what we do.”
According to Synthesia, this kind of authorization is common in all kinds of licensing agreements that already exist.
“Kim Kardashian has literally licensed her likeness to app developers to build a game that grossed billions of dollars,” said Riparbelli. “Every actor or celebrity licenses their likeness.”
Offering influencers their images at scale
One synthetic media company, Deepcake, is leaning less into the enterprise space and more into the business of authorized deepfakes used by celebrities and influencers for brand endorsements. For example, the company created a “digital twin” of Bruce Willis to be used in an advertisement for Russian telecom company MegaFon. This led to the rumor that Deepcake owns the rights to Willis’ digital twin (which it does not).
“We work directly with stars with talent management agencies, to develop digital twins ready to be put into any type of content, like commercials for TikTok,” said CEO Maria Chmir. “This is a new way to produce the content without classic assets like constantly searching the locations and a very long and expensive post-production process.”
There are also fully-synthesized people who can become brand ambassadors for a few dozen dollars, she added. Users simply enter the text that these characters have to say.
“Of course you can’t clone charisma and make a person improvise, but we’re working on that,” she said.
The future of authorized deepfakes
Synthesia says it is adding emotions and gestures into its videos over the coming months. Hour One recently released 3D environments to create a “truly immersive” experience.
“If you think of the maturity of the AI technology, every time we move up that scale, we unlock more use cases,” said Riparbelli. “So next year, I think we’ll see a lot of marketing content, like Facebook ads. We’re just generally going to see a lot less text and a lot more video and audio into communication we consume every day.”
The enterprise use cases around synthetic media “deepfakes” are just beginning, said Monbiot, who added: “But this economy has already begun.”