Dubbing, where recordings in other languages are lip-synced and mixed with a show’s original soundtrack, is an exploding business. One localization platform, Zoo Digital, saw revenues jump by 73% to $28.6 million in July 2018 compared to the year prior. Another, BTI Studios, told Television Business International that dubbing grew from 3% of its revenue in 2010 to 61% in 2019.

According to Verified Market Research, the film dubbing market alone could reach $3.6 billion by 2027, growing at a compound annual growth rate of 5.6% from 2020. But barriers stand in the way of expansion. On average, it can take an hour of recording studio time to produce five minutes of narration, not to mention an actor who can deliver it with rhythm and timing that match the original audio as closely as possible. It’s also costly. One calculator pegs the price at $75 per minute for even a simple video.

AI technologies promise to streamline the process by automating various aspects of dubbing. Video- and voice-focused firms including Synthesia, Respeecher, Resemble AI, and Papercup have developed AI dubbing tools for shows and movies, as has Deepdub. Reflecting investors’ interest, Deepdub today announced that it raised $20 million in a series A round led by Insight Partners with participation from Booster Ventures, Stardom Ventures, Swift VC, and angel investors.

Dubbing with AI

Deepdub, an Israeli startup headquartered in Tel Aviv, was cofounded in 2019 by brothers Nir Krakowski and Ofir Krakowski. The company provides AI-powered dubbing services for film, TV, gaming, and advertising, built on neural networks that split and isolate voices and replace them in the original tracks.

Deepdub attempts to layer in novel accents while retaining original actor voices across languages: it builds a synthetic voice from samples of an actor speaking one language, then uses that synthesized voice to form words in another. The company’s algorithms, which can’t yet dub voices in real time but do plug into post-production systems, try to match the lip sync or lip movement in the source media in multiple languages at once.

“With the increased global demand for local experiences by audiences, there is a growing demand for high-quality dubbing solutions that can scale fast and that offer a new level of experience. Audiences have more and more options to choose from, and content platforms seek ways to differentiate in the local markets,” Nir Krakowski told VentureBeat via email. “[Our] AI models enable replication of voice characteristics using not more than three minutes of voice data. The AI-generated voice in the target language can then perform at any level of expressivity, including crying, shouting, talking with food in their mouth, [and more] — even if the original voice data did not include that information.”

Deepdub’s work includes “Every Time I Die,” a 2019 English-language thriller that the startup dubbed into Spanish and Portuguese. It marks one of the first times an entire dubbed movie has used voice clones based on the voices of the original cast.

“[We’re] working with multiple Hollywood studios on various projects that are currently proprietary. Since the public launch back in December 2020, Deepdub has already inked a multi-series partnership with Topic.com to bring their catalog of foreign TV shows into English,” the Krakowski brothers said. “The funds from [the latest round] will be used to expand the global reach of the company’s sales and delivery teams, bolster the Tel Aviv-based R&D team with additional researchers and developers, and further enhance the company’s … deep learning-based localization platform.”

Potential pitfalls

According to Statista, 59% of U.S. adults said that they’d rather watch foreign-language films dubbed into English than see the original feature with subtitles. One need look no further than dubbed series like Squid Game (which was originally in Korean) and Dark (German) for evidence; an October 2021 poll found that 25% of Americans had seen Squid Game, which remains Netflix’s most-watched original show of all time.

Beyond startups, Nvidia has been developing technology that alters video in a way that takes an actor’s facial expressions and matches them with a new language. Veritone, for its part, has launched a platform called Marvel.ai that allows content producers to generate and license AI-generated voices.

But localization isn’t as straightforward as simple translation. As The Washington Post’s Steven Zeitchik points out, AI-dubbed content without attention to detail could lose its “local flavor.” Expressions in one language might not mean the same thing in another. Moreover, AI dubs pose ethical questions, like whether to recreate the voice of a person who’s passed away. And the technology might lead to a future in which millions of viewers are never exposed to a language outside their own.

Also murky are the ramifications of voices generated from working actors’ performances. In a lawsuit last spring, Bev Standing, a voice actor, alleged that TikTok used her voice in an AI text-to-speech feature without compensating her. The Wall Street Journal reports that more than one company has attempted to replicate Morgan Freeman’s voice in private demos. And studios are increasingly adding provisions in contracts that seek to use synthetic voices in place of performers “when necessary,” for example to tweak lines of dialogue during post-production.

Deepdub says that it’s not engaged in creating “deepfakes” and works with performers to ensure that they understand how their voices are being used. We’ve reached out to the company with a detailed list of questions and will update this piece once we receive a response.

“The pandemic has changed forever the way people across the globe consume content, whether it’s entertainment content using one of the many streaming services and platforms or via e-learning platforms, podcasts, and audiobooks. This results in audiences getting more and more exposure to content that is not in their local language,” Krakowski continued. “[Deepdub] started 2021 with seven employees and grew to over 30 employees at the beginning of 2022. We expect to double that size by the end of the year.”
