Hollywood actors and writers are currently striking, and one of their biggest concerns is the impact of generative AI on their industry and their jobs. In a news conference last Thursday, Fran Drescher, president of the Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) union, said AI poses an “existential threat to creative professions, and all actors and performers deserve contract language that protects them from having their identity and talent exploited without consent and pay.”
However, a flock of high-flying generative AI video startups, including Synthesia, Hour One and Soul Machines, don’t see it that way. They view AI-generated avatars, or digital humans, as filled with powerful creative potential for business, Hollywood, and celebrities who consent to the use of their AI likenesses.
Tackling the challenges of traditional video production
Last November, for example, VentureBeat spoke with Natalie Monbiot, head of strategy at synthetic media company Hour One, who said she dislikes the word “deepfakes.” “Deepfake implies unauthorized use of synthetic media and generative artificial intelligence — we are authorized from the get-go,” she told VentureBeat.
The idea, she explained, is that businesses can use synthetic media — in the form of virtual humans — to tackle the expensive, complex and unscalable challenges of traditional video production, especially at a time when the hunger for video content seems insatiable. In addition, synthetic media allows businesses to quickly and easily offer content in different languages, as well as to produce promotional video content at scale.
Just today, for example, Los Angeles-based startup Soul Machines, which recently added a ChatGPT integration to its “digital person” product, announced a partnership with K-Pop celebrity Mark Tuan, a member of boy band GOT7, with the launch of “Digital Mark.” The company claimed the launch is the “first time a celebrity is attaching their likeness to GPT,” allowing Tuan’s social following of 30 million fans to have one-on-one conversations with “him” on virtually any topic.
A press release said that as K-Pop’s fan base continues to grow across the globe, Tuan’s new digital twin will “enable him to speak in multiple languages — starting with English but adding Korean and Japanese language capabilities in the near future.”
Synthesia CTO calls digital humans a ‘natural progression’ for video creativity
Jon Starck, chief technology officer at the London-based startup Synthesia, which recently hit a $1 billion valuation for its AI-powered platform that helps businesses generate promotional or educational videos from plain text — and got an infusion of funding from Nvidia — said that AI-powered digital humans have both creative and efficiency potential that can’t be ignored.
“Video is a very creative thing. It’s a storytelling thing. It’s very visual and engaging,” he said. “But the whole process of creating video is probably the least creative thing you can imagine.” With today’s AI-powered video generation opportunities, “everyone becomes a great storyteller,” he added.
Starck told VentureBeat this is a “natural progression” from previous AI-generated efforts in film, and said the future could hold an entire movie made from synthetic data.
It’s a bold statement, but Starck has been working on digital humans for two decades. He started at a time when “nobody had ever heard of computer vision,” while working in the film industry to bring 3D computer vision to technical artists working on movies.
The problems he is working on now are “exactly the same problems we were working on 20 years ago,” he said. “I used to have eight cameras, now I’ve got 78 cameras. Now there are 24-megapixel cameras. Now we have the capability of solving the problems that I couldn’t [before].”
Using actors to get the best dataset of high-fidelity human performance
Synthesia’s researchers have taken a big step towards solving one of the thorniest computer vision problems: representing human performance at high fidelity, an essential building block in applications from film production and computer games to video conferencing. Right now, for example, AI tools like Synthesia’s are two-dimensional and don’t show a human being fully in motion with a 360-degree view, like you would see in a TV advertisement or a movie.
To close the gap to production-level video quality, Starck and his team recently released HumanRF, an AI research project that captures a human being’s full-body appearance in motion from multi-view video input, and enables playback from novel, unseen viewpoints.
To meet this challenge, Synthesia researchers needed to create a high-fidelity dataset of clothed humans in motion — which required, ironically, real actors.
The company created the dataset, called ActorsHQ, by capturing the movements and performances of real actors in a U.K. studio, including some who are already available as avatars on the Synthesia platform. The dataset consists of 39,765 frames of dynamic human motion captured using multi-view video with a proprietary multi-camera capture system.
The actors “wanted to come back and be part of this future of potential 3D representations for 3D synthetic actors,” said Starck.
Asked about the complaints of Hollywood’s striking writers and actors, Starck emphasized that Synthesia is not in the movie business. “We’re not replacing actors,” he said. “We’re not replacing movie creation. We’re replacing text for communication. And we’re bringing synthetic video to the toolbox for businesses.”
That said, speaking from a personal standpoint as someone who has worked in visual effects, he said he sees every invention as a new enabler.
In the movie industry, he explained, it could take 18 months and millions of dollars to produce a couple of seconds of a blockbuster movie.
“There are hundreds of artists sitting in dark rooms with very complicated tooling to be able to produce very exact results,” he said. “My view on the AI explosion is this is something that enables creativity for humanity.”