AI avatars and synthetic video production could provide organizations with entirely new capabilities for training and multilingual global communication in the years ahead.
In the digital age, the line between the material world and simulation often blurs. The realm of simulacra is very much intermeshed with an increasingly virtual reality. The proliferation of synthetic media has created a vast realm of possibilities, from deepfake-enabled political misinformation to a wholly new type of computer-generated Instagram influencer. Moving forward, synthetic media and artificial intelligence (AI) could transform the way companies target global audiences and provide internal core competency training. Digital education avatars could also boost engagement and memory retention in the virtual classroom. An AI video production company, Synthesia, is working with a host of clients across industries to reshape the way organizations train, educate, disseminate information, and more.
Information sharing in the synthetic media age
In 2019, a malaria awareness video featuring a David Beckham deepfake caused quite a stir online. It starts unassumingly enough: Beckham pulls up a chair at a table set with a piece of toast, jam, and a glass of orange juice. Then it happens: Beckham begins to deliver the script not only in English but in eight additional languages, ranging from Kinyarwanda to Yoruba.
The feat was made possible by Synthesia’s AI video platform. Once a speaker has been recorded, Synthesia uses a process called Native Dubbing to incorporate the audio of a voice actor. The platform then maps this audio and the corresponding facial movements onto the pre-recorded presenter.
Overall, the Synthesia videos we’ve seen are impressive, but, at times, the lip movements and audio can be off ever so slightly. Dr. Steve Joordens, professor of psychology at the University of Toronto, said these peculiarities may be a positive in the marketing realm.
“Something being a little off on a commercial or a public service announcement, that might actually grab attention. People might actually look at it a little bit more, and especially if it’s surprising, like David Beckham suddenly speaking a bunch of different languages,” Joordens said.
Companies can use the platform for a host of applications ranging from corporate onboarding experiences to virtual training exercises with an AI avatar. This offers numerous advantages over traditional video production.
“In traditional video production, and we’re talking live-action video here, you write a script, get an actor, get a studio, get a sound guy, get a video guy, you do the post-production, and then you have your video ready. Using AI, we’ve digitized the video production process and enabled our customers to create a video in 5 to 10 minutes, without the need for any cameras, actors or studios,” said Victor Riparbelli, CEO and co-founder of Synthesia.
“The reduction in time and complexity is a major draw for most of our clients who are in the corporate training space, and the learning and development space, where policy and regulation changes often,” Riparbelli said.
Aside from the logistics and planning side, synthetic media production offers other cost-saving advantages compared to standard video production.
“Even doing a simple corporate video shoot can easily be $5,000 and take a lot of time and project management. Whereas with our technology, we are in the $2,000-$3,000 range per video, and anyone can do it in minutes straight from their browser,” Riparbelli said.
Riparbelli believes many companies would like to produce more videos, but the cost and time of production have limited these capabilities in the past. However, new technologies are enabling expeditious production at scale.
“There’s a big disconnect between how much video content companies want to create, and how much they can afford to create. Not just in dollars, but also very much in time and complexity. And what we’re doing is we’re plugging that gap by enabling video production using AI in the browser,” Riparbelli said.
In the past, many companies have used written communication to disseminate information in-house. Now, companies can feed this material into Synthesia’s video generation platform and transform written words into an AI-led visual presentation. This allows companies to boost the appeal of more mundane material. Riparbelli also noted studies showing people may be more likely to retain information when material is presented in a video format rather than text.
“That’s why so many companies are interested in disseminating information in the video format, rather than sending out PDFs which, maybe some people will read them and understand them, but a lot of people won’t,” Riparbelli said.
The polyglottal proficiencies illustrated in the Beckham video also enable companies to more precisely target people in dozens of global markets. As a result, companies can engage with a particular audience in a viewer’s native language.
“Because our product generates videos straight from text, it also allows you to very easily translate your content into 39 different languages that we support. Most of our clients are Fortune 1000 companies, and what they do is that they create core communications for learning content in English, and then they automatically transform it into 10 different languages,” Riparbelli said.
Aside from the multilingual capabilities, synthetic video also offers granular customization and malleability. With traditional video production, the final cut is the final product. However, the Synthesia platform allows companies to plug and play data to pinpoint and engage specific employees.
“For example, WPP rolled out a learning course for a hundred-thousand employees where, from their CRM system, we know the name of the employee, their role, the agency they work in, how long they’ve been there, [and] even their location. All that data is automatically integrated into the video. So, every [employee sees] a different video that’s tailored to them,” Riparbelli said.
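The data merge Riparbelli describes can be pictured as filling a script template once per employee record before the video is generated. The following is only a minimal sketch of that idea in Python: the template wording, the field names, and the `personalize` helper are hypothetical illustrations, not Synthesia's actual API or WPP's data schema.

```python
from string import Template

# Hypothetical script template. The placeholder names are assumptions
# chosen to mirror the fields mentioned in the quote (name, role,
# agency, tenure, location); they are not a real CRM or Synthesia schema.
SCRIPT = Template(
    "Hi $name, welcome to this course. As a $role at $agency in $location, "
    "you have been with us for $tenure_years years."
)


def personalize(records):
    """Return one filled-in script per employee record (a list of dicts)."""
    return [SCRIPT.substitute(record) for record in records]


# Made-up example records standing in for rows exported from a CRM system.
employees = [
    {"name": "Ada", "role": "designer", "agency": "Agency A",
     "location": "London", "tenure_years": 3},
    {"name": "Kofi", "role": "strategist", "agency": "Agency B",
     "location": "Accra", "tenure_years": 7},
]

scripts = personalize(employees)
for script in scripts:
    print(script)
```

Each resulting script would then be handed to a text-to-video step, so that every employee receives a video built from their own version of the text.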
The future of synthetic video: AI newscasts, virtual education avatars, and more
Outside of the corporate realm, there’s wide-ranging potential for these synthetic video capabilities across industries. News agencies could leverage the platform to create video news reports from scripts using AI avatars. One Synthesia client is using the platform to create virtual AI presenters to enhance the distance learning experience, according to Riparbelli.
“As the teacher, for example, you can record yourself for five minutes, send us the footage and become an AI presenter yourself,” Riparbelli said.
Once a student is logged in, the AI avatar can engage with that student during lessons and tailor its responses to the particular student or lesson, according to Riparbelli.
Although the old-fashioned, human-led, remote education model has had mixed results thus far, there may be potential developmental concerns associated with the use of synthetic media in the classroom. While the mismatch between avatar facial movements and audio may bode well in the marketing realm, the same may not hold true in education, according to Joordens.
“Our brain can bring those together and make it work, but the sound doesn’t have to be off for very much before we can detect it and it feels wrong and that’s not a good feeling to have while someone’s learning,” Joordens said. “When they’re learning, you want everything to feel comfortable and right and let them get into that experience. You don’t want something pulling their mind away and distracting them.”
In the era of remote learning, teachers around the country have been tasked with transforming entire curricula into virtual variants on the fly. Joordens noted that he and other colleagues have had to adopt new techniques to create engaging virtual lessons during the coronavirus pandemic. This includes breaking up longer lectures into shorter components with media and other assets peppered in to increase engagement. Creating more interactive and more approachable content allows professors to compete with a host of technologies also vying for a student’s undivided attention, according to Joordens.
“My claim to a lot of my colleagues, if we’re not competing, we’re losing. That [students are] not going to watch a two-hour lecture of some person droning on when they’re getting notifications coming in about cool new stuff or shares or whatever,” Joordens said. “I think we have to embrace video, and I think we have to not just embrace video, but we have to embrace doing it right and learning from what a lot of the marketers and other people have been doing for a long time.”
Whether it’s students in the remote classroom or an employee training seminar, providing a product is often about meeting people where they’d like to be met. In general, Riparbelli explained that he wants to use the Synthesia platform to make information more accessible for a greater number of people. He noted that there are some people who may not want to read an extensive book or article and would prefer to gain this information via video.
“Today, most of the world’s information exists in text format. We see our mission as [turning] all of the world’s text into bite-sized video content, and we believe that that’s going to make much more knowledge available to everyone all over the world. No matter which language you happen to speak,” Riparbelli said.