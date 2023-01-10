
New creepy VALL-E AI can mimic your voice and emotions exactly after just 3 seconds and experts warn of misuse

By Jona Jaupi
 5 days ago
MICROSOFT has unveiled a new AI-powered tool that can simulate a person's voice.

The tool, dubbed VALL-E, can mimic someone's voice after listening to a three-second audio sample, Microsoft revealed in a new study.

The impressive simulation can match a voice's timbre, the emotional tone behind the speech and even the room's acoustics.

Once the AI has learned a specific voice, it can generate audio of that person saying just about anything.

Researchers believe that VALL-E could be utilized for advanced text-to-speech applications and pre-recorded speech editing.

It can also generate high-quality audio content when used with other AI tools like ChatGPT.

"Experiment results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity," Microsoft researchers write in the study.

However, like most AI tools, it does carry some serious risks of abuse, including the creation of audio deepfakes.

Deepfakes usually refer to videos in which a person's face or body has been digitally altered to appear to be someone else; audio deepfakes do the same with a person's voice.

"It used to be harder to simulate a person's speech pattern than to create a deep fake image of them, no longer," Calum Chace, author of Surviving AI, tweeted about the new tool.

How does it work?

The text-to-speech AI model is referred to as a "neural codec language model," per Microsoft.

Researchers trained the model using discrete codes "derived from an off-the-shelf neural audio codec model."

Engadget reported that VALL-E builds on EnCodec, Meta's AI-powered audio compression neural net.

The TTS model was pre-trained on about 60,000 hours of English speech, with researchers feeding it data that is "hundreds of times larger than existing systems," Microsoft said.

The tech giant shared a demo of the tool here.

