Microsoft presents text-to-speech AI model: can accurately simulate anyone’s voice.

davideduardo · Jan 10, 2023

"Microsoft has unveiled a new text-to-speech AI model called VALL-E which can accurately simulate anyone’s voice with just a three-second audio clip of them speaking.

As reported by Ars Technica, the software giant’s researchers have demonstrated VALL-E in action in a new research paper and GitHub demo. Although it's still in its infancy, VALL-E is already impressing the scientific community — and creeping out the rest of us — with its ability to synthesize audio of a person saying anything while preserving their emotional tone."

From computer geek website "Tom's Guide".

https://www.tomsguide.com/news/microsofts-creepy-new-vall-e-ai-can-imitate-anyones-voice-in-just-3-seconds?utm_term=EBB12A53-69DD-4A25-BAA7-7179E29C6489&utm_campaign=E7CCEEA1-DDFE-442F-85F9-6D317BC29EC0&utm_medium=email&utm_content=855096AD-C560-4C32-8D08-3DDA44C62683&utm_source=SmartBrief

landtuna · Jan 10, 2023

Photoshop virtually killed the photograph as legal evidence and now VALL-E is going to kill recordings too?

wadio · Jan 10, 2023

Combined with ChatGPT to provide the content, in time there won't be much use for human beings in radio ... or anywhere else for that matter!

boombox4 · Jan 11, 2023

wadio said:
Combined with ChatGPT to provide the content, in time there won't be much use for human beings in radio ... or anywhere else for that matter!

There is a lot of talk about this on the YouTube music industry vlogs and also in the book publishing forums.

We've also discussed it here on RD from time to time.

AI was supposed to eliminate 50% of job positions by mid-century (or some similar timeframe), or so it was said in the newspapers in the early 2010s.

Who knows -- Skynet just might be in our future. ;-)
