"Microsoft has unveiled a new text-to-speech AI model called VALL-E which can accurately simulate anyone’s voice with just a three-second audio clip of them speaking.
As reported by Ars Technica, the software giant’s researchers have demonstrated VALL-E in action in a new research paper and GitHub demo. Although it's still in its infancy, VALL-E is already impressing the scientific community — and creeping out the rest of us — with its ability to synthesize audio of a person saying anything while preserving their emotional tone."
From computer geek website "Tom's Guide".