This is big news, right?
However, Microsoft has been unusually quiet about the new artificial intelligence. There were no press releases or other significant announcements this week.
VALL-E has sent ripples throughout the tech community. Despite the company's uncharacteristic reluctance to market the launch, the implications of the neural codec language model are undeniable.
But What Does This Mean?
On the one hand, the potential for this technology to revolutionize industries such as customer service and podcasting is staggering. On the other hand, the prospect of scammers and cybercriminals utilizing this technology to impersonate individuals is a genuine concern.
At the moment, it's impossible to know just how good VALL-E is, since Microsoft has yet to release the tool to the public, although it has provided samples of its output. If the mimicry really requires only a three-second sample, and the cloned voice can then go on to speak for any length of time, that is very impressive.
If it's as good as Microsoft says it is, while conveying human characteristics like charisma, one could see why Microsoft is reportedly in talks to invest $10 billion in OpenAI LLC, the company behind ChatGPT.
VALL-E's Scammer Concerns
Microsoft trained the new VALL-E text-to-speech (TTS) system on 60,000 hours of English-language speech. The tech firm used Meta's LibriLight audio library, which contains recordings from more than 7,000 speakers.
Surprisingly, the TTS tech can copy a speaker's diction and manner of speech. Most of VALL-E's audio is so similar to the original recordings that you won't notice any differences.
This is where the problem starts.
The potential for scams goes through the roof. If a scam artist can get you to talk on the phone for three seconds, they can clone your voice. Imagine if they then called your grandma, or bypassed a voice-recognition security system.
What's The Solution?
The researchers say that they are building a detection system.
This may leave people wondering: "Why did you do this at all, then?"
Quite often in medical and technological advancements, the answer is: "Because we could."
Creative AIs like DALL-E, ChatGPT, various deepfake algorithms, and countless others make it feel like we are at an inflection point where these technological advances are breaking out of laboratories and into the real world.
As with all change, it brings exciting opportunities along with risks. We truly live in interesting times.