Google’s DeepMind Claims Massive Progress in Synthesized Speech

Researchers at Google’s DeepMind artificial intelligence division claim to have found a way of producing much more natural-sounding synthesized speech than the techniques currently in use. Existing text-to-speech (TTS) systems tend to use an approach called concatenative TTS, in which the audio is generated by recombining fragments of recorded speech. There is also a technique called parametric TTS that generates speech by passing information through a vocoder, but that sounds even less natural. DeepMind has come up with a new technique called WaveNet that learns from the audio it is fed and produces raw audio sample by sample. To give an idea of how detailed that is, we’re talking at least 16,000 samples per second. A WaveNet is a “neural network”, essentially an artificial…
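To make the "sample by sample" idea concrete, here is a minimal Python sketch of autoregressive generation, in which each new sample is computed from the samples already produced. It is not DeepMind's model: `toy_next_sample` and `generate` are hypothetical names, and the predictor is a fixed two-tap recurrence that produces a 440 Hz sine wave, standing in for the trained neural network. Only the 16,000 samples-per-second figure comes from the article.

```python
import math

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article
OMEGA = 2 * math.pi * 440 / SAMPLE_RATE  # phase step per sample for a 440 Hz tone


def toy_next_sample(history):
    """Hypothetical stand-in for a trained network.

    Predicts the next sample from the two preceding ones using the identity
    sin((n+1)w) = 2*cos(w)*sin(n*w) - sin((n-1)w).
    """
    return 2 * math.cos(OMEGA) * history[-1] - history[-2]


def generate(seconds, seed):
    """Generate audio one sample at a time, feeding each output back in."""
    samples = list(seed)
    total = int(seconds * SAMPLE_RATE)
    while len(samples) < total:
        samples.append(toy_next_sample(samples))
    return samples


# Seed with the first two samples of the sine wave, then let the loop run.
audio = generate(1.0, seed=[0.0, math.sin(OMEGA)])
print(len(audio))  # prints 16000: one second of audio is 16,000 samples
```

The point of the sketch is the loop structure: every one of the 16,000 samples in a second of audio has to be produced in order, because each depends on what came before it.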

