Google quite often explains how it uses different machine learning models to improve the performance of its applications. It recently explained how the Soli radar works, and now it is explaining how the new audio quality improvements in Google Duo work.
This has been achieved with a new model called WaveNetEQ, a generative system based on DeepMind technology that can fill in packets lost from the voice waveform. Let's see how Google manages such a feat.
Improved audio quality in Google Duo via a generative model
Google explains that when a call is transmitted over the internet, the packets can suffer quality problems. These problems are caused by excessive jitter or delays in the network, and as much as 8% of the total audio content can be lost because of them.
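To illustrate where concealment becomes necessary, here is a minimal Python sketch of how a receiver can notice that audio packets went missing by tracking RTP-style sequence numbers. The function name and the example stream are illustrative assumptions, not Duo's actual code.

```python
def find_lost_packets(received_seq_numbers):
    """Return the sequence numbers missing from a received packet stream."""
    expected = set(range(min(received_seq_numbers), max(received_seq_numbers) + 1))
    return sorted(expected - set(received_seq_numbers))

# Example: packets 3 and 4 never arrived (or arrived too late to be played).
print(find_lost_packets([0, 1, 2, 5, 6]))  # -> [3, 4]
```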
To make sure communication keeps working correctly in real time, Google has created WaveNetEQ, a PLC (packet loss concealment) system trained on a large base of voice data. What does this model do? The explanation is quite technical and complex, so we will summarize it as simply as possible.
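To get a feel for what a PLC system does, here is a toy sketch of a classic, non-generative concealment strategy: when a 20 ms frame is lost, replay the last good frame while fading it toward silence. This kind of simple trick is what tends to produce the robotic artifacts mentioned below, and it is what WaveNetEQ's learned model replaces. All parameters are assumptions for the example, not Duo's real values.

```python
import numpy as np

SAMPLE_RATE = 16_000                      # assumed sample rate
FRAME_SAMPLES = SAMPLE_RATE * 20 // 1000  # 20 ms frame -> 320 samples

def conceal_lost_frame(last_good_frame: np.ndarray) -> np.ndarray:
    """Classic PLC: repeat the previous frame, fading out to avoid buzzing."""
    fade = np.linspace(1.0, 0.0, FRAME_SAMPLES)
    return last_good_frame * fade
```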
WaveNetEQ is a generative model that can synthesize the voice waveform even when parts of it have been lost. You have surely been on a video call with bad quality and heard a robotic or metallic sound. That happens when packet loss is high (there is a lot of latency) and the sound cannot be reproduced cleanly.
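Conceptually, a generative concealment model works autoregressively: it predicts the next audio sample from the samples before it, so it can keep "speaking" through a gap. The sketch below assumes a hypothetical `model` object with a `predict_next` method standing in for a trained network (WaveNetEQ is reported to build on DeepMind's WaveRNN); it is a conceptual illustration, not the actual system.

```python
import numpy as np

def fill_gap(model, history: np.ndarray, gap_len: int) -> np.ndarray:
    """Generate `gap_len` samples one at a time, each conditioned on the past.

    `model.predict_next` is a hypothetical stand-in for a trained
    autoregressive network that maps recent samples to the next one.
    """
    context = list(history)
    generated = []
    for _ in range(gap_len):
        nxt = model.predict_next(np.asarray(context))  # one sample at a time
        generated.append(nxt)
        context.append(nxt)  # the prediction becomes part of the context
    return np.asarray(generated)
```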
Using neural networks, Google is able to give the signal continuity in real time, minimizing the loss of quality. Basically, thanks to its speech database, the model "guesses" what you were about to say and fills in the missing fragments of the wave.
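One detail worth noting about that continuity: when real packets resume after a concealed gap, the synthetic audio has to be blended into the real stream, or the listener hears a click at the seam. A short cross-fade is a common way to do this; the linear ramp below is an assumed illustration, not Duo's exact procedure.

```python
import numpy as np

def crossfade(synthetic_tail: np.ndarray, real_head: np.ndarray) -> np.ndarray:
    """Blend generated audio into the resumed real stream with a linear ramp."""
    n = min(len(synthetic_tail), len(real_head))
    ramp = np.linspace(0.0, 1.0, n)
    return synthetic_tail[:n] * (1.0 - ramp) + real_head[:n] * ramp
```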
These improvements are already being applied to Google Duo on the Pixel 4, although the company assures that the model will reach the rest of the terminals shortly: it will be an improvement to the application, not just to specific devices.
More information | Google