Google has AI that can filter a voice from a crowd
Years of exposure to music that is too loud have made it almost impossible for me to pick out a single voice in a group or (even worse) a crowd of people. Maybe in the future Google can help me, because it has developed a ‘deep learning’ system that can recognize specific voices and isolate them, in part by watching the faces of the people who are speaking.
The model is trained to watch individual people as they talk, so that it can match each sound to the person producing it and isolate that voice in a separate audio track. Take the video below of two comedians who are simply talking to each other: the AI can pick one of them out without any problem and make only that voice audible. This works even if the face of the person in question is partially blocked by a microphone or gesticulating hands.
Looking to listen
Google says it is exploring ways to use this technology in its products, and one application seems more than obvious: Google Home. Even at a busy party, Google’s smart assistant could then understand you when you say it’s really time to play A-Ha on Spotify or to turn down the volume.
But such a function would also be very useful in Google’s voice chat apps like Hangouts or Duo, because as a listener you could then focus on one person. If Google goes a step further, a variant of Google Glass could help people who no longer hear well by giving them a focused sound image of the person they are talking to. Throw in speech recognition and you can even get subtitles!
Bye bye privacy
There are plenty of possibilities, though naturally such a system can also create considerable privacy problems. If someone with this technology can filter your voice out of a group of people, you are no longer inaudible, even in a busy crowd, to a system that is powerful enough and has enough microphones and cameras. As if mass surveillance were not extensive enough already.