Back in May 2019, Google announced their new end-to-end AI translation tech called Translatotron.
They described their new tool as
“based on a sequence-to-sequence network which takes source spectrograms as input and generates spectrograms of the translated content in the target language.”
And while the reception has been overwhelming, the new speech translator is in its early days as not much is known about it.
Google has an iffy reputation with translations. Google Translate may have advanced since its launch, but it is still often described as rough, broken, and sometimes useless. Awkward translations have become a running joke on the internet. Not to mention, other translators often exist to counter Google Translate’s inability to produce good grammar. Which makes us ask: “How will Google’s Translatotron fare at speech translation?” It’s an interesting question considering how the process of translation works.
The Humanity in AI Translation
Previously, we have reported on the rise of voice search marketing and how it’s changing digital marketing and Search Engine Optimization (SEO). Current SEO practice uses chopped-up keywords that would never make sense in a sentence (often a joke among us copywriters). That’s because we use Google Search in a quirky way. When we type whatever we’re searching for, we don’t type whole sentences. It takes too much time, and even if we type loose fragments, results come up anyway.
With voice search, whether it’s Alexa, Siri, Bixby or Cortana, we use full sentences. “Siri, what’s the weather forecast today?” “Alexa, play Despacito.” These sentences have a different tone and grammar compared to what we type into Google Search. Voice search can also convey a different intent depending on how the words are said, because humanity in language is present when we speak.
Google Translate, Google’s text AI translator, has been criticized for often producing incomplete or wrong translations between languages. The resulting text often ends up requiring a native speaker to correct the grammar and word usage. That’s because when we talk the way we talk to Alexa, we talk like humans, and that humanity is often missing in AI translation.
AI Translation on Lost Languages
Recently, AI researchers tried to translate a language from 3,500 years ago. The hypothesis was that if AI could translate modern languages, it could also translate the languages used by ancient civilizations. The tests concluded that the AI performed better with less data on the progenitor language, contrary to the initial assumption that AI translation reaches its maximum potential by having more data. They called this technique the ‘minimum-cost flow’.
This applies to AI speech translation as well, and it speaks volumes about the idea that AI can replicate what humans are biologically capable of doing simply by having more data on its progenitor. It could mean that AI translation works best when used for a specific purpose, rather than hoping one tool could fix everything.
What the Translatotron Promises
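According to Google’s official page, speech-to-speech translation has traditionally been built as a cascade of three separate steps, which is the approach the Translatotron sets out to replace with a single model: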
- automatic speech recognition – to transcribe the source speech as text
- machine translation – to translate the transcribed text into the target language, and
- text-to-speech synthesis (TTS) – to generate speech in the target language from the translated text.
The Translatotron promises the ability to communicate across different languages by translating speech in one language directly into speech in another. What’s interesting is that it uses a spectrogram, a visual representation of the spectrum of sound frequencies over time, as the basis for its translations. The new AI also “bypasses intermediary text” to make the translation more straightforward.
The goal of the AI is to allow people to talk in different languages without difficulty. Although Google has been ambitious and adamant with its AI translation tools, the results often end up rough around the edges.
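To make the idea of a spectrogram more concrete, here is a minimal sketch in Python that turns a sound wave into a grid of frequency strengths over time. It is only an illustration and not Google’s implementation: the `spectrogram` function, the frame sizes, and the test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a basic short-time Fourier transform.

    Hypothetical helper for illustration; frame_len and hop are arbitrary choices.
    """
    window = np.hanning(frame_len)                  # taper each frame to reduce leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Rows are time frames, columns are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a pure 1000 Hz tone, sampled at 8000 Hz (made-up test input).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())          # strongest frequency bin on average
peak_hz = peak_bin * sample_rate / 256              # convert bin index back to Hz
print(spec.shape, peak_hz)
```

Each row of `spec` is a short slice of time and each column is a frequency band, which is why a spectrogram can be drawn as an image. For the test tone, the strongest column sits at 1000 Hz. Real speech would light up many bands at once, and that pattern is the kind of input a spectrogram-based model works from.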
The Future of AI Translation
The research for better AI has not ended. Though we have not received any updates about the Translatotron since its announcement, there has been significant interest in it since then. The future of AI translation is unclear, but it surely isn’t bleak. The Translatotron opens up opportunities for Google’s AI translation and, who knows, maybe in the future machine and human can finally communicate in perfect harmony.