Machine translation is quite difficult, especially between certain pairs of languages that vary greatly in how they handle implied context and intonation. At Google, the current translation system picks out known words and phrases, converts them to the target language, and blindly outputs them. This, unfortunately, ignores how the phrases are structured together.
Google has been working toward a newer system, though. Google Neural Machine Translation (GNMT) considers whole sentences rather than individual words and phrases. It lists candidate translations and weighs them based on how humans rate their quality. These values are stored and used to better predict subsequent choices, which should be a familiar concept to those who have been reading up on deep learning over the last couple of years.
This new system makes use of Google's “TensorFlow” library, released to the public last year under the permissive Apache 2.0 license. It will also be compatible with Google's custom Tensor Processing Unit (TPU) ASICs that were announced last May at Google I/O. The advantage of TPUs is that they can reach extremely high parallelism because they operate on extremely low-precision values.
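To give a rough feel for what "low-precision values" means, here is a minimal sketch of linear quantization: floating-point weights are mapped onto small integers and then mapped back with a small rounding error. The 8-bit width, the function names, and the sample weights are all assumptions chosen for illustration; this is not Google's implementation, and it does not use the TensorFlow API.

```python
def quantize(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1] on a linear scale.

    num_bits=8 is an illustrative assumption, not a figure from the article.
    """
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    # Avoid dividing by zero when every value is identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical weights, made up for this example.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# The worst-case rounding error is about half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

Smaller values need less silicon per arithmetic unit, so more of those units fit on a chip; the cost is the rounding error that the last lines measure.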
The GNMT announcement showed the new system attempting to translate English to and from Spanish, French, and Chinese. Each pairing, in both directions, showed a definite increase in quality, with French to English almost matching a human translation according to Google's quality metric. GNMT is currently live to the public when attempting to translate between Chinese and English, and Google will expand this to other languages “over the coming months”.
Reaching human level of translation means jack and shit! It’s impossible to watch an English TV show translated with Danish subtitles without it being full of errors and interpretations. When not even people doing this for a living can get it right, I really don’t want to know how bad the average person is at translating! I don’t speak Mandarin, but Google Translate sucks at Danish and you can see it right away when a news site has used it! I pity the fool relying on subtitles! Wish I spoke all languages; I just know I lose so much reading translations!
Google is all about the ads and nothing more!
It’s easy to translate words, less easy to translate sentences, and almost impossible to translate concepts that simply don’t exist in other cultures. Sometimes you’ve just got to experience it to get it.