Machine translation is quite difficult, especially between language pairs that differ greatly in how they handle implied context and intonation. At Google, the current translation system picks out known words and phrases, converts them to the target language, and blindly outputs them. This, unfortunately, ignores how those phrases fit together within the sentence.
Google has been working toward a newer system, though. Google Neural Machine Translation (GNMT) considers whole sentences, rather than individual words and phrases. It lists candidate translations and weighs them based on how humans rate their quality. These values are stored and used to better predict subsequent choices, which should be a familiar concept to those who have been reading up on deep learning over the last couple of years.
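To make the idea of weighing whole candidate sentences concrete, here is a minimal sketch using beam search, a common way to keep several partial translations alive instead of committing to the single best next word. The announcement as summarized here does not spell out the search procedure, so treat the choice of beam search as an assumption; `TOY_MODEL` and its probabilities are hypothetical stand-ins for a trained network.

```python
import math

# Hypothetical next-word probabilities standing in for a trained model.
# Each key is the tuple of words produced so far.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.55, "dog": 0.45},
    ("a",): {"cat": 0.9, "dog": 0.1},
    ("the", "cat"): {"<end>": 1.0},
    ("the", "dog"): {"<end>": 1.0},
    ("a", "cat"): {"<end>": 1.0},
    ("a", "dog"): {"<end>": 1.0},
}


def beam_search(model, beam_width=2, max_len=5):
    """Return finished sentences as (words, log-probability), best first."""
    beams = [((), 0.0)]  # partial sentences still being extended
    finished = []
    for _ in range(max_len):
        # Extend every surviving partial sentence by every possible next word.
        candidates = []
        for words, score in beams:
            for word, prob in model.get(words, {}).items():
                candidates.append((words + (word,), score + math.log(prob)))
        # Keep only the highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for words, score in candidates[:beam_width]:
            if words[-1] == "<end>":
                finished.append((words[:-1], score))
            else:
                beams.append((words, score))
        if not beams:
            break
    return sorted(finished, key=lambda c: c[1], reverse=True)


greedy_best = beam_search(TOY_MODEL, beam_width=1)[0][0]
beam_best = beam_search(TOY_MODEL, beam_width=2)[0][0]
```

With a beam width of 1 the search commits to "the" because it is the likeliest first word, and ends up with a lower-scoring sentence than the width-2 search, which keeps "a" around long enough to discover that "a cat" scores higher overall. That is the toy version of judging the sentence rather than its pieces.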
This new system makes use of Google's “TensorFlow” library, released to the public last year under the permissive Apache 2.0 license. It will also be compatible with Google's custom Tensor Processing Unit (TPU) ASICs, which were announced last May at Google I/O. The advantage of TPUs is that they can reach very high parallelism because they operate on very low-precision values: simpler arithmetic units take up less silicon, so more of them fit on a chip.
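As a rough illustration of what operating on low-precision values means, the sketch below squeezes floating-point numbers into small signed integers and does the multiply-accumulate work entirely in integers. The 8-bit width, the single scale factor per vector, and the function names are assumptions made for the example; none of this is taken from Google's description of the TPU.

```python
def quantize(values, bits=8):
    """Map floats to signed integers of the given width; return (ints, scale)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest else 1.0
    return [round(v / scale) for v in values], scale


def quantized_dot(a, b, bits=8):
    """Dot product computed on integers, rescaled to a float once at the end."""
    qa, scale_a = quantize(a, bits)
    qb, scale_b = quantize(b, bits)
    integer_sum = sum(x * y for x, y in zip(qa, qb))  # integer-only arithmetic
    return integer_sum * scale_a * scale_b


a = [0.5, -1.0, 0.25]
b = [1.0, 0.5, -0.75]
exact = sum(x * y for x, y in zip(a, b))
approx = quantized_dot(a, b)
```

The quantized result lands close to the exact one but not on it. That small loss of accuracy is the price paid for arithmetic that is much cheaper to build in hardware, which neural networks tend to tolerate well.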
The GNMT announcement showed the new system translating English to and from Spanish, French, and Chinese. Each pairing, in both directions, showed a definite increase in quality, with French to English almost matching a human translation according to Google's quality metric. GNMT is currently live to the public for translations between Chinese and English, and Google will expand this to other languages “over the coming months”.