Neural Machine Translation (NMT) continues to be a hot topic in the localization industry.
The technology has improved dramatically in a short amount of time, and its output keeps getting better as research advances. Given its efficiency and superior quality compared with earlier Statistical Machine Translation (SMT), the hype surrounding NMT is difficult to ignore.
What does this mean for the future of translation? Is it possible that NMT can replace human translators in the near future?
Not so fast.
Although it boasts surprisingly accurate translations, NMT still has its shortcomings. These flaws were evident in a competition earlier this year in which human translators outperformed NMT systems.
The main trouble the NMT applications ran into was their inability to interpret context, something humans do naturally. Understanding and deciphering nuanced content has been an ongoing issue for MT applications, and it showed in the translations they produced.
Organizers of the event even went on to describe the structure of the NMT translations as “grammatically awkward.”
Given that NMT is unable to decipher context in written text, the outcome of the competition is not so surprising. Among MT systems generally, NMT no doubt shows the greatest “potential,” and it is certainly a useful tool for assisting with large translation projects.
But one important question remains: Will NMT eventually do away with the need for human linguists?
What the Near Future Holds
An MT engine performs better in high-resource settings: the larger the parallel corpus for a given language pair, the better the system’s output. This makes sense given the data-hungry, deep-learning architecture NMT is built on.
So what about languages that are less prevalent and don’t have the same data sets available as more widely spoken ones?
Currently, NMT can accommodate languages such as Chinese, Japanese, and Spanish, but it is still not able to fully support those that are less commonly spoken.
For one, NMT systems don’t have enough data to learn from to produce accurate translations. Researchers have improved results by transferring what a system learns from closely related, better-resourced languages, but the approach is still far from perfect; a rough sketch of the idea follows.
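To make the related-language trick concrete, here is a minimal sketch of one common form of it, transfer learning: start from a model already trained on a high-resource pair, then fine-tune it on whatever scarce parallel data the low-resource language has. The checkpoint name, the toy English–Catalan data, and the training settings below are illustrative assumptions, not a system from the competition discussed above.

```python
# A minimal transfer-learning sketch (illustrative, not a production recipe):
# fine-tune a model pretrained on a related, high-resource pair (English-Spanish)
# using a tiny parallel corpus for a related target language (Catalan here).
import torch
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-es"  # assumed starting checkpoint
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Stand-in for the scarce parallel data of a lower-resource language.
pairs = [
    ("the house is small", "la casa és petita"),
    ("good morning", "bon dia"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):  # a real system would train far longer on far more data
    for src, tgt in pairs:
        batch = tokenizer(src, return_tensors="pt")
        labels = tokenizer(text_target=tgt, return_tensors="pt").input_ids
        loss = model(**batch, labels=labels).loss  # cross-entropy over target tokens
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

The point of the sketch is the starting point, not the loop: because the pretrained model already encodes a closely related language, far less low-resource data is needed than training from scratch would require.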
One shortcoming preventing NMT from supporting more language pairs is that researchers cannot find and collect sufficient data, especially when working with a language that is endangered or obscure.
On top of that, for systems this complex, learning a new language pair is a difficult and slow process. It takes a lot of time, and it costs a lot of money.
Given all of these factors, and notwithstanding the additional work currently being done, NMT models can accommodate only a handful more low-resource languages than they could in the past.
For now, the technology is still far from displacing current translation workflows, given that it cannot learn these low-resource languages in a short amount of time.
Promising Improvements
In the past, NMT systems had trouble translating rare words and less common vocabulary in a language pair. When the system encountered an uncommon word, it produced clumsy translations most of the time.
To address this, researchers incorporated a “post-processing” step, which allows the NMT system to translate a rare word using a dictionary after the rest of the message has been computed.
The process essentially lets rare words be “recognized” and translated after everything else has been processed: the system marks each unknown word with a placeholder, remembers which source word it corresponds to, and then swaps in a dictionary translation, or simply copies the source word, as with names. This helps maintain the grammaticality and fluency of the translated message.
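As a rough illustration of that post-processing idea, here is a small Python sketch. The “<unk:i>” placeholder format, the toy sentences, and the dictionary are assumptions made for the example; real systems typically derive the source alignment from the model’s attention rather than from an explicit index.

```python
# A small sketch of dictionary-based rare-word post-processing (illustrative).
# Assumes the decoder marks each unknown word as "<unk:i>", where i is the
# index of the aligned source word.
def replace_rare_words(source_tokens, output_tokens, dictionary):
    result = []
    for token in output_tokens:
        if token.startswith("<unk:") and token.endswith(">"):
            src_index = int(token[5:-1])      # which source word it aligns to
            src_word = source_tokens[src_index]
            # Use the dictionary if possible; otherwise copy the source word
            # unchanged (useful for names and other untranslatable tokens).
            result.append(dictionary.get(src_word, src_word))
        else:
            result.append(token)
    return result

# Toy example: "Zyxin" is a rare word the model could not translate directly.
source = ["the", "protein", "Zyxin", "binds", "actin"]
raw_output = ["la", "proteína", "<unk:2>", "se", "une", "a", "la", "actina"]
dictionary = {}  # no entry for "Zyxin", so it is copied as-is

print(" ".join(replace_rare_words(source, raw_output, dictionary)))
# -> "la proteína Zyxin se une a la actina"
```

Because the substitution happens after decoding, the surrounding sentence keeps the fluent structure the model produced; only the placeholder slot changes.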
Although the method is still far from perfect, rare-word handling should not remain an insurmountable challenge for NMT in the foreseeable future.
An Impressive Tool, But Still Not Magic
With these improvements, it is natural to think that MT is heading in the right direction. While that might be true, there is one important concept NMT systems have yet to conquer: deciphering nuanced context.
NMT remains a great way to produce an overall “good” translation, but it often has trouble understanding context.
Remedying this problem is a daunting task, for researchers must properly “train” NMT systems to truly think, not simply compute a “gist” translation. These systems need to process language in a more sophisticated manner, emulating the ways in which humans understand prose and context.
What’s Next?
NMT’s difficulty with context has presented developers with one of their greatest challenges, and it remains a major roadblock to surpassing the efficacy of human translators. The biggest issue remains how to teach NMT to truly think like a human translator: to decipher context and then synthesize that information.
Until then, however promising the improvements made to NMT, they all seem minor compared with what it still cannot do.
Despite this setback, practitioners remain optimistic and continue to make progress on the technology. But even with continuous development, it is hard to pinpoint exactly when, if ever, NMT will demonstrate the full capacity to replace human translators.
Improvements have been made, but for now, they aren’t enough to do away with human translators.