
Introduction to Machine Translation

Today, we frequently encounter information in languages that are unfamiliar to us: on the internet, while traveling, watching movies, or reading articles. However, this is rarely a problem, thanks to the convenience of machine translation tools that can instantly provide information in our language.

This topic will introduce you to the fundamental concepts of machine translation, its various applications, and its main types, which include rule-based, statistical, and neural machine translation methods.

What is machine translation?

Machine translation is a subfield of computational linguistics and natural language processing focused on automating translation from one language to another. Machine translation systems use algorithms to translate text automatically, and many of them improve by learning from examples of existing translations. From a user's perspective, the process is simple: you enter text, and the system processes it to generate a translation in the chosen language.

Contrary to popular belief, machine translation systems rarely work by directly mapping individual words to their counterparts in another language. Instead, modern systems learn how to translate from large datasets of paired texts in two languages (parallel corpora), without translation rules being explicitly programmed. As a result, different machine translation engines can produce different results for the same input, depending on their training data and the patterns they have learned.

Three people talk in different languages and still understand each other because of translation.

Domains of usage

The most obvious use case for machine translation is in automatic translation engines like Google Translate or DeepL. Besides this, machine translation has many other applications. Here are some of them:

Software and website localization;
International customer support;
Language learning apps and courses;
Consultations with foreign medical professionals;
Traveler's assistants offering visual translations of menus, signs, or product labels.

Types of machine translation

Machine translation has evolved through three main stages: rule-based systems, statistical methods, and neural network-based approaches. Rule-based methods dominated from the early decades of the field through the 1980s. Statistical methods emerged around 1990 and remained the standard until the mid-2010s. Neural network-based approaches have been prevalent since the mid-2010s and are currently considered the most advanced, although research into them began earlier. Still, understanding the mechanisms behind the earlier approaches is essential for grasping the differences between them and the kinds of results each can produce. The following sections give an overview of all three approaches. The statistical and neural network-based methods are covered in more depth in separate topics.

Rule-based approach

The rule-based approach relies on linguistic rules and bilingual dictionaries, usually compiled by human experts. These rules specify how words and phrases from the source language should be mapped to the target language. Although rule-based systems can be effective in specialized domains like science or technology, they struggle with idiomatic expressions and are costly to extend, because each new language pair needs its own set of rules and dictionaries.

In the example below, you'll see how the rule-based approach works: words in English are independently mapped to their German equivalents.

An example of word-by-word mapping of an English sentence into German.
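
The word-by-word mapping described above can be sketched in a few lines of Python. This is only a toy illustration, not how a production rule-based system is built: the dictionary `EN_DE` and the function `translate_word_by_word` are hypothetical names introduced here, and real systems add grammatical rules for word order, agreement, and inflection on top of dictionary lookups.

```python
# Toy bilingual dictionary (hypothetical entries chosen for illustration).
EN_DE = {
    "the": "die",
    "cat": "Katze",
    "drinks": "trinkt",
    "milk": "Milch",
}


def translate_word_by_word(sentence, dictionary):
    """Map each source word to its dictionary entry; keep unknown words unchanged."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


print(translate_word_by_word("The cat drinks milk", EN_DE))
# die Katze trinkt Milch
```

Because every word is looked up independently, an idiom such as "it is raining cats and dogs" would be translated literally, which is the kind of failure rule-based systems are known for.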

Statistical approach

Statistical machine translation leverages vast amounts of bilingual data to identify patterns and relationships between words and phrases in different languages. When a user inputs a text, the system evaluates these patterns to determine the most probable translation. Unlike rule-based systems, this method offers greater variability in possible translations and is more adaptable to different language pairs and domains. However, it can struggle with less common sentence structures and expressions.

Some systems even provide users with information on the statistical probability of particular words being translated in a certain way.

An example from an older version of Google Translate that shows the probabilities of different translations of a word.
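
To make the idea of "the most probable translation" concrete, here is a minimal sketch that estimates word translation probabilities by counting. It assumes the word pairs have already been aligned, which is a simplification: real statistical systems learn alignments from sentence pairs and combine translation probabilities with a language model. The data in `ALIGNED_PAIRS` and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical word-aligned pairs (source word, target word) taken from
# an imaginary English-German parallel corpus.
ALIGNED_PAIRS = [
    ("bank", "Bank"), ("bank", "Bank"), ("bank", "Bank"),
    ("bank", "Ufer"),
    ("house", "Haus"), ("house", "Haus"),
]


def estimate_translation_probs(pairs):
    """Estimate P(target | source) as relative frequencies of aligned pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    probs = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        probs[source] = {t: c / total for t, c in target_counts.items()}
    return probs


def most_probable(word, probs):
    """Pick the target word with the highest estimated probability."""
    candidates = probs[word]
    return max(candidates, key=candidates.get)


PROBS = estimate_translation_probs(ALIGNED_PAIRS)
print(PROBS["bank"])                 # {'Bank': 0.75, 'Ufer': 0.25}
print(most_probable("bank", PROBS))  # Bank
```

The sketch also shows the weakness mentioned above: a rare but correct translation ("Ufer" for a river bank) loses to the more frequent one unless the system takes context into account.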

Neural network-based approach

This approach is more advanced than the previous two. Rather than consulting hand-written rules and dictionaries or searching phrase statistics for the most likely translation, a neural model learns language patterns from large training corpora and reproduces them when translating. Neural machine translation models typically have complex architectures and can handle context-sensitive translations across many language pairs and domains, usually producing more fluent and accurate results than the earlier approaches. Today, it is the prevailing method for machine translation, although it is computationally demanding.

Many popular neural network-based machine translation engines include an encoder and a decoder in their architecture. The encoder interprets the input sentence in one language and converts it into a numerical representation. The decoder then uses this representation to produce a coherent translation in the target language.

An example of a neural MT architecture.
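
The encoder-decoder data flow can be sketched without any deep learning library. The code below is a rough illustration under strong assumptions: the vocabularies, the random "embeddings", and the state update are all made up for this example, and nothing is trained, so the output tokens are meaningless. It shows only the shape of the computation: the encoder turns a sentence into a numerical representation, and the decoder generates target tokens from it one at a time.

```python
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

DIM = 4  # size of the numerical representation (hypothetical)
SRC_VOCAB = ["i", "like", "tea"]
TGT_VOCAB = ["<eos>", "ich", "mag", "tee"]


def random_vector():
    return [random.uniform(-1, 1) for _ in range(DIM)]


# Untrained stand-ins for learned embedding tables.
SRC_EMB = {word: random_vector() for word in SRC_VOCAB}
TGT_EMB = {word: random_vector() for word in TGT_VOCAB}


def encode(tokens):
    """Encoder: convert source tokens into one fixed-size vector (mean of embeddings)."""
    vectors = [SRC_EMB[token] for token in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def decode(context, max_len=5):
    """Decoder: greedily emit target tokens until <eos> or max_len is reached."""
    state = list(context)
    output = []
    for _ in range(max_len):
        # Score every target word by its dot product with the current state.
        scores = {
            word: sum(s * e for s, e in zip(state, emb))
            for word, emb in TGT_EMB.items()
        }
        token = max(scores, key=scores.get)
        if token == "<eos>":
            break
        output.append(token)
        # Fold the emitted token back into the state; a crude stand-in
        # for the learned update a real decoder would apply.
        state = [0.5 * s - 0.5 * e for s, e in zip(state, TGT_EMB[token])]
    return output


context = encode(["i", "like", "tea"])
print(len(context))     # 4
print(decode(context))  # some sequence of target tokens; not a real translation
```

In a real neural system, the embeddings and update functions are neural network layers whose weights are learned from parallel corpora, which is what turns this data flow into actual translation.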

Conclusion

Machine translation is a natural language processing technology that is widely used in many fields today. There are three main approaches to machine translation: rule-based, statistical, and neural network-based. The rule-based approach relies on hand-written rules and dictionaries for a language pair. The statistical approach is based on the probability of a text sequence having a certain translation, estimated from bilingual data. The neural network-based approach leverages the model's ability to identify and reproduce language patterns, at the cost of complex architectures and substantial computational resources.
