How does Machine Translation (MT) work?

Translation has always played a crucial role in interlingual communication by making it possible for speakers of different languages to exchange knowledge and culture. The idea of using machines to translate first emerged in the seventeenth century, but it only became a reality in the twentieth. Work on MT continues, and no one knows what level it might reach.

Hutchins and Somers (1992:3) defined Machine Translation as follows: “The term Machine Translation (MT) is the now traditional and standard name for computerized systems responsible for the production of translation from one natural language into another, with or without human assistance.”

The American Translators Association (1994:109) gave a similar definition of Machine Translation: “MT is the technology whereby computers attempt to model the human process of translating between natural languages.”

Our main question in this post is: how does Machine Translation (MT) work?

Behind an automatic translation process, such as the one Google Translate performs, there is a complex operation. To understand this operation, we first have to understand the process human translators go through while translating. Human translators read and understand the text, and to do so they have to analyze three major features of it:

  1. the morphological features,
  2. the grammatical and syntactical features, and
  3. the semantic features.

When human translators successfully analyze the above-mentioned features, they will be able to produce a well-formed and meaningful target text “TT”.

When it comes to MT, the same fundamentals must exist in the MT system if it is to deliver the translation quality human translators deliver. MT goes through three major operations, each of which has sub-operations:

  1. the analysis operation,
  2. the transferring operation, and
  3. the generation operation.
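
The three operations can be pictured as a small pipeline. The following is only a minimal sketch: the function names, the two toy lexicons and the transliterated Arabic words are assumptions made for illustration, and a real MT system is far more elaborate at every stage.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (word, grammatical category)

# Hypothetical toy data, invented for this sketch only.
SOURCE_LEXICON = {"beautiful": "adjective", "girl": "noun"}
BILINGUAL_LEXICON = {"beautiful": "jamilaton", "girl": "fataton"}


def analyze(source_text: str) -> Tagged:
    """Analysis: split the ST into words and attach a category to each."""
    words = source_text.lower().split()
    return [(word, SOURCE_LEXICON.get(word, "unknown")) for word in words]


def transfer(analysis: Tagged) -> Tagged:
    """Transfer: replace each source word with a target word, keeping its category."""
    return [(BILINGUAL_LEXICON.get(word, word), cat) for word, cat in analysis]


def generate(target_units: Tagged) -> str:
    """Generation: put nouns before adjectives (a toy target-order rule) and join."""
    # sorted() is stable, so units keep their relative order within each group.
    ordered = sorted(target_units, key=lambda unit: unit[1] == "adjective")
    return " ".join(word for word, _ in ordered)


def translate(source_text: str) -> str:
    """Run the three operations one after another."""
    return generate(transfer(analyze(source_text)))


print(translate("beautiful girl"))  # fataton jamilaton
```

Each stage only consumes the output of the stage before it, which is why the analysis operation has to get its results right first.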

In this post, we are going to focus only on the analysis operation.

The Analysis Operation:

During this operation, MT analyzes the three major features mentioned previously. These features form a hierarchy because they are analyzed one after another, each stage building on the results of the one before it.

1- Morphological Analysis:

Alhmedan (2001) argues that during this process, MT programs use monolingual dictionaries related to the language of the source text “ST” in order to identify the morphological, grammatical, syntactical and semantic features of each word in the ST.

Monolingual dictionaries identify the following:

  • The grammatical category of each word in the ST: for example, whether a word is a noun, a verb, an adjective, an adverb, etc.
  • The grammatical subcategorization features of each word in the ST: for example, whether a noun is singular or plural, masculine or feminine, or whether a verb is transitive or intransitive.
  • The semantic features of each word in the ST: for example, whether a noun is abstract or common, or whether the action of a verb requires a human doer or something else.
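
One way to picture such a dictionary is as a lookup table from each word to a record holding the three kinds of information listed above. The sketch below is an assumption for illustration: the `LexiconEntry` class, the field names and the handful of English entries are invented, not taken from any real MT system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class LexiconEntry:
    category: str                                               # noun, verb, adjective, ...
    subcategory: Dict[str, str] = field(default_factory=dict)   # number, gender, transitivity, ...
    semantic: Set[str] = field(default_factory=set)             # abstract, common, human_doer, ...


# Hypothetical toy monolingual dictionary for the source language.
MONOLINGUAL_DICTIONARY: Dict[str, LexiconEntry] = {
    "girl": LexiconEntry("noun", {"number": "singular", "gender": "feminine"}, {"common"}),
    "girls": LexiconEntry("noun", {"number": "plural", "gender": "feminine"}, {"common"}),
    "freedom": LexiconEntry("noun", {"number": "singular"}, {"abstract"}),
    "talk": LexiconEntry("verb", {"transitivity": "intransitive"}, {"human_doer"}),
}


def morphological_analysis(source_text: str) -> List[Tuple[str, Optional[LexiconEntry]]]:
    """Look up every word of the ST; unknown words map to None."""
    words = source_text.lower().split()
    return [(word, MONOLINGUAL_DICTIONARY.get(word)) for word in words]


for word, entry in morphological_analysis("Girls talk"):
    print(word, entry)
```

A real dictionary would also have to handle inflected forms that are not listed as entries, which this sketch ignores.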

2- Grammatical Analysis:

Building on the results of the previous analysis, this process analyzes the grammatical features of the phrases and sentences in the ST in order to identify their structures. During this process, MT analyzes three components of the relations that exist within the sentences and phrases of the ST:

  • Sequence:

Alhmedan (2001) argues that Arabic and English have two different sequence systems. For example, in Arabic the noun precedes the adjective (example 1), while in English the adjective precedes the noun (example 2).

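1- Arabic example (read from right to left, so the noun comes first):

  • فتاة (noun): Fataton, meaning “girl”
  • جميلة (adjective): Jamilaton, meaning “beautiful”

2- English example (read from left to right, so the adjective comes first):

  • A beautiful (adjective)
  • girl (noun)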
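
The sequence rule can be expressed as a check on the order of grammatical categories. This is a minimal sketch under the assumption that the words have already been tagged with their categories; the table and function names are hypothetical.

```python
from typing import List

# Expected order of a noun and its adjective in each language, as described above.
NOUN_ADJECTIVE_ORDER = {
    "arabic": ("noun", "adjective"),
    "english": ("adjective", "noun"),
}


def follows_sequence(categories: List[str], language: str) -> bool:
    """Return True if every adjacent noun/adjective pair is in the expected order."""
    expected = NOUN_ADJECTIVE_ORDER[language]
    for pair in zip(categories, categories[1:]):
        if set(pair) == {"noun", "adjective"} and pair != expected:
            return False
    return True


print(follows_sequence(["article", "adjective", "noun"], "english"))  # True
print(follows_sequence(["adjective", "noun"], "arabic"))              # False
print(follows_sequence(["noun", "adjective"], "arabic"))              # True
```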

  • Relation:

Alhmedan (2001) argues that during this process, MT analyzes the relations between categories. For example, a preposition determines the morphological features of the noun that follows it, so the preposition and the noun together form one prepositional phrase. Thus, each word has a relation to the words before and after it.
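
The prepositional-phrase example can be sketched as a simple grouping step over tagged words. The function name and the sample sentence are assumptions for illustration, and the rule only covers a preposition immediately followed by a noun.

```python
from typing import List, Tuple


def group_prepositional_phrases(tagged: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Merge each preposition with the noun right after it into one phrase."""
    phrases = []
    i = 0
    while i < len(tagged):
        word, category = tagged[i]
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1] == "noun"
        if category == "preposition" and next_is_noun:
            phrases.append(("prepositional phrase", [word, tagged[i + 1][0]]))
            i += 2  # both words were consumed by the phrase
        else:
            phrases.append((category, [word]))
            i += 1
    return phrases


sentence = [("she", "pronoun"), ("lives", "verb"), ("in", "preposition"), ("cairo", "noun")]
for label, words in group_prepositional_phrases(sentence):
    print(label, words)
```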

  • Structure:

Alhmedan (2001) argues that almost every language has a specific system for combining words into sentences and texts. We are dealing here only with Arabic and English. For example, Arabic sentences tend to start with a verb, while English sentences tend to start with a noun.

3- Semantic Analysis:

As mentioned above, part of the work of identifying semantic features already happens during morphological analysis, but that analysis sometimes assigns more than one meaning to a single word, which makes the sentence ambiguous. To resolve this ambiguity, MT tries to identify the semantic features of each word in the ST, which helps it narrow down the characteristics of the word. For example, the word (human) has the following semantic features: physical, movable, and alive.
These semantic features are linked to certain characteristics of the word (human): humans can eat, walk and talk. Animals share the same semantic features as humans, but not all of the same characteristics. For example, monkeys are physical, movable, and alive. They can eat and walk, but they cannot talk.
Identifying these features and the characteristics related to them helps MT avoid ambiguity when a word in a sentence has several possible meanings.
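
The idea can be sketched as a filter over the possible senses of a word: a sense is kept only if its characteristics allow the action the sentence describes. The `Sense` class, the `disambiguate` function and the ambiguous word "speaker" are assumptions chosen for illustration; only the human and monkey features come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Sense:
    gloss: str
    features: FrozenSet[str]         # semantic features: physical, movable, alive
    characteristics: FrozenSet[str]  # what this sense can do: eat, walk, talk

ANIMATE = frozenset({"physical", "movable", "alive"})

# Hypothetical toy sense inventory.
SENSES: Dict[str, List[Sense]] = {
    "human": [Sense("human being", ANIMATE, frozenset({"eat", "walk", "talk"}))],
    "monkey": [Sense("monkey", ANIMATE, frozenset({"eat", "walk"}))],
    "speaker": [
        Sense("person who speaks", ANIMATE, frozenset({"eat", "walk", "talk"})),
        Sense("loudspeaker", frozenset({"physical"}), frozenset()),
    ],
}


def disambiguate(word: str, action: str) -> List[Sense]:
    """Keep only the senses of `word` that are able to perform `action`."""
    return [sense for sense in SENSES.get(word, []) if action in sense.characteristics]


print([s.gloss for s in disambiguate("speaker", "walk")])  # ['person who speaks']
print([s.gloss for s in disambiguate("monkey", "talk")])   # []
```

Humans and monkeys share the same features here but differ in their characteristics, which is exactly what lets the filter tell them apart.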

 

References:

  1. Alhmedan, Abed Allah ben Hamad (2001), An Introduction to Machine Translation, Ale’becan library, KSA.
  2. American Translators Association (1994), Professional Issues for Translators and Interpreters, John Benjamins Publishing Company, Amsterdam/Philadelphia.