Week 10 Flashcards

1
Q

What properties are desirable when translating text from one natural language into another?

A
<ul>
<li>Preserve the meaning</li>
<li>Be fluent and intelligible</li>
<li>Take cultural differences into account</li>
</ul>
2
Q

What is computer-aided human translation?

A

When a computer makes a rough translation that is then post-edited by humans

3
Q

What is a sublanguage?

A

Text in which the vocabulary and grammatical constructions are limited

4
Q

Why is machine translation hard?

A

Because of the many differences between languages

5
Q

What is typology?

A

The study of differences and similarities between languages

6
Q

What are some things that languages can differ in?

A

<ul>
<li>Morphological divergence (order and quantity)</li>
<li>Differences in word order</li>
<li>Differences in referential density</li>
<li>Lexical divergence</li>
</ul>

7
Q

What is lexical divergence?

A

Variation in how conceptual properties are mapped to specific words

8
Q

What are the two important dimensions along which languages differ morphologically?

A

<ol>
<li>Number of morphemes per word</li>
<li>Degree to which morphemes are segmentable</li>
</ol>

9
Q

What are the two types of languages at the extremes of number of morphemes per word?

A

<ul>
<li>Isolating</li>
<li>Polysynthetic</li>
</ul>

10
Q

Describe an isolating language and give an example of one

A

Each word generally consists of a single morpheme, so words have no internal morphology; there is a lot of compounding
Example: Chinese

11
Q

Describe a polysynthetic language and give an example of one

A

A single word can contain many morphemes and express what we would think of as a whole sentence
Example: Inuit languages

12
Q

What are the names for the two extremes of how morphemes are segmentable?

A

<ul>
<li>Agglutinative</li>
<li>Fusional</li>
</ul>

13
Q

Describe an agglutinative language

A

Clean boundaries between morphemes

14
Q

Give examples of agglutinative languages

A

Finnish, Turkish

15
Q

Describe a fusional language

A

One affix can function as several morphemes merged together

16
Q

Give an example of a fusional language

A

German

17
Q

What are the three common orderings of Subject, Verb and Object?

A
<ul>
<li>SVO</li>
<li>VSO</li>
<li>SOV</li>
</ul>
18
Q

Give examples of languages with SVO order

A

English, German

19
Q

Give examples of languages with VSO order

A

Arabic, Hebrew

20
Q

Give examples of languages with SOV order

A

Japanese

21
Q

What are the two types of adpositions?

A

<ul>
<li>Prepositions</li>
<li>Postpositions</li>
</ul>

22
Q

What is an adposition?

A

A part of speech (POS) that usually combines with a noun phrase (NP) to express spatial or temporal relations or to mark various semantic roles

23
Q

Describe what it means for a language to be pro-drop

A

Pronouns, especially subject pronouns, can be omitted and recovered from context

24
Q

Give examples of languages that are pro-drop

A

Italian, Spanish, Japanese

25
Q

How are pro-drop and referential density related?

A

Pro-drop languages have low referential density, because they use fewer explicit pronouns

26
Q

What does it mean for direction to be verb framed?

A

The direction of motion is captured by the verb

27
Q

What does it mean for direction to be satellite framed?

A

The direction of motion is marked on satellites such as particles, prepositional phrases (PPs) and adverbial phrases (APs)

28
Q

Give an example of how different languages may lexically diverge

A

Some languages make finer distinctions than others for the same concept. For example, Welsh splits the English verb "know" into two words:
nabod = to know someone
gwybod = to know something

29
Q

What is required for translating polysynthetic languages to isolating languages?

A

Translation from morphemes to words

30
Q

What is required for translating SVO to VSO languages?

A

Reordering/global reconstruction

31
Q

What is required for translating between pro-drop and referentially dense languages?

A

The correct pronouns have to be recovered

32
Q

What is required for translating between verb-framed and satellite-framed languages?

A

Global reconstruction

33
Q

What is required for translating between lexically diverged languages?

A

Determining appropriate subsense of word

34
Q

What can we ask humans about to evaluate a translation?

A

<ul>
<li>How natural, clear and readable (fluent) the text is</li>
<li>How much of the original information is retained (adequacy)</li>
<li>Does it fulfil a certain task</li>
</ul>

35
Q

How can you objectively evaluate a machine translation?

A

<ul>
<li>Time spent reading</li>
<li>Keystrokes needed for correction</li>
</ul>

36
Q

What does BLEU stand for?

A

Bilingual evaluation understudy

37
Q

How do BLEU scores measure the quality of MT?

A

It measures the overlap between the n-grams of the machine translation and the n-grams of the gold-standard (reference) translation
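A minimal sketch of the n-gram overlap idea, assuming whitespace-tokenised sentences and a single reference. It computes only a clipped n-gram precision for one value of n; full BLEU also combines several n-gram orders and applies a brevity penalty, which are left out here. The function names are illustrative, not taken from any library.

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    # Fraction of candidate n-grams that also occur in the reference.
    # Each n-gram is credited at most as often as it appears in the reference.
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total

candidate = "the the the cat".split()
reference = "the cat sat".split()
unigram_precision = clipped_precision(candidate, reference, 1)
bigram_precision = clipped_precision(candidate, reference, 2)
```

The clipping step is what stops a candidate from scoring well just by repeating a common reference word.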

38
Q

What is interlingua?

A

An artificial language, devised for machine translation, that makes explicit the distinctions necessary for successful translation into a target language, even where they are not present in the source language

39
Q

What are the four levels on the Vauquois triangle?

A
<ol>
<li>Direct</li>
<li>Syntactic transfer</li>
<li>Semantic transfer</li>
<li>Interlingua</li>
</ol>
40
Q

Describe direct transfer

A

<ul>
<li>Each word is translated individually</li>
<li>Bilingual dictionary used</li>
<li>No analysis of global syntactic structure and meaning</li>
</ul>
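A minimal sketch of word-by-word translation with a bilingual dictionary. The three-word English-to-Spanish dictionary and the function name are made up for illustration.

```python
# Hypothetical toy English->Spanish dictionary, for illustration only.
BILINGUAL_DICT = {"the": "la", "green": "verde", "house": "casa"}

def direct_translate(sentence, dictionary):
    # Translate each word on its own; unknown words are copied through unchanged.
    # No syntactic analysis is done, so the source word order is kept.
    return " ".join(dictionary.get(word, word) for word in sentence.lower().split())

result = direct_translate("the green house", BILINGUAL_DICT)
print(result)  # la verde casa
```

Fluent Spanish would be "la casa verde", so even this short phrase needs a reordering step that pure word-by-word lookup does not provide.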

41
Q

What are the positives of direct transfer?

A

Fast

42
Q

What are negatives of direct transfer?

A

Cannot do non-local reordering, and generally results in poor translations

43
Q

What are the three phases of syntactic transfer?

A
<ol>
<li>Analysis</li>
<li>Transfer</li>
<li>Generation</li>
</ol>
44
Q

What is involved in the transfer stage of syntactic transfer?

A

<ul>
<li>Finding corresponding words (lexical transfer)</li>
<li>Syntactic transfer (checking rules reflecting structural differences between two languages)</li>
</ul>

45
Q

What kind of things can be sorted with semantic transfer?

A

<ul>
<li>Distinguish between the different types of PPs</li>
<li>Idioms</li>
<li>When one word can have different meanings (word sense disambiguation)</li>
</ul>

46
Q

What’s a disadvantage of multi-language environments?

A

Translating between many languages pairwise needs one component per pair of languages, which is quadratically many components in the number of languages

47
Q

What is the benefit of interlingua?

A

Requires only linearly many components for multi-language environments
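A small arithmetic sketch of the linear versus quadratic growth. It assumes one translation component per ordered language pair for the pairwise setup, and one analyser plus one generator per language for the interlingua setup; both function names are illustrative.

```python
def pairwise_components(n_languages):
    # One translation component for every ordered (source, target) pair.
    return n_languages * (n_languages - 1)

def interlingua_components(n_languages):
    # One analyser into the interlingua and one generator out of it per language.
    return 2 * n_languages

for n in (3, 10, 20):
    print(n, pairwise_components(n), interlingua_components(n))
```

Under these assumptions, 10 languages need 90 pairwise components but only 20 interlingua components.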

48
Q

If F represents a sentence in a foreign language and E represents an English sentence, what probability captures the faithfulness of the translation?

A

P(F|E)

49
Q

If F represents a sentence in a foreign language and E represents an English sentence, what probability captures the fluency of the translation?

A

P(E)
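A minimal sketch of how the faithfulness term P(F|E) and the fluency term P(E) can be combined to pick a translation: choose the English candidate E that maximises P(F|E) * P(E). The candidate sentences and all probabilities below are invented for illustration, for one fixed foreign sentence F.

```python
def best_translation(candidates, translation_model, language_model):
    # Pick the English sentence E that maximises P(F|E) * P(E).
    return max(candidates, key=lambda e: translation_model[e] * language_model[e])

# Made-up values of P(F|E) for a single fixed foreign sentence F (faithfulness).
translation_model = {
    "house the green": 0.30,
    "the green house": 0.25,
    "the green home": 0.20,
}
# Made-up values of P(E) (fluency).
language_model = {
    "house the green": 0.001,
    "the green house": 0.05,
    "the green home": 0.02,
}

best = best_translation(list(translation_model), translation_model, language_model)
print(best)  # the green house
```

The scrambled candidate has the highest faithfulness score here, but its very low fluency score means it loses overall.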

50
Q

Describe alignment in translation of foreign sentences

A

For each foreign word f_j, an alignment gives a position a_j = i, meaning that f_j corresponds to the English word occurrence e_i

51
Q

What does IBM Model 1 assume?

A

<ul>
<li>The probability of the length of the foreign sentence depends only on the length of the English sentence</li>
<li>Given the lengths of the English and the foreign sentences, each alignment is equally likely</li>
<li>The probability of f_j depends only on e_{a_j}, the English word it is aligned to</li>
</ul>

52
Q

If the English sentence had a length of I, and the foreign sentence had a length of J how many possible alignments can you get?

A

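(I+1)^J, since each of the J foreign words can align to any of the I English words or to no English word at all (the empty position)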
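A quick check of this count by brute force, assuming position 0 stands for the empty (NULL) English word and positions 1..I for the real English words. The function name is illustrative.

```python
from itertools import product

def all_alignments(english_len, foreign_len):
    # Each alignment assigns every foreign position j a value a_j in 0..I,
    # where 0 means "aligned to no English word".
    return list(product(range(english_len + 1), repeat=foreign_len))

alignments = all_alignments(2, 3)
print(len(alignments))  # 27, i.e. (2 + 1) ** 3
```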

53
Q

What is used to train alignment models?

A

Parallel corpus (bitext)

54
Q

What is required for training alignment models but is not usually provided?

A

Alignments

55
Q

How is P(f | e) estimated?

A

<ol>
<li>Take initial estimates</li>
<li>Determine expected count of pairs (f,e) and jump widths</li>
<li>New estimate of P(f | e) and probabilities of jump widths</li>
<li>Iterate till convergence</li>
</ol>
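The loop above can be sketched as expectation-maximisation for the word translation probabilities only. This is a simplified sketch under stated assumptions: all alignments are treated as equally likely, so jump-width probabilities are left out, an extra "NULL" English word is added to every sentence, and the function name and the three-sentence German-English bitext are made up for illustration.

```python
from collections import defaultdict

def train_translation_probs(bitext, iterations=10):
    # bitext: list of (foreign_tokens, english_tokens) pairs.
    foreign_vocab = {f for foreign, _ in bitext for f in foreign}
    # Step 1: initial estimate, uniform over the foreign vocabulary.
    t = defaultdict(lambda: 1.0 / len(foreign_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of each (f, e) pair
        total = defaultdict(float)   # expected count of each e
        # Step 2: collect expected counts under the current estimate.
        for foreign, english in bitext:
            english = ["NULL"] + english
            for f in foreign:
                norm = sum(t[(f, e)] for e in english)
                for e in english:
                    fraction = t[(f, e)] / norm
                    count[(f, e)] += fraction
                    total[e] += fraction
        # Step 3: re-estimate P(f | e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    # Step 4 is approximated by a fixed number of iterations.
    return t

bitext = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = train_translation_probs(bitext)
```

After a few iterations, "das" becomes the most probable translation of "the" even though no alignments were given, because the two words co-occur across different sentence pairs.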

56
Q

What is the jump width?

A

The difference between the English position aligned to the previous foreign word and the English position aligned to the current foreign word (a_j - a_{j-1})

57
Q

What is the constraint on the jump width?

A

Has to be between 0 and I

58
Q

If we have an unseen foreign sentence, what is our problem and how can we solve it?

A
Problem: we can't enumerate all possible English sentences
Solution: use a combination of
<ul>
<li>A* search</li>
<li>Beam search pruning</li>
</ul>
to create an eager guessing method
59
Q

What is beam search?

A

A heuristic search algorithm that explores a graph by expanding the most promising node in a limited set
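A minimal generic sketch of beam search, assuming states are partial outputs, `expand` lists the successors of a state and `score` rates a state (higher is better). The toy vocabulary and scores are invented; a real decoder would score partial translations with translation and language model probabilities.

```python
import heapq

def beam_search(start, expand, score, beam_width, steps):
    # Keep only the beam_width best partial hypotheses after each step.
    beam = [start]
    for _ in range(steps):
        candidates = [nxt for state in beam for nxt in expand(state)]
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return max(beam, key=score)

# Toy problem: build a two-word sequence with the highest total word score.
WORD_SCORES = {"a": 1, "b": 2, "c": 3}

def expand(state):
    return [state + (word,) for word in WORD_SCORES]

def score(state):
    return sum(WORD_SCORES[word] for word in state)

best = beam_search((), expand, score, beam_width=2, steps=2)
print(best)  # ('c', 'c')
```

A smaller beam is faster but can prune a hypothesis that would have turned out best later, which is why beam search is a heuristic rather than an exact search.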

60
Q

Describe a synchronous CFG

A

Consists of two CFGs, one for the source language and one for the target language
The rules of the two CFGs are connected pairwise
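A minimal sketch of paired rules, assuming each nonterminal has exactly one rule and appears at most once on each side of a rule, so the link between the two sides can be made by name. The tiny English-to-Japanese grammar (in romanised form) is made up for illustration; the VP rule shows the target side reordering verb and object.

```python
# Each nonterminal maps to a pair: (source right-hand side, target right-hand side).
RULES = {
    "S":   (["NP", "VP"], ["NP", "VP"]),
    "VP":  (["V", "OBJ"], ["OBJ", "V"]),   # verb and object swapped on the target side
    "NP":  (["she"], ["kanojo-wa"]),
    "V":   (["reads"], ["yomu"]),
    "OBJ": (["books"], ["hon-o"]),
}

def derive(symbol):
    # Expand a nonterminal in both grammars at once and return (source, target) words.
    src_rhs, tgt_rhs = RULES[symbol]
    # Each linked nonterminal is expanded once and reused on both sides.
    sub = {s: derive(s) for s in src_rhs if s in RULES}
    source = [w for s in src_rhs for w in (sub[s][0] if s in sub else [s])]
    target = [w for s in tgt_rhs for w in (sub[s][1] if s in sub else [s])]
    return source, target

source, target = derive("S")
print(" ".join(source))  # she reads books
print(" ".join(target))  # kanojo-wa hon-o yomu
```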

61
Q

What does a synchronous CFG model?

A

Syntactic transfer