POS Flashcards

1
Q

What does POS stand for?

A

Parts of Speech

2
Q

How did POS originate?

A

It originated around 100 BC, when Dionysius Thrax of Alexandria attempted to summarise Greek linguistic knowledge

3
Q

What can POS also be called?

A

Word classes, morphological classes, lexical tags

4
Q

What are POS generally assigned to?

A

Individual words or morphemes

5
Q

What is labelling POS known as?

A

POS Tagging

6
Q

What are proper names?

A

A proper name is called a Named Entity, and can be a multi-word phrase. Labelling named entities is called Named Entity Recognition (NER)

7
Q

Why are POS and NEs useful?

A

POS gives us clues to neighbouring words and syntactic structure. POS tagging is a key aspect of parsing natural language. NER is important to many natural language tasks such as question answering, stance detection and information extraction

8
Q

What type of task is POS?

A

It is a sequence labelling task

9
Q

What is the input, output and their lengths for POS labelling?

A

The input X is a sequence of words, the output Y is a sequence of POS tags and the length of X and Y are equal
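A minimal sketch of this input/output pairing (the example sentence and Penn Treebank-style tags here are invented for illustration, not from the source):

```python
# A tagged sentence: input X (words) and output Y (POS tags)
# are parallel sequences of equal length.
X = ["the", "dog", "barks"]   # input: sequence of words
Y = ["DT", "NN", "VBZ"]       # output: one POS tag per word

assert len(X) == len(Y)       # the lengths are always equal

for word, tag in zip(X, Y):
    print(f"{word}/{tag}")    # prints "the/DT", "dog/NN", "barks/VBZ"
```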

10
Q

What is a closed class?

A

These are classes with fixed membership. They typically contain function words used to structure grammar (e.g. of, it, and, you)

11
Q

What is an open class?

A

These have open membership. They typically contain nouns, verbs, adjectives, adverbs and interjections. Language changes over time: new vocabulary emerges and words may take on new meanings

12
Q

What is a list of POS labels called?

A

It is called a tagset

13
Q

What are some popular tagsets?

A

Penn Treebank

Brown Corpus

C7 Tagset

14
Q

What is POS tagging?

A

It is the process of assigning a POS tag to each word in a text

15
Q

Why is POS Tagging a disambiguation task?

A

Words are ambiguous and can have more than one possible POS, so we need to find the correct tag for the given context

16
Q

What are some ways you can build a POS tagger?

A

Rule-based taggers (hand crafted disambiguation rules)

Transformation-based taggers (supervised learning of tagging rules + some hand crafted templates)

Hidden Markov Models (HMM)

Conditional Random Fields (CRF)

17
Q

What is a Markov chain?

A

It models the probability of the next state given the current state

18
Q

What assumption is held with the Markov chain?

A

That the future depends only on the current state (not past)
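In standard notation (with q_i denoting the state at step i), the Markov assumption is:

```latex
P(q_i \mid q_1, q_2, \ldots, q_{i-1}) = P(q_i \mid q_{i-1})
```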

19
Q

What does the image show?

A

It shows the Markov assumption, where the future depends only on the current state

20
Q

What is a state in a Markov Model?

A

It is a word; we have a sequence of state variables, and so a sequence of words

21
Q

Explain what is shown in the image

A

It shows a Markov chain. In Figure b, we can say that the probability of the word "are", given the word "uniformly", is 0.4:

P(are | uniformly) = 0.4
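A Markov chain's transition probabilities can be stored as a nested mapping; a hypothetical sketch (the uniformly/are pair echoes the card's example, the other words and numbers are invented for illustration):

```python
# Transition probabilities P(next_word | current_word) for a toy Markov chain.
# The values for each current word must sum to 1.
transitions = {
    "uniformly": {"are": 0.4, "is": 0.6},   # P(are | uniformly) = 0.4, as in the card
    "are":       {"charming": 1.0},
}

def next_prob(current, nxt):
    """Probability of moving to state `nxt` given the current state `current`."""
    return transitions.get(current, {}).get(nxt, 0.0)

print(next_prob("uniformly", "are"))   # 0.4
```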

22
Q

Why is a basic Markov Model different to a Hidden Markov Model?

A

A basic Markov model requires all events to be observable; in a Hidden Markov Model some events (the tags) are hidden and must be inferred from the observed words

23
Q

Explain what the image shows.

A

It shows that in a Markov model we have a set Q of N states

We have a transition probability matrix, A, which gives the probability of moving from state i to state j

We have the initial probability distribution, which gives the probability of each state occurring at the start, before any transitions have been made
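Those three components can be written out directly; a sketch with invented state names and numbers, purely for illustration:

```python
# Components of a (fully observed) Markov model over N = 3 states.
Q = ["HOT", "COLD", "WARM"]   # Q: the set of N states

# Transition probability matrix A: A[i][j] = P(state j | state i).
# Each row is a probability distribution, so each row sums to 1.
A = [
    [0.6, 0.1, 0.3],
    [0.1, 0.8, 0.1],
    [0.3, 0.1, 0.6],
]

# Initial probability distribution pi: probability of each state at the start.
pi = [0.5, 0.2, 0.3]

assert all(abs(sum(row) - 1.0) < 1e-9 for row in A)
assert abs(sum(pi) - 1.0) < 1e-9
```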

24
Q

What is the idea of the Hidden Markov Model?

A

POS tags are hidden states, which we must infer from the observed words.

States are now POS tags, and words represent observations, which we can see

25
Q

What is the Markov Assumption applied to POS?

A

That the next tag (state) depends only on the current tag (state)

26
Q

What is the output independence assumption?

A

It is that the probability of an observation (word) depends only on the current state (tag)
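In standard notation (with o_i the observation and q_i the state at step i), the output independence assumption is:

```latex
P(o_i \mid q_1, \ldots, q_i, \ldots, q_T, \, o_1, \ldots, o_{i-1}) = P(o_i \mid q_i)
```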

27
Q

What does the image show?

A

It shows that in an HMM we have a set of states, Q

We have a transition probability matrix, A, which is the probability of the next tag, given the current tag

Matrix B is a matrix of observation likelihoods, which give the probability of a word (observation) given a state (tag)

We still have the initial probability distribution and we also have a sequence of observations, which are words
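A tiny HMM with those components; the tags, words, and probabilities are invented purely for illustration:

```python
# HMM components for a toy POS tagger (all numbers are illustrative).
Q = ["DT", "NN", "VB"]          # hidden states: POS tags

# Transition matrix A: A[i][j] = P(tag j | tag i); rows sum to 1.
A = [
    [0.0, 0.9, 0.1],            # after a determiner, a noun is very likely
    [0.3, 0.2, 0.5],
    [0.6, 0.3, 0.1],
]

# Observation likelihoods B: B[i][word] = P(word | tag i).
B = [
    {"the": 1.0},               # DT emits "the"
    {"dog": 0.6, "walk": 0.4},  # NN
    {"walk": 0.7, "dog": 0.3},  # VB
]

pi = [0.8, 0.1, 0.1]            # initial probability distribution over tags
O = ["the", "dog"]              # the sequence of observations: words we can see
```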

28
Q

What does decoding do in POS Tagging?

A

Given a model and a sequence of observations, decoding aims to find the most probable sequence of states

29
Q

When decoding, what are some assumptions we make?

A

The probability of a word depends only on its tag (it is independent of neighbouring words and tags)

The probability of a tag depends only on the previous tag (bigram assumption)

30
Q

What is the Viterbi Algorithm?

A

It is an efficient dynamic-programming algorithm for HMM decoding
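A compact Viterbi sketch under the decoding assumptions above (bigram transitions, each word depending only on its tag); the toy model at the bottom is invented for illustration:

```python
def viterbi(obs, states, pi, A, B):
    """Return the most probable state sequence for `obs` (dynamic programming)."""
    n, N = len(obs), len(states)
    # v[t][s]: probability of the best path ending in state s at time t
    v = [[0.0] * N for _ in range(n)]
    back = [[0] * N for _ in range(n)]

    for s in range(N):                              # initialisation step
        v[0][s] = pi[s] * B[s].get(obs[0], 0.0)

    for t in range(1, n):                           # recursion step
        for s in range(N):
            best = max(range(N), key=lambda r: v[t - 1][r] * A[r][s])
            v[t][s] = v[t - 1][best] * A[best][s] * B[s].get(obs[t], 0.0)
            back[t][s] = best

    last = max(range(N), key=lambda s: v[n - 1][s]) # termination + backtrace
    path = [last]
    for t in range(n - 1, 0, -1):
        last = back[t][last]
        path.append(last)
    path.reverse()
    return [states[s] for s in path]

# Toy model: tag "the dog" with DT/NN (numbers invented for illustration).
states = ["DT", "NN"]
pi = [0.9, 0.1]
A = [[0.1, 0.9], [0.5, 0.5]]
B = [{"the": 1.0}, {"dog": 1.0}]
print(viterbi(["the", "dog"], states, pi, A, B))   # ['DT', 'NN']
```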

31
Q

What is the full equation for a HMM?

A

The image shows the equation. The emission probability is the probability of a word given a state (POS tag), and the transition probability is the probability of a POS tag given the previous POS tag (i.e. the previous state)
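Since the image is not reproduced here, the equation it describes, reconstructed in the standard HMM-tagger notation (words w_i, tags t_i), is:

```latex
\hat{t}_{1:n} = \operatorname*{argmax}_{t_1 \ldots t_n} \prod_{i=1}^{n}
  \underbrace{P(w_i \mid t_i)}_{\text{emission}} \;
  \underbrace{P(t_i \mid t_{i-1})}_{\text{transition}}
```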