Text pre-processing - Week 2 Flashcards

1
Q

Text pre-processing

A

Document level preparation
- Document conversion
- Language / Domain identification

Tokenisation
- Case Folding

Basic lexical pre-processing
- Lemmatisation
- Stemming
- Spelling corrections
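
A minimal sketch of the tokenisation and case-folding steps listed above, assuming a simple regex tokeniser. The function names `tokenise` and `preprocess` are illustrative and not from the course material.

```python
import re

def tokenise(text):
    # \w+ matches runs of letters, digits and underscores, so punctuation is dropped.
    return re.findall(r"\w+", text)

def preprocess(text):
    # Tokenise first, then fold every token to lower case.
    return [tok.lower() for tok in tokenise(text)]

tokens = preprocess("Parsnips are automated, apparently.")
print(tokens)
```

Lemmatisation, stemming and spelling correction would then run over the resulting token list.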

2
Q

Case folding

A

Convert all text to the same case (usually lower case):
Parsnips -> parsnips
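
As a rough illustration in Python, case folding can be done with the built-in `str.lower` or the more aggressive `str.casefold`. The helper name `fold_case` is an assumption made for this sketch.

```python
def fold_case(tokens):
    # casefold() is like lower() but also normalises some special characters.
    return [tok.casefold() for tok in tokens]

folded = fold_case(["Parsnips", "PARSNIPS", "parsnips"])
print(folded)

# The two methods differ for some non-English characters, e.g. German sharp s.
print("Straße".lower())
print("Straße".casefold())
```

For plain English text the two methods give the same result, so either is usually fine.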

3
Q

Lemmatisation of parsnips

A

parsnip
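
A toy sketch of lemmatisation as a dictionary lookup. The `LEMMAS` table and the `lemmatise` function are hypothetical; a real lemmatiser relies on a full lexicon and usually on part-of-speech information.

```python
# Hypothetical, hand-written lemma table used only for illustration.
LEMMAS = {
    "parsnips": "parsnip",
    "mice": "mouse",
    "ran": "run",
}

def lemmatise(token):
    # Fold case first, then look up the dictionary form.
    # Unknown words are returned unchanged.
    token = token.casefold()
    return LEMMAS.get(token, token)

print(lemmatise("Parsnips"))
print(lemmatise("mice"))
```

Unlike stemming, the result is always a real dictionary word, which is why irregular forms such as "mice" need a lookup rather than suffix stripping.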

4
Q

Stemming of automated

A

automat (strips the -ed suffix; the exact stem depends on the stemming algorithm)
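
A minimal sketch of stemming by naive suffix stripping, which reproduces the answer on this card. The suffix list and the `min_stem` threshold are assumptions chosen for illustration; this is not the Porter algorithm or any specific library stemmer, and those may return a different stem.

```python
# Illustrative suffix list, checked in order; not a standard rule set.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(token, min_stem=3):
    token = token.casefold()
    for suffix in SUFFIXES:
        # Only strip if enough of the word is left to serve as a stem.
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token

print(stem("automated"))
print(stem("parsnips"))
```

The output need not be a real word: "automat" is a stem, not a dictionary entry, which is the main difference from lemmatisation.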
