Text classification 3: Learning word embeddings Flashcards
(3 cards)
1
Q
What are count-based embedding matrices? What are cons?
A
- Each row represents a word
- Each column represents some context in which a word can occur
- Each entry (i, j) is the strength of association between the i-th word and the j-th context
Cons:
- data sparsity: many entries are unreliable because the corresponding word–context pairs were rarely or never observed
- words have very high dimension (many contexts, possibly thousands or millions). Could tackle this with dimensionality reduction (e.g., SVD)
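The count-based matrix above can be sketched in a few lines. This is a minimal illustration, assuming a symmetric window of neighboring words as the "context" (the toy corpus and window size are hypothetical, not from the card):

```python
from collections import Counter

# Hypothetical toy corpus; each neighboring word within +/-window is a context.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]
window = 2

# Count word-context co-occurrences within the window.
counts = Counter()
for sent in corpus:
    for i, word in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[(word, sent[j])] += 1

# Dense matrix: rows = words, columns = contexts, entries = co-occurrence counts.
vocab = sorted({w for s in corpus for w in s})
matrix = [[counts[(w, c)] for c in vocab] for w in vocab]
```

Note the sparsity in practice: with a real vocabulary, most (word, context) pairs never co-occur, so most entries are zero, which is exactly the con the card mentions.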
2
Q
What are the dimensions of each parameter?
A
3
Q
A