Text classification 3: Learning word embeddings Flashcards

(3 cards)

1
Q

What are count-based embedding matrices? What are cons?

A
  • Each row represents a word
  • Each column represents some context in which a word can occur
  • Each entry is the strength of association between the i-th word and the j-th context

Cons:
- Data sparsity: some entries in the matrix may be unreliable because we did not observe enough data
- Word vectors have very high dimension (one per context, possibly thousands or millions); this could be tackled with dimensionality reduction
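The matrix described above can be sketched in a few lines. This is a minimal illustration, not a standard library routine: the toy corpus and window size are assumptions, and contexts are taken to be neighboring words within a fixed window.

```python
from collections import Counter

def build_count_matrix(sentences, window=2):
    """Build a word-by-context count matrix.

    Rows are words, columns are context words, and each entry counts
    how often the row word co-occurs with the column word within
    `window` positions. (Hypothetical helper for illustration.)
    """
    vocab = sorted({w for s in sentences for w in s})
    idx = {w: i for i, w in enumerate(vocab)}
    counts = Counter()
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(idx[w], idx[s[j]])] += 1
    # Dense list-of-lists for clarity; a real corpus would need a
    # sparse representation (data sparsity is exactly the con above).
    m = [[0] * len(vocab) for _ in vocab]
    for (r, c), n in counts.items():
        m[r][c] = n
    return vocab, m

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab, m = build_count_matrix(sentences)
# "the" co-occurs with "sat" in both sentences
```

In practice the columns number in the thousands or millions, which is why a dimensionality-reduction step (e.g. truncated SVD over this matrix) is typically applied to obtain dense, low-dimensional embeddings.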

2
Q

What are the dimensions of each parameter?

A
3
Q
A