06 - Collaborative Filtering and Matrix Factorization Flashcards

1
Q

What are the problems of Collaborative Filtering?

A
  • It needs many ratings for every user and every movie
  • It takes a very simple view of the data: only movies, users and ratings, with nothing behind them
2
Q

What criteria can we use to decide if a user likes a movie or not?

A
  • Users have certain preferences
  • Movies fulfil these preferences to a certain extent
  • Movies have general features such as genre, actors, director, budget, running time, etc.
3
Q

How could one calculate the user-movie-preference?

A

The user-movie-preference could be seen as a linear combination of the scores of the individual features
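As a minimal sketch of this linear combination: each feature score of the movie is multiplied by the user's weight for that feature, and the products are summed (a dot product). The three features and all numbers below are made up for illustration, and `predict_preference` is a hypothetical helper name.

```python
# Hypothetical example with three made-up features: action, romance, comedy.
user_preferences = [0.9, 0.1, 0.5]  # how much the user cares about each feature
movie_features = [0.8, 0.0, 0.6]    # how strongly the movie exhibits each feature


def predict_preference(preferences, features):
    """Weighted sum (dot product) of feature scores and preference weights."""
    return sum(p * f for p, f in zip(preferences, features))


# 0.9*0.8 + 0.1*0.0 + 0.5*0.6, i.e. roughly 1.02
score = predict_preference(user_preferences, movie_features)
```

A higher score means the movie matches the user's preferences more closely.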

4
Q

In matrix factorization for collaborative filtering, what do the movie features and user preferences ultimately turn out to be?

A
  • The features are not real, observable properties; they are hypothetical, so-called “latent features” or “learned features”
  • The number of these features is a hyperparameter
5
Q

Could the matrices of movie features or user preferences be created manually?

A
  • Movie features: possible in principle with a lot of effort, but hardly practical
  • User preferences: completely impossible
6
Q

How can you automatically create matrices with movie features and user preferences?

A
  • With (stochastic) gradient descent
  • Initialise both matrices with small random values
  • Repeatedly update each entry in both matrices with the gradient descent update rule until a stopping criterion (runtime limit, error threshold) is met
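The steps above can be sketched as follows. This assumes a plain squared-error loss without regularisation, so each entry moves by learning rate × prediction error × the matching entry of the other matrix, which is one common form of the update. The names `factorize`, `P` (user preferences) and `Q` (movie features), the hyperparameter values and the toy ratings are all assumptions for illustration.

```python
import random


def predict(P, Q, u, m):
    """Predicted rating: dot product of user row u and movie row m."""
    return sum(p * q for p, q in zip(P[u], Q[m]))


def mean_squared_error(ratings, P, Q):
    """Average squared error over the known ratings only."""
    return sum((r - predict(P, Q, u, m)) ** 2 for u, m, r in ratings) / len(ratings)


def factorize(ratings, n_users, n_movies, n_features=2,
              learning_rate=0.01, n_epochs=3000, seed=0):
    """ratings: list of (user, movie, rating) triples for the known entries."""
    rng = random.Random(seed)
    # Initialise both matrices with small random values.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_movies)]
    samples = list(ratings)
    for _ in range(n_epochs):  # stopping criterion here: a fixed number of passes
        rng.shuffle(samples)   # stochastic: visit one known rating at a time
        for u, m, r in samples:
            error = r - predict(P, Q, u, m)
            for k in range(n_features):
                p_uk, q_mk = P[u][k], Q[m][k]  # keep old values for both updates
                P[u][k] += learning_rate * error * q_mk
                Q[m][k] += learning_rate * error * p_uk
    return P, Q


# Toy data: 3 users, 3 movies, 6 of the 9 ratings known.
known_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
P, Q = factorize(known_ratings, n_users=3, n_movies=3)
```

After training, `predict(P, Q, u, m)` can also be called for a user/movie pair that has no rating yet, which is the actual recommendation step.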
7
Q

What is factorization?

A

Divide a complex product into smaller, and often simpler, factors

8
Q

What is Matrix Factorization?

A

A complex matrix is divided into two smaller, simpler matrices whose product reproduces (or approximates) the original matrix
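A small illustration with made-up numbers: a 3 × 3 matrix can be written as the product of a 3 × 2 and a 2 × 3 matrix. The variable names and the helper `matmul` are assumptions for this sketch.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


user_factors = [[1, 0], [0, 1], [0.5, 0.5]]  # 3 users x 2 latent features
movie_factors = [[5, 3, 1], [1, 2, 4]]       # 2 latent features x 3 movies

# Multiplying the two small matrices gives back the full 3 x 3 matrix:
# [[5, 3, 1], [1, 2, 4], [3.0, 2.5, 2.5]]
full_matrix = matmul(user_factors, movie_factors)
```

Here the third user's row is simply the average of the first two, which is why two latent features are enough to describe all three rows.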

9
Q

What is an advantage of factorized matrices?

A

They need less memory than the full matrix, provided the number of latent features is small
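A rough count makes the saving concrete. The sizes below are invented, and the comparison assumes the full matrix is stored densely (one value per user/movie pair).

```python
def entries_full(n_users, n_movies):
    """Number of values in the dense user x movie matrix."""
    return n_users * n_movies


def entries_factorized(n_users, n_movies, n_features):
    """Number of values in the two factor matrices together."""
    return n_features * (n_users + n_movies)


# Hypothetical sizes: 100,000 users, 10,000 movies, 50 latent features.
full = entries_full(100_000, 10_000)                 # 1,000,000,000 values
factorized = entries_factorized(100_000, 10_000, 50)  # 5,500,000 values
```

In this made-up setting the factorized form needs fewer than 1% of the values of the dense matrix.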

10
Q

How do you handle missing entries in matrix factorization for collaborative filtering?

A
  • One possibility is to fill them with zeros or -1, but then the algorithm would waste a lot of time learning to predict these placeholder values
  • Therefore: Optimise only for known ratings
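One way to optimise only for known ratings is to skip the missing entries when computing the error. The sketch below assumes missing ratings are marked with `None`; the function name and the numbers are made up.

```python
def error_on_known(ratings, predictions):
    """Mean squared error over entries that actually have a rating."""
    total, count = 0.0, 0
    for rating_row, prediction_row in zip(ratings, predictions):
        for rating, prediction in zip(rating_row, prediction_row):
            if rating is None:  # missing entry: contributes nothing to the error
                continue
            total += (rating - prediction) ** 2
            count += 1
    return total / count if count else 0.0


ratings = [[5, None],
           [None, 1]]
predictions = [[4.0, 3.0],
               [2.0, 1.0]]

# Only (5 vs 4.0) and (1 vs 1.0) are compared: (1 + 0) / 2 = 0.5
loss = error_on_known(ratings, predictions)
```

The predictions for the two missing entries (3.0 and 2.0) are never penalised, so the model is free to fill them in with whatever fits the known ratings best.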
11
Q

What are the options for evaluating whether the algorithm is good and which of them is the better choice?

A
  • Option 1: Remove complete movies and their ratings from the matrix and put them into a separate test set
  • Option 2: Assign random entries from the matrix to the test set
  • Option 2 is the better one. With option 1 you run into the cold start problem: how can you make a prediction for a movie about which there was no information during training?
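Option 2 can be sketched as a random split over the individual known ratings. The function name, the 20% test fraction and the toy data are assumptions for illustration.

```python
import random


def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Randomly assign individual (user, movie, rating) entries to train or test."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


# Ten made-up ratings: user u rated movie u % 3 with 1 + u % 5 stars.
all_ratings = [(u, u % 3, 1 + u % 5) for u in range(10)]
train, test = split_ratings(all_ratings)
```

With very few ratings, a movie can still end up only in the test set by chance, so it may be worth checking that every test movie also appears in the training set.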