Lecture 5 Flashcards

1
Q

5.1 What is a recommender system?

A

An information filtering system that seeks to predict the ‘rating’ or ‘preference’ that a user would give to an item.

2
Q

5.2 Why is missing data an important issue for recommender systems?

A

The more data that is missing, the harder it gets to make an accurate prediction or recommendation.

3
Q

5.3 What is collaborative filtering?

A

Make predictions about a user’s missing data according to the behaviour of many other users:
– Look at the users' collective behaviour
– Look at the active user's history
– Combine the two

4
Q

5.4 What are user-based methods for collaborative filtering?

A

User Based:
• Achieve good quality in practice
• The more processing we push offline, the better the method scales
• However:
– User preferences are dynamic, so offline-calculated information needs frequent updates
– No recommendation for new users

5
Q

5.5 What are Method 1 and Method 2 for measuring user-user similarity and what are their relative advantages/disadvantages?

A

Method 1:
• Fill User1's missing values with the mean of User1's known values
• Fill User2's missing values with the mean of User2's known values
• Compute the squared Euclidean distance between the resulting vectors

Method 2:
• Compute the squared Euclidean distance between the vectors, summing only over pairs without missing values
• Scale the result up according to the proportion of pairs that had a missing value:
• Distance = [No. of pairs / (No. of pairs - No. of pairs with missing values)] x [squared Euclidean distance over pairs with both values]
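
As a rough illustration of the two methods, here is a minimal Python sketch. It assumes each user is a list of ratings with None marking a missing value; the function names and the sample ratings are made up for this example and are not from the lecture.

```python
def mean_fill(ratings):
    """Replace None entries with the mean of the user's known ratings."""
    known = [r for r in ratings if r is not None]
    mean = sum(known) / len(known)  # assumes at least one known rating
    return [mean if r is None else r for r in ratings]


def distance_method1(u1, u2):
    """Method 1: mean-fill each user, then take the squared Euclidean distance."""
    a, b = mean_fill(u1), mean_fill(u2)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distance_method2(u1, u2):
    """Method 2: sum over complete pairs only, then scale up for skipped pairs."""
    pairs = list(zip(u1, u2))
    complete = [(x, y) for x, y in pairs if x is not None and y is not None]
    if not complete:
        raise ValueError("no item rated by both users")
    partial = sum((x - y) ** 2 for x, y in complete)
    # No. of pairs / (No. of pairs - No. of pairs with missing values)
    return len(pairs) / len(complete) * partial


# Hypothetical ratings for four items; None = not rated.
user1 = [5, None, 3, 1]
user2 = [4, 2, None, 3]

print(distance_method1(user1, user2))
print(distance_method2(user1, user2))
```

On this sample the two methods give different distances, because Method 1 invents values for the missing cells while Method 2 extrapolates from the cells both users rated.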

6
Q

5.6 When performing user-user similarity, how do you select neighbours and make a prediction of the missing item?

A

• At runtime: Prediction of rating is the (weighted) average of the values from the top-k similar users

• This can be made more efficient by computing clusters of users offline
– At runtime find nearest cluster and use the centre of the cluster as the rating prediction
– Faster (more scalable) but a little less accurate
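
The runtime step (top-k neighbours, weighted average) might look like the sketch below. The weighting 1 / (1 + distance) is an assumption chosen for illustration, since the card does not say how the weights are derived, and the neighbour data is invented.

```python
def predict_rating(neighbours, k):
    """Predict a rating from the k most similar users.

    neighbours: list of (distance, rating) pairs, one per user who has
    rated the item; a smaller distance means a more similar user.
    """
    if not neighbours or k < 1:
        raise ValueError("need at least one neighbour and k >= 1")
    top_k = sorted(neighbours, key=lambda pair: pair[0])[:k]
    # Assumed weighting: closer users count more.
    weights = [1.0 / (1.0 + d) for d, _ in top_k]
    weighted_sum = sum(w * r for w, (_, r) in zip(weights, top_k))
    return weighted_sum / sum(weights)


# Hypothetical (distance to active user, rating of the item) pairs.
neighbours = [(0.0, 4), (1.0, 2), (3.0, 5), (9.0, 1)]
print(predict_rating(neighbours, k=2))
```

Setting every weight to 1 turns this into the plain (unweighted) average of the top-k ratings.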

7
Q

5.7 How do you model a missing values problem as a matrix factorisation problem, how can it be used to impute missing values and how do you measure the quality of the resulting imputation?

A

Method: Treat the User-Item Rating table R as a matrix and factorise it.
Given a matrix R, we can find matrices U and V such that when U and V are multiplied together the resulting matrix is approximately equal to R.

Results: The product of the two factors U and V has no missing values. We can use it to predict our missing entries.

Error: We can compute the error (squared distance between R and UV). The smaller it is, the better the fit of the factorisation.
If there are missing values in R, ignore these when computing the error.
The prediction error varies across the cells, but taking all missing cells as a whole, the method aims to make predictions with low average error.
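
One possible way to find U and V is gradient descent on the squared error over the known cells only. The sketch below assumes that approach; the lecture may use a different algorithm. The rank, learning rate, step count, and the sample table are all illustrative choices, and None marks a missing rating.

```python
import random


def predict(U, V, i, j):
    """Entry (i, j) of the product UV."""
    return sum(U[i][f] * V[f][j] for f in range(len(V)))


def masked_error(R, U, V):
    """Squared distance between R and UV, ignoring missing cells of R."""
    return sum((R[i][j] - predict(U, V, i, j)) ** 2
               for i in range(len(R)) for j in range(len(R[0]))
               if R[i][j] is not None)


def factorise(R, rank=2, steps=2000, lr=0.01, seed=0):
    """Fit U (users x rank) and V (rank x items) to the known cells of R."""
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                if R[i][j] is None:
                    continue  # missing cells do not contribute to the error
                err = R[i][j] - predict(U, V, i, j)
                for f in range(rank):
                    u, v = U[i][f], V[f][j]
                    U[i][f] += lr * 2 * err * v
                    V[f][j] += lr * 2 * err * u
    return U, V


# Hypothetical 5-user x 4-item rating table.
R = [[5, 3, None, 1],
     [4, None, None, 1],
     [1, 1, None, 5],
     [1, None, 5, 4],
     [None, 1, 5, 4]]

U, V = factorise(R)
print("error on known cells:", masked_error(R, U, V))
print("imputed value for user 0, item 2:", predict(U, V, 0, 2))
```

Because the error can only be measured on known cells, a common check is to hide some known ratings before fitting and compare the predictions against them afterwards.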

8
Q

5.4 What are item-based methods for collaborative filtering?

A
Item Based:
• Search for similarities among items
• All computations can be done offline
• Item-Item similarity is more stable than user-user similarity
– No need for frequent updates
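
The offline item-item step could be sketched as below. The card does not name a similarity measure, so cosine similarity over users who rated both items is an assumption made for this example, as are the function names and the sample table (None = not rated).

```python
from math import sqrt


def item_similarity(R, a, b):
    """Cosine similarity between items a and b over users who rated both."""
    pairs = [(row[a], row[b]) for row in R
             if row[a] is not None and row[b] is not None]
    if not pairs:
        return 0.0  # no user rated both items
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_item_table(R):
    """Precompute the similarity for every ordered pair of distinct items."""
    m = len(R[0])
    return {(a, b): item_similarity(R, a, b)
            for a in range(m) for b in range(m) if a != b}


# Hypothetical 3-user x 3-item rating table.
R = [[5, 4, None],
     [3, 3, 1],
     [None, 2, 5]]

table = build_item_table(R)  # done offline, looked up at runtime
print(table[(0, 1)])
```

At runtime the table is only read, which is why the item-based approach can keep its heavy computation offline.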
9
Q

5.4 What are matrix-based methods for collaborative filtering?

A

Matrix based:
• Treat the User-Item Rating table R as a matrix
• Use matrix factorisation of this Rating Table
• The prediction error varies across the cells (a prediction may be below the real value in one cell and above it in another), but taking all missing cells as a whole, the method aims to make predictions with low average error
