Learning from User-generated Data Flashcards
(80 cards)
A recommender system creates a list of recommendations based on a user-defined query that expresses the user’s information need. (True/False)
False
A recommender system is an information filtering system that provides a personalized perspective on the available item catalog based on user actions or preferences.
An information retrieval system creates a list of documents based on a user-defined query that expresses the user’s information need. (True/False)
True
This is the fundamental function of an information retrieval system.
Name the main goals pursued by the course ‘Learning from User-generated Data’ regarding recommender systems.
The main goals are:
* To illustrate approaches to learning from user-generated data
* To provide a sense of how recommender systems are used in real-world applications
What are the typical inputs and outputs of recommender systems? Provide examples for each.
Inputs:
* User-item interactions
* User-item ratings
* Personal and item data
Outputs:
* Predicted ratings
* Filtered lists
* Recommendations for the ‘next item’
Explain the difference between explicit and implicit user feedback and give an example for each.
Explicit Feedback:
* Direct indication of preference (e.g., ratings)
Implicit Feedback:
* Inferred preferences from interactions (e.g., frequency of consumption)
What is the main assumption on which Collaborative Filtering (CF) is based?
The main assumption is that users who had a similar taste in the past will have a similar taste in the future.
Describe what an interaction matrix is and why it is a key concept in many recommender systems.
An interaction matrix represents users (rows) and items (columns), with entries showing interactions or ratings. It is key because it forms the basis for many collaborative filtering algorithms.
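The interaction matrix can be sketched in a few lines; the ratings below are illustrative toy data, with 0 marking an unobserved entry:

```python
# Toy user-item rating matrix: rows = users, columns = items.
# 0 marks a missing (unobserved) interaction.
R = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
]

n_users, n_items = len(R), len(R[0])
observed = sum(1 for row in R for r in row if r != 0)
sparsity = 1 - observed / (n_users * n_items)  # fraction of missing entries
```

In real systems this matrix is extremely sparse, which is why the missing entries matter so much for the factorization methods discussed below.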
Explain the difference between Memory-based and Model-based Collaborative Filtering.
Memory-based CF:
* Stores all ratings directly
* Predictions are computed ad hoc at recommendation time
Model-based CF:
* Factorizes the user-item matrix
* Predictions are based on a learned model
Item-based CF scales better for several reasons: (1) item-item similarities can be calculated offline and updated from time to time, and (2) only items that the active user has rated are considered when identifying the nearest neighbors. (True/False)
True
Which similarity measures are commonly used in Collaborative Filtering, and why is Adjusted Cosine Similarity relevant for item-based CF?
Common measures:
* Pearson’s correlation coefficient
* Cosine similarity
Adjusted cosine similarity accounts for user-specific rating bias in item-based CF.
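A minimal sketch of adjusted cosine similarity, assuming toy ratings: each co-rating user's mean is subtracted before the cosine, which removes per-user rating bias.

```python
import math

ratings = {            # user -> {item: rating}, illustrative data
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 3, "i2": 1, "i3": 2},
    "u3": {"i1": 4, "i2": 2},
}

def user_mean(u):
    vals = ratings[u].values()
    return sum(vals) / len(vals)

def adjusted_cosine(a, b):
    # Only users who rated both items contribute.
    co = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    num = sum((ratings[u][a] - user_mean(u)) * (ratings[u][b] - user_mean(u))
              for u in co)
    den_a = math.sqrt(sum((ratings[u][a] - user_mean(u)) ** 2 for u in co))
    den_b = math.sqrt(sum((ratings[u][b] - user_mean(u)) ** 2 for u in co))
    return num / (den_a * den_b) if den_a and den_b else 0.0

sim = adjusted_cosine("i1", "i2")  # -1.0 for this toy data
```

Here every user rates i1 above and i2 below their own mean, so after centering the items are perfectly anti-correlated.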
A major problem with using Truncated SVD for matrix factorization is missing values in the user-item rating matrix. (True/False)
True
Model-based CF methods learn exclusively from implicit user feedback, which exacerbates cold-start problems. (True/False)
False
When Stochastic Gradient Descent (SGD) is used to create an MF model, a regularization term is used to prevent overfitting. A common choice is Tikhonov regularization. (True/False)
True
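A minimal sketch of SGD training for an MF model with an L2 (Tikhonov) penalty; all hyperparameters and observed ratings below are illustrative, not the course's reference settings.

```python
import random

random.seed(0)
K, lr, reg = 2, 0.05, 0.02          # latent dims, learning rate, L2 weight
observed = {("u1", "i1"): 5.0, ("u1", "i2"): 3.0, ("u2", "i1"): 4.0}

users = {u for u, _ in observed}
items = {i for _, i in observed}
P = {u: [random.uniform(-0.1, 0.1) for _ in range(K)] for u in users}
Q = {i: [random.uniform(-0.1, 0.1) for _ in range(K)] for i in items}

def predict(u, i):
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def sse():
    return sum((r - predict(u, i)) ** 2 for (u, i), r in observed.items())

err_before = sse()
for _ in range(500):
    for (u, i), r in observed.items():
        e = r - predict(u, i)
        for k in range(K):
            pu, qi = P[u][k], Q[i][k]
            P[u][k] += lr * (e * qi - reg * pu)   # gradient step + L2 shrinkage
            Q[i][k] += lr * (e * pu - reg * qi)
err_after = sse()
```

The `- reg * pu` / `- reg * qi` terms are the Tikhonov regularization: they shrink the factor magnitudes and keep the model from fitting the observed ratings exactly.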
What are latent factors in matrix factorization, and how are they used to predict ratings?
Latent factors are dimensions derived from rating patterns representing users and items. Predicted ratings are calculated as the inner product of user and item latent vectors.
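The prediction step is just an inner product; the latent vectors below are illustrative, not learned from data:

```python
# Predicted rating = inner product of user and item latent vectors.
p_u = [1.2, 0.4, -0.3]   # user latent factors (illustrative)
q_i = [0.9, 1.1, 0.2]    # item latent factors (illustrative)

r_hat = sum(p * q for p, q in zip(p_u, q_i))  # 1.08 + 0.44 - 0.06 = 1.46
```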
Explain the concept of Singular Value Decomposition (SVD) in the context of matrix factorization for recommender systems.
SVD factorizes a matrix into three smaller matrices: user factors (U), singular values (Σ), and item factors (V^T). It reconstructs the original matrix and enables dimensionality reduction.
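A sketch with NumPy on a fully observed toy matrix (in practice, missing entries must be handled before applying plain SVD):

```python
import numpy as np

R = np.array([[5., 3., 1.],
              [4., 2., 1.],
              [1., 1., 5.]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)  # R = U @ diag(s) @ Vt
k = 2                                             # keep top-k singular values
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]       # rank-k approximation of R
```

Truncating to the top-k singular values gives the dimensionality reduction: `R_k` is the best rank-k approximation of `R` in the least-squares sense.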
Which evaluation metric corresponds to the precision at the cut-off where precision and recall are equal?
R-precision
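R-precision is precision at cut-off R, where R is the number of relevant items; at that cut-off precision and recall coincide. A toy computation:

```python
ranked = ["a", "b", "c", "d", "e"]   # recommendation list, best first (toy)
relevant = {"b", "c", "e"}           # ground-truth relevant items (toy)

R = len(relevant)                    # cut-off = number of relevant items
hits = sum(1 for item in ranked[:R] if item in relevant)
r_precision = hits / R               # 2 of the top 3 are relevant -> 2/3
```

With |relevant| = R, hits/R is simultaneously the precision at rank R and the recall at rank R.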
Two recommender systems have created lists of 3 recommendations for the same user. Both lists achieve the same score for the Reciprocal Rank. (True/False)
True
Reciprocal Rank depends only on the position of the first relevant item, so two lists whose first relevant item appears at the same rank receive identical scores.
The metrics MRR (Mean Reciprocal Rank) and NDCG (Normalized Discounted Cumulative Gain) consider the position of relevant items in the recommendation list. (True/False)
True
Average Precision (AP) considers relevant items that are not in the recommendation list, thus implicitly incorporating recall. (True/False)
True
Compared to CG (Cumulative Gain), DCG (Discounted Cumulative Gain) weights the item gains using the position in the recommendation list. (True/False)
True
NDCG (Normalized Discounted Cumulative Gain) relates DCG to the ideal DCG to increase the interpretability of the DCG values. (True/False)
True
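The DCG/NDCG cards above can be sketched as follows, assuming binary relevance and the common log2(rank + 1) discount:

```python
import math

def dcg(rels):
    # Each gain is discounted by log2 of its (1-based) rank + 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

rels = [1, 0, 1, 0]                  # relevance along the ranked list (toy)
ideal = sorted(rels, reverse=True)   # best possible ordering of the same gains
ndcg = dcg(rels) / dcg(ideal)        # normalize by the ideal DCG
```

CG would simply sum the gains; the discount in `dcg` weights each gain by its position, and dividing by the ideal DCG maps the score into [0, 1].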
Rank correlation coefficients such as Kendall’s τ or Spearman’s ρ can be used to compare the rankings of 2 (or more) recommender algorithms and determine the extent of their agreement. (True/False)
True
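A minimal Kendall's τ for two tie-free rankings, computed from concordant and discordant item pairs (the rankings below come from two hypothetical recommenders):

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    # rank_a, rank_b: item -> position; no ties assumed.
    pairs = list(combinations(rank_a, 2))
    conc = sum(1 for x, y in pairs
               if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) > 0)
    # tau = (concordant - discordant) / total = (2*conc - total) / total
    return (2 * conc - len(pairs)) / len(pairs)

a = {"i1": 1, "i2": 2, "i3": 3}   # ranking by recommender A (toy)
b = {"i1": 1, "i2": 3, "i3": 2}   # ranking by recommender B (toy)
tau = kendall_tau(a, b)           # one of three pairs is discordant -> 1/3
```

τ = 1 means the two recommenders rank the items identically, τ = -1 means fully reversed rankings.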
Into what range are normalized evaluation scores (such as NDCG) mapped by recommender systems?
0 to 1
This normalization allows for comparisons between different systems and facilitates the interpretation of results.
Can rank correlation coefficients like Kendall’s τ or Spearman’s ρ be used to compare recommender algorithms?
Yes
These coefficients measure the similarity or agreement between rankings of different algorithms.