Learning from User Generated Data Flashcards

Question

Can rank correlation coefficients like Kendall's τ or Spearman's ρ be used to compare recommender algorithms?

Answer 1

True ## Footnote These coefficients measure the similarity or agreement between rankings of different algorithms.

Answer 2

* Offline Evaluation: Based on a fixed dataset, easy to reproduce * Online Evaluation: Conducted with real users, high validity * User Studies: Involves surveys and observations ## Footnote Each scenario has its pros and cons regarding realism, effort, and reproducibility.

Answer 3

* Precision@k: Proportion of relevant items among the top-k recommendations * Recall@k: Proportion of relevant items recommended out of all available ## Footnote The choice of metric depends on specific use case and priorities.

Answer 4

* Diversity * Novelty * Coverage * Serendipity * Explainability ## Footnote These metrics evaluate aspects of recommender systems beyond prediction accuracy.

Answer 5

True ## Footnote Content features are more objective and less affected by popularity biases.

Answer 6

To reduce the dimensionality of the vector space ## Footnote These techniques reduce the number of unique terms in a vector space model (VSM).

Answer 7

True ## Footnote Each dimension corresponds to a unique term from the vocabulary.

Answer 8

Converts all text characters to lowercase ## Footnote This can introduce semantic ambiguities.

Answer 9

It is addressed by using cosine similarity ## Footnote Cosine similarity normalizes the vectors to their length.

Answer 10

True ## Footnote Zipf's Law describes the frequency of word occurrences in natural languages.

Answer 11

Binary TF formulation ## Footnote The logarithmic variant offers no significant advantage for rarely repeated words.

Answer 12

A text representation ignoring order, grammar, and syntax ## Footnote Advantages include simplicity and speed; disadvantages include ignoring semantic meanings.

Answer 13

* Noise Removal * Lowercasing ## Footnote These steps clean and standardize texts for processing.

Answer 14

False ## Footnote LSA uses Singular Value Decomposition (SVD), while LDA is a probabilistic model.

Answer 15

Singular Value Decomposition (SVD) ## Footnote This uncovers latent semantic structures.

Answer 16

True ## Footnote This assumption is fundamental to both models.

Answer 17

To describe documents by a set of topics rather than individual words ## Footnote This helps uncover latent semantic structures.

Answer 18

True ## Footnote It uses a bipartite graph of users and items with specific path requirements.

Answer 19

A node metric modeling diffusion behavior in networks ## Footnote It measures the number of paths originating from a node, penalizing longer paths.

Answer 20

True ## Footnote Edges represent interactions between users and items.

Answer 21

Paths that pass through the node of interest ## Footnote They assess a node's importance based on shortest paths.

Answer 22

False ## Footnote They are radial centrality measures, not directly integrated into matrix factorization.

Answer 23

They describe local relationships between two nodes ## Footnote Examples include Tie Strength and Edge Betweenness.

Answer 24

True ## Footnote Weights can be adjusted based on available data.

Answer 25

False ## Footnote It is an example of late-fusion aggregation.

Answer 26

A late-fusion approach ## Footnote In late-fusion, individual recommenders generate recommendations independently, and the aggregation occurs later.

Answer 27

False ## Footnote Monolithic designs typically employ early-fusion, combining data sources before applying the recommendation algorithm.

Answer 28

True ## Footnote Borda rank aggregation combines ranks from multiple recommenders, applying the method after individual lists are created.

Answer 29

Enriching real rating data with predictions from a classifier based on content features ## Footnote CBCF combines content-based and collaborative filtering to create a denser matrix of pseudo-ratings.

Answer 30

False ## Footnote Self-weighting quantifies confidence in the content-based prediction used for generating pseudo-ratings.

Answer 31

* To reflect different facets of the items or the domain * To achieve better results (e.g., improved accuracy) * To enable predictions in situations where a single system might not work ## Footnote Hybrid systems leverage strengths of different approaches to overcome individual limitations.

Answer 32

* Parallelized Design * Monolithic Design * Pipelined Design ## Footnote These categories classify when and how different recommendation approaches are combined.

Answer 33

True ## Footnote User context refers to dynamic aspects of the user's situation during item interaction.

Answer 34

False ## Footnote Item context includes additional data not directly extracted from primary media content, such as tags or album covers.

Answer 35

True ## Footnote This is an example of considering temporal context.

Answer 36

True ## Footnote Explicit context is directly collected, implicit derives from behaviors, and inferential uses existing data.

Answer 37

The purpose that the content creator had in mind when creating the item ## Footnote Item Purpose reflects the creator's intention or function of the item.

Answer 38

False ## Footnote User Background refers to static characteristics, while dynamic needs fall under User Intent or User Context.

Answer 39

False ## Footnote User Intent describes why the user consumes an item, not the creator's purpose.

Answer 40

* Passive User Awareness: Captures context but does not change behavior * Active User Awareness: Automatically integrates new context and adjusts recommendations ## Footnote The difference is in whether context changes trigger immediate adjustments.

Answer 41

True ## Footnote Valence indicates the quality of emotion, while arousal indicates intensity.

Answer 42

True ## Footnote This approach helps in cold-start scenarios by enabling similarity estimation with limited interactions.

Answer 43

True ## Footnote Emotions are specific, intense, and short-lived reactions to stimuli.

Answer 44

False ## Footnote The final score is typically a linear combination of responses.

Answer 45

True ## Footnote This method incorporates personality traits to adjust recommendation diversity.

Answer 46

True ## Footnote It models how frequently and recently used information is more easily retrieved.

Answer 47

* Cognition-inspired Recommender Systems * Personality-aware Recommender Systems * Affect-aware Recommender Systems ## Footnote PIRS utilizes psychological insights to refine recommendations.

Answer 48

It describes the decline in memory performance over time ## Footnote In recommender systems, it models how user interests lose relevance, suggesting a time-based decay weighting for interactions.

Answer 49

True ## Footnote These strategies address bias through various processing techniques.

Answer 50

True ## Footnote This is known as popularity bias or the 'rich-get-richer' effect.

Answer 51

False ## Footnote Individual fairness and group fairness are distinct concepts; one does not guarantee the other.

Answer 52

Popularity bias is known as the 'rich-get-richer' effect, where already popular content tends to be recommended more frequently, reinforcing the popularity distribution. ## Footnote This phenomenon can lead to a lack of diversity in recommendations.

Answer 53

False ## Footnote Individual fairness and group fairness are different concepts. Achieving group fairness does not guarantee individual fairness for every user.

Answer 54

42% ## Footnote The recommender system attempts to mitigate societal bias by striving for a 50% representation of female artists in recommendations.

Answer 55

True ## Footnote Post-processing strategies are techniques applied after the recommendation algorithm has generated its initial results.

Answer 56

* Societal Bias: Discrepancy between the ideal world and reality (e.g., equal representation of genders). * Statistical Bias: Discrepancy between reality and its representation in the system or model. ## Footnote Societal bias concerns external norms, while statistical bias relates to data representation and model building.

Answer 57

* Distributional Harm: Unjust denial of resources or advantages to a person or group. * Representational Harm: Misrepresentation or encoding of stereotypes in the system. ## Footnote These harms impact social and ethical dimensions beyond system performance.

Learning from User Generated Data Flashcards

(81 cards)