association comprehensive Flashcards
(20 cards)
Which of the following best describes Association Rules?
A) Predicting future sales trends
B) Identifying patterns of items that occur together in transactions
C) Grouping customers based on demographics
D) Forecasting time series data
B) Identifying patterns of items that occur together in transactions
Explanation: Association Rules focus on discovering relationships like “Customers who bought X also bought Y” by analyzing transaction data
Which of these is NOT a common name for Association Rules?
A) Affinity Analysis
B) Market Basket Analysis
C) Collaborative Filtering
D) Transaction Pattern Mining
C) Collaborative Filtering
Explanation: Collaborative filtering is a separate recommendation technique based on user preferences, not association rule mining
In an association rule, the consequent refers to:
A) The IF part
B) The THEN part
C) Both IF and THEN
D) The itemset with the highest frequency
B) The THEN part
Explanation: The consequent is the “THEN” part of an association rule, while the antecedent is the “IF” part
True or False:
The antecedent and consequent in an association rule can have items in common
False
Explanation: The antecedent and consequent must be disjoint—they cannot share items
Which metric measures how frequently an itemset appears in the dataset?
A) Confidence
B) Lift
C) Support
D) Correlation
C) Support
Explanation: Support indicates how often an itemset occurs within all transactions
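As a small illustration of support, the sketch below counts how often an itemset appears in a handful of made-up transactions. The data and the helper name `support` are assumptions for the example, not part of the flashcard set.

```python
# Toy transactions (made up for illustration); each one is a set of items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


bread_support = support({"bread"}, transactions)                   # 4 of 5 transactions
bread_butter_support = support({"bread", "butter"}, transactions)  # 3 of 5 transactions
print(bread_support, bread_butter_support)
```

Support is a property of the itemset alone, so {bread, butter} has the same support whichever item is later treated as the antecedent.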
A rule {Bread} ⇒ {Butter} has a confidence of 70%. What does this mean?
A) 70% of all transactions include bread and butter
B) 70% of transactions that include bread also include butter
C) Bread and butter are bought together by 70% of customers
D) Butter is 70% more likely to be purchased randomly
B) 70% of transactions that include bread also include butter
Explanation: Confidence measures how often the consequent appears when the antecedent is present
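A minimal sketch of the confidence calculation, assuming the same kind of made-up basket data: it divides the number of transactions containing both sides of the rule by the number containing the antecedent. The function name `confidence` is hypothetical.

```python
# Toy transactions (made up for illustration).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def confidence(antecedent, consequent, transactions):
    """Share of antecedent-containing transactions that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    return n_both / n_antecedent


bread_to_butter = confidence({"bread"}, {"butter"}, transactions)  # 3 / 4 = 0.75
butter_to_bread = confidence({"butter"}, {"bread"}, transactions)  # 3 / 3 = 1.0
print(bread_to_butter, butter_to_bread)
```

The two directions give different values, which shows that confidence depends on which side is the antecedent.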
If a rule has a lift = 1, what does this indicate?
A) Strong positive association
B) Strong negative association
C) No association (independence)
D) The rule is invalid
C) No association (independence)
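Explanation: A lift of 1 means the antecedent and consequent occur together exactly as often as would be expected if they were independent; values above 1 suggest a positive association and values below 1 a negative one.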
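The sketch below works through lift on two made-up datasets, one built so that the items are independent and one where they co-occur more than independence predicts. It assumes the usual definition, lift = support(A ∪ C) / (support(A) × support(C)), written with raw counts to avoid rounding; the function name `lift` and the data are illustrative.

```python
def lift(antecedent, consequent, transactions):
    """Observed co-occurrence divided by the co-occurrence expected under independence."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    return (n * n_both) / (n_a * n_c)


# "a" and "b" each appear in half the transactions and together in a quarter,
# which is exactly what independence predicts.
independent = [{"a", "b"}, {"a"}, {"b"}, set()]

# Bread and butter co-occur more often than independence would predict.
basket = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

lift_independent = lift({"a"}, {"b"}, independent)       # 4 * 1 / (2 * 2) = 1.0
lift_bread_butter = lift({"bread"}, {"butter"}, basket)  # 5 * 3 / (4 * 3) = 1.25
print(lift_independent, lift_bread_butter)
```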
Which of the following is the correct formula for confidence?
A) Support(Antecedent ∪ Consequent) / Support(Antecedent)
B) Support(Consequent) / Support(Antecedent)
C) Lift × Support
D) Support(Antecedent) / Support(Consequent)
A) Support(Antecedent ∪ Consequent) / Support(Antecedent)
Explanation: Confidence is the number of transactions containing both the antecedent and the consequent, divided by the number of transactions containing the antecedent (with or without the consequent).
Why is lift an important measure when evaluating association rules?
A) It tells how frequent the itemset is
B) It shows how much better the rule is compared to random chance
C) It calculates the probability of the antecedent
D) It reduces computational complexity
B) It shows how much better the rule is compared to random chance
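Explanation: Lift compares the rule's confidence with the overall support of the consequent, showing how much more often the items occur together than they would if they were unrelated.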
What is the main purpose of setting a minimum support threshold in the Apriori algorithm?
A) To improve accuracy of rules
B) To reduce the number of itemsets considered
C) To maximize lift values
D) To increase confidence automatically
B) To reduce the number of itemsets considered
Explanation: The Apriori algorithm prunes itemsets that don’t meet minimum support to limit computations
The Apriori algorithm starts by generating:
A) All possible item combinations
B) Frequent one-itemsets
C) Rules with maximum confidence
D) The itemset with the highest lift
B) Frequent one-itemsets
Explanation: The algorithm begins by identifying frequent itemsets containing a single item before expanding to larger sets
True or False:
If a two-itemset does not meet the minimum support, any larger itemset containing it will also fail to meet the minimum support.
True
Explanation: This is the core of the Apriori principle—if a subset is infrequent, its supersets must also be infrequent
Each iteration of the Apriori algorithm requires:
A) A full scan of the transaction database
B) Only partial data sampling
C) Calculation of lift values
D) Running regression analysis
A) A full scan of the transaction database
Explanation: Each step in Apriori involves scanning the database to count itemset frequencies
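The Apriori cards above can be tied together in a short sketch: start from frequent one-itemsets, build larger candidates only from frequent smaller ones, and make one pass over the transactions per level to count support. This is a simplified illustration on made-up data, not an optimized or reference implementation; the function name `apriori` and the 0.5 threshold are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def count_support(candidates):
        # One full pass over the transactions for this level's candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    # Level 1: frequent one-itemsets.
    items = {item for t in transactions for item in t}
    current = count_support({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: a candidate with any infrequent (k-1)-subset cannot be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = count_support(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Toy transactions (made up for illustration).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]
frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this data, milk, eggs, and jam fall below the threshold at level 1, so no pair containing them is ever counted.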
What is a major strength of the Apriori algorithm?
A) It ignores infrequent items
B) It reduces computation by pruning itemsets based on support
C) It guarantees high confidence rules
D) It works without scanning the database
B) It reduces computation by pruning itemsets based on support
Explanation: Apriori improves efficiency by eliminating infrequent itemsets early, so their supersets never have to be counted.
What is a key risk when generating a large number of association rules?
A) Missing high-lift rules
B) Discovering patterns caused by random chance
C) Reducing support values
D) Overestimating confidence
B) Discovering patterns caused by random chance
Explanation: Generating too many rules increases the chance of finding meaningless associations
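To see how chance alone can produce impressive-looking rules, the sketch below generates purely random baskets, where no real association exists by construction, and computes lift for every pair of items. All names and parameters (20 items, 50 transactions, inclusion probability 0.3, the 1.3 cutoff) are arbitrary assumptions for the illustration.

```python
import random
from itertools import combinations

random.seed(0)  # fixed seed so the run is repeatable

n_items, n_transactions, p = 20, 50, 0.3
items = [f"item{i}" for i in range(n_items)]

# Each item lands in each basket independently, so any "pattern" is noise.
transactions = [
    {item for item in items if random.random() < p}
    for _ in range(n_transactions)
]


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


lifts = {}
for a, b in combinations(items, 2):
    n_a, n_b, n_ab = count({a}), count({b}), count({a, b})
    if n_a and n_b:  # skip items that never occur to avoid dividing by zero
        lifts[(a, b)] = n_transactions * n_ab / (n_a * n_b)

max_lift = max(lifts.values())
high = [pair for pair, value in lifts.items() if value > 1.3]
print(f"pairs checked: {len(lifts)}, largest lift: {max_lift:.2f}, pairs with lift > 1.3: {len(high)}")
```

With only 50 transactions and up to 190 pairs to check, some pairs will show a lift above 1 even though the items are unrelated.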
How can you minimize the risk of identifying meaningless rules?
A) Use very low support and confidence thresholds
B) Only consider rules with lift = 1
C) Focus on rules derived from large datasets
D) Generate as many rules as possible
C) Focus on rules derived from large datasets
Explanation: Larger datasets reduce the likelihood of random patterns appearing significant
Which of the following best describes the role of confidence in rule evaluation?
A) It measures how often the rule applies across all transactions
B) It measures how likely the consequent occurs when the antecedent occurs
C) It indicates random chance
D) It identifies the most frequent item
B) It measures how likely the consequent occurs when the antecedent occurs
Explanation: Confidence reflects conditional probability based on the antecedent
Collaborative Filtering is primarily used to:
A) Identify frequent itemsets
B) Recommend items based on similar users’ preferences
C) Calculate support and confidence
D) Eliminate redundant association rules
B) Recommend items based on similar users’ preferences
Explanation: Collaborative filtering suggests items based on behaviors of similar users
Which statement about user-based collaborative filtering is TRUE?
A) It compares products to find associations
B) It finds users with similar preferences to recommend items
C) It only works for transactional data
D) It is less computationally expensive than item-based filtering
B) It finds users with similar preferences to recommend items
Explanation: User-based collaborative filtering focuses on identifying “people like you”
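A minimal sketch of the user-based idea: find the user whose ratings are most similar to the target user's, then suggest items that neighbor rated and the target has not seen. The ratings, the names, and the choice of cosine similarity over co-rated items are all assumptions for illustration; real systems usually combine several neighbors and adjust for rating scale.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ana":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 5, "B": 5, "D": 4},
    "cara": {"A": 1, "C": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Return the most similar other user and that user's items unseen by `user`."""
    others = [name for name in ratings if name != user]
    nearest = max(others, key=lambda name: cosine(ratings[user], ratings[name]))
    unseen = {i: r for i, r in ratings[nearest].items() if i not in ratings[user]}
    return nearest, sorted(unseen, key=unseen.get, reverse=True)


nearest, suggestions = recommend("ana", ratings)
print(nearest, suggestions)
```

Here ana and ben agree closely on items A and B, so ben becomes the neighbor and ben's item D is suggested to ana.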
True or False:
Item-based collaborative filtering recommends products by analyzing similarities between items rather than users.
True
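Explanation: Item-based collaborative filtering measures similarity between items (typically from how users rated or purchased them) and recommends items similar to those a user already liked, rather than searching for similar users.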
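For contrast with the user-based approach, this sketch compares items instead of users: each item is represented by the ratings it received, and the item most similar to a given one is returned. The data, the helper names, and the use of cosine similarity over shared raters are assumptions for illustration only.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ana":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 5, "B": 5, "D": 4},
    "cara": {"A": 1, "C": 5, "D": 2},
    "dev":  {"A": 4, "B": 5},
}


def item_vector(item):
    """Ratings received by one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(x, y):
    """Cosine similarity over the users who rated both items."""
    shared = set(x) & set(y)
    if not shared:
        return 0.0
    dot = sum(x[k] * y[k] for k in shared)
    norm_x = sqrt(sum(x[k] ** 2 for k in shared))
    norm_y = sqrt(sum(y[k] ** 2 for k in shared))
    return dot / (norm_x * norm_y)


def most_similar(item):
    """Return the other item whose rating pattern is closest to `item`."""
    all_items = {i for r in ratings.values() for i in r}
    others = sorted(all_items - {item})
    return max(others, key=lambda other: cosine(item_vector(item), item_vector(other)))


closest_to_a = most_similar("A")
print(closest_to_a)
```

Someone who liked item A could then be shown the returned item, without ever identifying a similar user.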