Apriori Principle Flashcards
Which type of learning does the Apriori algorithm fall under?
A) Supervised Learning
B) Unsupervised Learning
C) Reinforcement Learning
D) Semi-Supervised Learning
B – Apriori is an unsupervised learning technique used for pattern discovery.
The Apriori algorithm is mainly used for:
A) Classification
B) Regression
C) Clustering
D) Market Basket Analysis
D – Its most common use is in market basket analysis.
Which of the following statements best describes the Apriori principle?
A) All subsets of a frequent itemset must also be frequent
B) All supersets of an infrequent itemset must also be infrequent
C) An itemset is frequent if it appears more than once
D) A frequent itemset must always have three or more items
A – Apriori principle: if an itemset is frequent, all of its subsets must be frequent.
What is the purpose of the support metric in Apriori?
A) To measure the confidence of a rule
B) To measure the frequency of an itemset in the dataset
C) To measure the strength of correlation
D) To identify the most expensive items in a basket
B – Support measures how often a particular itemset appears in the dataset.
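The support computation described on this card can be sketched in a few lines of Python; the basket contents below are hypothetical, chosen only to illustrate the calculation:

```python
# Support of an itemset = fraction of transactions containing it.
# Hypothetical basket data:
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

print(support({"milk", "bread"}, transactions))  # 2 of 4 baskets -> 0.5
```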
If an itemset {A, B, C} is frequent, which of the following must also be frequent?
A) {A, B, C, D}
B) {A, D}
C) {B, C}
D) {B, D, E}
C – All subsets of a frequent itemset must also be frequent (Apriori principle).
Which metric is used to evaluate the usefulness of an association rule beyond chance?
A) Support
B) Confidence
C) Lift
D) Leverage
C – Lift indicates the strength of a rule beyond random chance.
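Lift can be computed directly from supports, as lift(A → B) = support(A ∪ B) / (support(A) × support(B)). A small sketch with made-up baskets:

```python
# Lift(A -> B) = support(A ∪ B) / (support(A) * support(B)).
# Hypothetical baskets:
transactions = [
    {"milk", "cookies"},
    {"milk", "cookies"},
    {"milk"},
    {"bread"},
]

def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

lift = support({"milk", "cookies"}) / (support({"milk"}) * support({"cookies"}))
print(lift > 1)  # True: milk and cookies co-occur more often than chance predicts
```

A lift above 1 signals a positive association; exactly 1 means the two sides are independent.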
If the minimum support is set too high, what is likely to happen?
A) Too many rules are generated
B) All itemsets become frequent
C) Rare but interesting patterns may be missed
D) The algorithm will not terminate
C – High support thresholds can exclude rare but potentially useful patterns.
In Apriori, what happens at each new iteration (k)?
A) We prune the previous frequent k-itemsets
B) We combine (k-1)-itemsets to generate k-itemsets
C) We calculate accuracy of each rule
D) We increase the confidence threshold
B – At each step, frequent itemsets of size (k-1) are used to generate candidates of size k.
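The join-and-prune step described above can be sketched as follows; `generate_candidates` and the toy 2-itemsets are illustrative, not taken from any particular library:

```python
from itertools import combinations

def generate_candidates(freq_prev):
    """Join frequent (k-1)-itemsets into candidate k-itemsets, pruning any
    candidate that has an infrequent (k-1)-subset (the Apriori principle)."""
    prev = {frozenset(s) for s in freq_prev}
    k = len(next(iter(prev))) + 1
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            # keep unions of the right size whose (k-1)-subsets are all frequent
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates

frequent_2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}]
print(generate_candidates(frequent_2))  # only {A, B, C} survives the join
```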
Which of the following association rules is invalid, assuming {1, 2, 3, 4} is a frequent itemset?
A) 1 → 2, 3, 4
B) 1 → 3, 4
C) 1 → 1, 2, 3, 4
D) 2 → 3, 4
C – Invalid: an item cannot appear on both sides of a rule; the antecedent and consequent must be disjoint.
What is the primary limitation of the Apriori algorithm?
A) It doesn’t support rule generation
B) It requires labeled data
C) It is computationally expensive due to multiple scans of the dataset
D) It only works with numeric data
C – Apriori is slow due to multiple passes over the data and candidate generation.
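A minimal (deliberately naive) Apriori loop makes this cost visible: each itemset size triggers another full pass over the transactions. All names and data here are illustrative:

```python
def apriori(transactions, min_support):
    """Minimal Apriori sketch: one full pass over the data per itemset size,
    which is the main source of the algorithm's cost on large datasets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    frequent = {}
    current = {frozenset([item]) for t in transactions for item in t}
    scans = 0
    while current:
        scans += 1  # each level costs another pass over every transaction
        level = {}
        for cand in current:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        keys = list(level)
        # join step: unite pairs of survivors that differ in exactly one item
        current = {a | b for a in keys for b in keys if len(a | b) == len(a) + 1}
    return frequent, scans

baskets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}]
freq, scans = apriori(baskets, min_support=0.5)
print(scans)  # 3 passes: singletons, pairs, then the (rejected) triple
```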
What is the key principle behind the Apriori algorithm when generating frequent itemsets?
A) It considers all possible item combinations regardless of frequency
B) It uses previously found frequent itemsets to generate larger ones
C) It only analyzes single-item transactions
D) It ignores the concept of minimum support
Answer: B) It uses previously found frequent itemsets to generate larger ones
Explanation: The Apriori principle states that if an itemset is frequent, all of its subsets must also be frequent. The algorithm combines frequent (k-1)-itemsets to generate candidate k-itemsets.
In the Apriori algorithm, why are itemsets that do not meet the minimum support threshold eliminated early?
A) To reduce computational complexity
B) Because they have high confidence
C) Because they are likely to form stronger rules
D) To maximize lift values
Answer: A) To reduce computational complexity
Explanation: Apriori prunes itemsets that don’t meet the minimum support to avoid unnecessary computation, since supersets of infrequent itemsets cannot be frequent.
Which of the following best defines “support” in the context of association rules?
A) The ratio of antecedent occurrences to consequent occurrences
B) The number of transactions containing both antecedent and consequent
C) The probability that the consequent occurs given the antecedent
D) The efficiency of the rule compared to random chance
Answer: B) The number of transactions containing both antecedent and consequent
Explanation: Support measures how frequently an itemset appears in the dataset. It’s the proportion (or count) of transactions containing both the antecedent and consequent.
If an itemset {A, B} is not frequent, what does the Apriori principle suggest about any itemset containing {A, B}?
A) It might still be frequent
B) It will have higher confidence
C) It cannot be frequent
D) It will have a higher lift
Answer: C) It cannot be frequent
Explanation: According to the Apriori principle, if an itemset is infrequent, all its supersets are guaranteed to be infrequent.
Which metric is used to evaluate how much better a rule is at predicting the consequent than random chance?
A) Support
B) Confidence
C) Coverage
D) Lift
Answer: D) Lift
Explanation: Lift measures how much more often the antecedent and consequent occur together than expected if they were independent. A lift > 1 indicates a positive association.
Given the rule: {Milk} ⇒ {Cookies}, with a confidence of 80% and a lift of 1.2, what does the lift value indicate?
A) Milk and Cookies are purchased together purely by chance
B) There is no association between Milk and Cookies
C) The rule is 20% better at predicting cookie purchases than random chance
D) The rule applies to 80% of transactions
Answer: C) The rule is 20% better at predicting cookie purchases than random chance
Explanation: A lift of 1.2 means the likelihood of buying cookies when milk is purchased is 1.2 times higher than random chance.
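The arithmetic behind that reading can be checked directly: since lift = confidence / P(consequent), the card's (hypothetical) numbers imply a baseline cookie-purchase rate:

```python
# Card's numbers for the hypothetical rule {Milk} => {Cookies}:
confidence = 0.8   # P(cookies | milk)
lift = 1.2         # confidence / P(cookies)

# Since lift = confidence / P(consequent), the implied baseline is:
baseline = confidence / lift
print(round(baseline, 3))  # 0.667: milk lifts cookie purchases from ~66.7% to 80%
```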
What is the primary computational challenge in the Apriori algorithm?
A) Calculating lift for each rule
B) Generating association rules from frequent itemsets
C) Scanning the database to find frequent itemsets
D) Setting the correct minimum confidence threshold
Answer: C) Scanning the database to find frequent itemsets
Explanation: The Apriori algorithm requires multiple passes over the dataset to find frequent itemsets, making it computationally expensive, especially with large datasets.
In R, using the arules package, what does the Apriori algorithm generate by default?
A) Rules with multiple items in the consequent
B) Rules sorted by confidence
C) Rules with one item as the consequent (RHS)
D) Only itemsets without generating rules
Answer: C) Rules with one item as the consequent (RHS)
Explanation: By default, the Apriori algorithm in arules generates association rules where the consequent (RHS) is a single item, as multi-item consequents are less interpretable and rarer.