Apriori Principle Flashcards

1
Q

Which type of learning does the Apriori algorithm fall under?
A) Supervised Learning
B) Unsupervised Learning
C) Reinforcement Learning
D) Semi-Supervised Learning

A

B – Apriori is an unsupervised learning technique used for pattern discovery.

2
Q

The Apriori algorithm is mainly used for:
A) Classification
B) Regression
C) Clustering
D) Market Basket Analysis

A

D – Its most common use is in market basket analysis.

3
Q

Which of the following statements best describes the Apriori principle?
A) All subsets of a frequent itemset must also be frequent
B) All supersets of an infrequent itemset must also be infrequent
C) An itemset is frequent if it appears more than once
D) A frequent itemset must always have three or more items

A

A – Apriori principle: if an itemset is frequent, all of its subsets must be frequent.
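
To make this concrete, here is a minimal base-R sketch using hypothetical toy transactions (the support() helper is illustrative, not from any package): a subset's support can never be lower than its superset's.

```r
# Toy transactions (hypothetical data for illustration only)
transactions <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("B", "C"),
  c("A", "C")
)

# Fraction of transactions containing every item in `itemset`
support <- function(itemset, txns) {
  mean(sapply(txns, function(t) all(itemset %in% t)))
}

support(c("B", "C"), transactions)       # 0.50
support(c("A", "B", "C"), transactions)  # 0.25 -- never above any subset's support
```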

4
Q

What is the purpose of the support metric in Apriori?
A) To measure the confidence of a rule
B) To measure the frequency of an itemset in the dataset
C) To measure the strength of correlation
D) To identify the most expensive items in a basket

A

B – Support measures how often a particular itemset appears in the dataset.
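
For reference, the usual formal definition, where T is the set of transactions (some texts report the raw count instead of the proportion):

```latex
\mathrm{support}(X) = \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|}
```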

5
Q

If an itemset {A, B, C} is frequent, which of the following must also be frequent?
A) {A, B, C, D}
B) {A, D}
C) {B, C}
D) {B, D, E}

A

C – All subsets of a frequent itemset must also be frequent (Apriori principle).

6
Q

Which metric is used to evaluate the usefulness of an association rule beyond chance?
A) Support
B) Confidence
C) Lift
D) Leverage

A

C – Lift indicates the strength of a rule beyond random chance.
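
For reference, lift for a rule A ⇒ B is commonly defined as follows; a value of 1 means A and B are independent, and a value above 1 indicates a positive association:

```latex
\mathrm{lift}(A \Rightarrow B)
  = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)\,\mathrm{supp}(B)}
  = \frac{\mathrm{conf}(A \Rightarrow B)}{\mathrm{supp}(B)}
```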

7
Q

If the minimum support is set too high, what is likely to happen?
A) Too many rules are generated
B) All itemsets become frequent
C) Rare but interesting patterns may be missed
D) The algorithm will not terminate

A

C – High support thresholds can exclude rare but potentially useful patterns.

8
Q

In Apriori, what happens at each new iteration (k)?
A) We prune the previous frequent k-itemsets
B) We combine (k-1)-itemsets to generate k-itemsets
C) We calculate accuracy of each rule
D) We increase the confidence threshold

A

B – At each step, frequent itemsets of size (k-1) are used to generate candidates of size k.
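
A simplified base-R sketch of this join step (hypothetical helper, for illustration; the subsequent subset-pruning pass is omitted for brevity):

```r
# Merge pairs of frequent (k-1)-itemsets, keeping merges that yield a k-itemset
generate_candidates <- function(freq_km1) {
  k <- length(freq_km1[[1]]) + 1
  candidates <- list()
  for (i in seq_along(freq_km1)) {
    for (j in seq_along(freq_km1)) {
      if (i < j) {
        merged <- sort(union(freq_km1[[i]], freq_km1[[j]]))
        if (length(merged) == k) candidates <- c(candidates, list(merged))
      }
    }
  }
  unique(candidates)
}

# Frequent 2-itemsets {A,B}, {A,C}, {B,C} yield the candidate 3-itemset {A,B,C}
generate_candidates(list(c("A", "B"), c("A", "C"), c("B", "C")))
```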

9
Q

Which of the following association rules is invalid, assuming {1, 2, 3, 4} is a frequent itemset?
A) 1 → 2, 3, 4
B) 1 → 3, 4
C) 1 → 1, 2, 3, 4
D) 2 → 3, 4

A

C – A rule cannot have the same item on both sides of the implication.

10
Q

What is the primary limitation of the Apriori algorithm?
A) It doesn’t support rule generation
B) It requires labeled data
C) It is computationally expensive due to multiple scans of the dataset
D) It only works with numeric data

A

C – Apriori is slow due to multiple passes over the data and candidate generation.

11
Q

What is the key principle behind the Apriori algorithm when generating frequent itemsets?
A) It considers all possible item combinations regardless of frequency
B) It uses previously found frequent itemsets to generate larger ones
C) It only analyzes single-item transactions
D) It ignores the concept of minimum support

A

B) It uses previously found frequent itemsets to generate larger ones
Explanation: The Apriori principle states that if an itemset is frequent, all of its subsets must also be frequent. The algorithm therefore expands frequent (k-1)-itemsets to generate candidate k-itemsets.

12
Q

In the Apriori algorithm, why are itemsets that do not meet the minimum support threshold eliminated early?
A) To reduce computational complexity
B) Because they have high confidence
C) Because they are likely to form stronger rules
D) To maximize lift values

A

A) To reduce computational complexity
Explanation: Apriori prunes itemsets that don’t meet the minimum support to avoid unnecessary computation, since supersets of infrequent itemsets cannot be frequent.

13
Q

Which of the following best defines “support” in the context of association rules?
A) The ratio of antecedent occurrences to consequent occurrences
B) The number of transactions containing both antecedent and consequent
C) The probability that the consequent occurs given the antecedent
D) The efficiency of the rule compared to random chance

A

B) The number of transactions containing both antecedent and consequent
Explanation: Support measures how frequently an itemset appears in the dataset. It’s the proportion (or count) of transactions containing both the antecedent and consequent.

14
Q

If an itemset {A, B} is not frequent, what does the Apriori principle suggest about any itemset containing {A, B}?
A) It might still be frequent
B) It will have higher confidence
C) It cannot be frequent
D) It will have a higher lift

A

C) It cannot be frequent
Explanation: According to the Apriori principle, if an itemset is infrequent, all its supersets are guaranteed to be infrequent.

15
Q

Which metric is used to evaluate how much better a rule is at predicting the consequent than random chance?
A) Support
B) Confidence
C) Coverage
D) Lift

A

D) Lift
Explanation: Lift measures how much more often the antecedent and consequent occur together than expected if they were independent. A lift > 1 indicates a positive association.

16
Q

Given the rule: {Milk} ⇒ {Cookies}, with a confidence of 80% and a lift of 1.2, what does the lift value indicate?
A) Milk and Cookies are purchased together purely by chance
B) There is no association between Milk and Cookies
C) The rule is 20% better at predicting cookie purchases than random chance
D) The rule applies to 80% of transactions

A

C) The rule is 20% better at predicting cookie purchases than random chance
Explanation: A lift of 1.2 means that buying cookies when milk is purchased is 1.2 times as likely as expected by chance, i.e., 20% more likely.
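
As a sanity check, assuming the standard identity lift = confidence / support(consequent), the stated values imply a support of roughly 0.67 for {Cookies}:

```latex
\mathrm{supp}(\text{Cookies})
  = \frac{\mathrm{conf}}{\mathrm{lift}}
  = \frac{0.8}{1.2} \approx 0.67
```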

17
Q

What is the primary computational challenge in the Apriori algorithm?
A) Calculating lift for each rule
B) Generating association rules from frequent itemsets
C) Scanning the database to find frequent itemsets
D) Setting the correct minimum confidence threshold

A

C) Scanning the database to find frequent itemsets
Explanation: The Apriori algorithm requires multiple passes over the dataset to find frequent itemsets, making it computationally expensive, especially with large datasets.

18
Q

In R, using the arules package, what does the Apriori algorithm generate by default?
A) Rules with multiple items in the consequent
B) Rules sorted by confidence
C) Rules with one item as the consequent (RHS)
D) Only itemsets without generating rules

A

C) Rules with one item as the consequent (RHS)
Explanation: By default, the Apriori algorithm in arules generates association rules where the consequent (RHS) is a single item, as multi-item consequents are less interpretable and rarer.
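
A minimal sketch of this default behavior (assumes the arules package is installed; the toy transactions are hypothetical):

```r
library(arules)

# Coerce a hypothetical list of baskets into an arules transactions object
txns <- as(list(
  c("Milk", "Cookies", "Bread"),
  c("Milk", "Cookies"),
  c("Bread", "Butter"),
  c("Milk", "Bread", "Cookies")
), "transactions")

# Mine rules; by default each rule's consequent (RHS) is a single item
rules <- apriori(txns, parameter = list(support = 0.5, confidence = 0.8))
inspect(rules)
```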