Association Analysis Flashcards

1
Q

A collection of one or more items.

A

Itemset

2
Q

The fraction of transactions that contain an itemset.

A

Support

3
Q

An itemset whose support is greater than or equal to the minsup (minimum support) threshold.

A

Frequent Itemset

4
Q

An implication of the form X -> Y, where X and Y are disjoint itemsets.

A

Association Rule

5
Q

2 Rule Evaluation Metrics for Association Analysis

A
  1. Support
  2. Confidence
6
Q

The fraction of transactions that contain both X and Y.

A

Support

{Milk, Bread} -> {Diaper}

Formula:
(# of transactions containing {Milk, Bread, Diaper}) / (total # of transactions)

In the example:
2 of the 5 transactions contain {Milk, Bread, Diaper}

Support = 0.4
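
The support calculation above can be sketched in Python. The five transactions below are an assumption: the card does not list its example data, so this hypothetical basket set was chosen only because it reproduces the 2 / 5 count. The helper name `support` is also illustrative.

```python
# Hypothetical market-basket data (not from the card), chosen so that
# {Milk, Bread, Diaper} appears in 2 of the 5 transactions.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # subset test
    return hits / len(transactions)


# Support of the rule {Milk, Bread} -> {Diaper} is the support of X ∪ Y.
rule_support = support({"Milk", "Bread", "Diaper"}, transactions)
print(rule_support)  # 0.4
```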

7
Q

Measures how often the items in Y appear in transactions that contain X.

A

Confidence

{Milk, Bread} -> {Diaper}

Formula:
(# of transactions containing {Milk, Bread, Diaper}) / (# of transactions containing {Milk, Bread})

In the example:
2 / 3 (3 transactions contain {Milk, Bread}; 2 of them also contain Diaper)

Confidence = 0.67
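
A minimal sketch of the confidence calculation, under the same assumption as before: the transaction list is hypothetical (the card does not show its data) and was picked so that {Milk, Bread} occurs 3 times and {Milk, Bread, Diaper} occurs 2 times. The helper names `count` and `confidence` are illustrative.

```python
# Hypothetical market-basket data (not from the card).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(x, y, transactions):
    """Confidence of the rule x -> y: count(x ∪ y) / count(x)."""
    x_count = count(x, transactions)
    if x_count == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return count(set(x) | set(y), transactions) / x_count


conf = confidence({"Milk", "Bread"}, {"Diaper"}, transactions)
print(round(conf, 2))  # 0.67
```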

8
Q

Given a set of transactions, the goal of association rule mining is to find all rules having:
  o Support >= minsup threshold
  o Confidence >= minconf threshold

A

Association Rule Mining Task

9
Q

An approach where we list all possible association rules, compute the support and confidence of each rule, and prune the rules that fail the minsup or minconf threshold.

A

Brute Force Approach
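
To see why brute force scales badly, here is a rough sketch that enumerates every rule X -> Y (X and Y non-empty and disjoint) over a small dataset. The transactions, the thresholds 0.4 and 0.6, and the function name `brute_force_rules` are all assumptions made for illustration. With d items there are 3^d - 2^(d+1) + 1 such rules, so 6 items already give 602 candidates.

```python
from itertools import combinations

# Hypothetical market-basket data (not from the card); it has 6 distinct items.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def brute_force_rules(transactions, minsup, minconf):
    """List every possible rule, score it, and keep those passing both thresholds."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    total = 0   # number of rules enumerated
    kept = []   # rules that pass minsup and minconf
    for k in range(2, len(items) + 1):
        for itemset in combinations(items, k):
            full = set(itemset)
            both = sum(1 for t in transactions if full <= t)
            # Every non-empty proper subset of the itemset is a left-hand side.
            for r in range(1, k):
                for lhs in combinations(itemset, r):
                    total += 1
                    x = set(lhs)
                    x_count = sum(1 for t in transactions if x <= t)
                    if x_count == 0:
                        continue  # confidence undefined, rule cannot pass
                    sup = both / n
                    conf = both / x_count
                    if sup >= minsup and conf >= minconf:
                        kept.append((frozenset(x), frozenset(full - x), sup, conf))
    return total, kept


total, kept = brute_force_rules(transactions, minsup=0.4, minconf=0.6)
print(total)  # 602
```

Most of those 602 rules are scored only to be thrown away, which is what motivates the two-step approach on the next card.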

10
Q

The 2 Step Approach in Mining Association Rules

A
  1. Frequent Itemset Generation
  2. Rule Generation
11
Q

Generate all itemsets whose support >= minsup.

A

Frequent Itemset Generation

12
Q

3 Generation Strategies in Frequent Itemset Generation

A
  • Reduce number of candidates
  • Reduce number of transactions
  • Reduce number of comparisons
13
Q

A principle stating that if an itemset is frequent, then all of its subsets must also be frequent.

A

Apriori Principle
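
The principle is usually applied in reverse: if any subset of a candidate is infrequent, the candidate cannot be frequent and can be dropped without counting it. Below is a simplified level-wise sketch of that idea, not a tuned implementation. The dataset, the minsup value of 0.6, and the function name `apriori` are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical market-basket data (not from the card).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def apriori(transactions, minsup):
    """Return {frequent itemset: support}, growing itemsets one level at a time."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = set().union(*transactions)
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= minsup}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = sup(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori principle): drop any candidate that has an
        # infrequent (k-1)-subset, before scanning the data for its support.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = {c for c in candidates if sup(c) >= minsup}
    return frequent


frequent = apriori(transactions, minsup=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this sketch, {Bread, Diaper, Beer} is never counted because its subset {Bread, Beer} is already known to be infrequent.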

14
Q

The process of scanning the transaction database to determine the support of each candidate itemset.

A

Candidate Counting

15
Q

The structure in which candidates are stored to reduce the number of comparisons.

A

Hash Structure.

16
Q

The step in mining association rules where we generate high-confidence rules from each frequent itemset; each rule is a binary partitioning of a frequent itemset.

A

Rule Generation
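
A small sketch of the binary-partitioning idea: each non-empty proper subset X of a frequent itemset becomes a left-hand side, and the remaining items become Y. The transactions, the minconf value of 0.6, and the function name `rules_from_itemset` are assumptions for illustration; the itemset passed in is assumed to be frequent already, so it occurs at least once.

```python
from itertools import combinations

# Hypothetical market-basket data (not from the card).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def rules_from_itemset(itemset, transactions, minconf):
    """Split a frequent itemset into X -> Y in every possible way; keep confident rules."""
    itemset = frozenset(itemset)

    def count(s):
        return sum(1 for t in transactions if s <= t)

    full = count(itemset)  # > 0 because the itemset is assumed frequent
    rules = []
    for r in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), r):
            x = frozenset(lhs)
            y = itemset - x
            conf = full / count(x)  # count(x) >= full > 0
            if conf >= minconf:
                rules.append((x, y, conf))
    return rules


rules = rules_from_itemset({"Milk", "Bread", "Diaper"}, transactions, minconf=0.6)
for x, y, conf in rules:
    print(sorted(x), "->", sorted(y), round(conf, 2))
```

A k-item itemset yields 2^k - 2 candidate rules, so the three-item set here produces 6 candidates before the confidence filter.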

17
Q

4 Factors Affecting Complexity in Association Analysis

A
  1. Choice of minimum support threshold
  2. Dimensionality (# of items) of the dataset
  3. Size of database
  4. Average transaction width