Association FINAL Flashcards

(40 cards)

1
Q

What are association rules interested in?

A

Observing which objects occur together

2
Q

Are association rules about recommending items or finding items that co-occur?

A

Seeing which items co-occur

3
Q

Association Rule Mining

A

Given a set of transactions, find the rules that will predict the occurrence of an item based on the occurrences of other items in the transaction

4
Q

Does implication mean causality?

A

No, it means co-occurrence

5
Q

{} -> {}

A

Antecedent -> Consequent

6
Q

3 types of database representation

A

Binary, Transaction, Vertical
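
The three representations hold the same information in different layouts. A minimal sketch, assuming a made-up toy dataset (the item names and tids below are illustrative, not from the cards):

```python
# Hypothetical toy data: tid -> set of items (the "transaction" representation).
transactions = {
    1: {"bread", "milk"},
    2: {"bread", "diapers", "beer"},
    3: {"milk", "diapers", "beer"},
}

# Fixed item order so the binary columns are well defined.
items = sorted(set().union(*transactions.values()))

# Binary representation: one row per tid, one 0/1 column per item.
binary = {
    tid: [1 if item in itemset else 0 for item in items]
    for tid, itemset in transactions.items()
}

# Vertical representation: item -> set of tids that contain it.
vertical = {
    item: {tid for tid, itemset in transactions.items() if item in itemset}
    for item in items
}

print(items)
print(binary)
print(vertical)
```

The vertical layout is convenient for support counting, since the support of an itemset is the size of the intersection of its items' tid sets.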

7
Q

Items

A

I = {x1, x2, …, xm}

8
Q

A set X that is a subset of the set of items I

A

Itemset

9
Q

An itemset of cardinality k

A

k-itemset

10
Q

I^(k)

A

set of all k-itemsets

11
Q

Transaction identifiers, tids

A

T = {t1, t2, …, tn}

12
Q

A set of tids within T (a subset of T)

A

tidset

13
Q

Transaction

A

Tuple in the form (t, X) where t is a unique transaction identifier and X is an itemset

14
Q

Support

A

The support of an itemset X in a dataset D denoted sup(X, D) is the number of transactions in D that contain X

15
Q

Relative Support

A

The relative support of X is the fraction of transactions that contain X: rsup(X, D) = sup(X, D) / |D|
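
Support and relative support can be checked by hand on a small example. A minimal sketch, assuming a made-up dataset of (tid, itemset) tuples; the function names `support` and `relative_support` are illustrative:

```python
def support(itemset, dataset):
    """Absolute support: number of transactions that contain every item in itemset."""
    return sum(1 for _tid, items in dataset if set(itemset) <= items)


def relative_support(itemset, dataset):
    """Fraction of transactions that contain itemset: sup(X, D) / |D|."""
    return support(itemset, dataset) / len(dataset)


# Hypothetical toy dataset D of (tid, itemset) transactions.
D = [
    (1, {"bread", "milk"}),
    (2, {"bread", "diapers", "beer"}),
    (3, {"milk", "diapers", "beer"}),
    (4, {"bread", "milk", "diapers", "beer"}),
    (5, {"bread", "milk", "diapers"}),
]

print(support({"diapers", "beer"}, D))           # 3 (tids 2, 3, 4)
print(relative_support({"diapers", "beer"}, D))  # 0.6
```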

16
Q

We use F to

A

denote the set of all frequent itemsets

17
Q

We use F^(k)

A

to denote the set of frequent k-itemsets

18
Q

Itemset mining problem

A

Given a minimum support threshold (minsup), find all itemsets X s.t. sup(X) >= minsup

19
Q

Frequent itemsets

A

An itemset X is frequent if sup(X) >= minsup, where minsup is a user-specified minimum support threshold (if minsup is a fraction, then relative support is implied)

20
Q

Total number of possible subsets (itemsets) of I

A

2^m for m items, counting the empty set

21
Q

Naive approach to generate all itemsets that are frequent

A

For each itemset X that is a subset of I:
    compute sup(X)
    if sup(X) >= minsup:
        add X to the list of frequent itemsets

22
Q

The brute force method

A

Explores the entire itemset search space, regardless of minsup
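
The brute-force idea can be sketched as follows. This is only an illustration under assumptions: the dataset is a made-up list of item sets, and `brute_force_frequent` is a hypothetical name. Note that every non-empty subset of the items is counted, whatever minsup is.

```python
from itertools import combinations


def brute_force_frequent(dataset, minsup):
    """Enumerate every non-empty itemset and keep those with sup >= minsup."""
    items = sorted(set().union(*dataset))
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            # One full pass over the dataset per candidate.
            sup = sum(1 for t in dataset if set(candidate) <= t)
            if sup >= minsup:
                frequent[candidate] = sup
    return frequent


# Hypothetical toy dataset (tids omitted, one set per transaction).
dataset = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

frequent = brute_force_frequent(dataset, minsup=3)
for itemset, sup in frequent.items():
    print(itemset, sup)
```

With 4 items this checks 2^4 - 1 = 15 candidates; the count doubles with every additional item, which is why the approach does not scale.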

23
Q

Goal of Association Rule Mining

A

Given a set of transactions T, find all the rules having:
support >= minsup
confidence >= minconf

24
Q

Apriori principle

A

If an itemset is frequent, then all of its subsets must be frequent as well

25
Apriori principle 2
If an itemset is infrequent, then all of its supersets must be infrequent as well
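
Both forms of the principle drive a level-wise search: candidates of size k are built only from frequent (k-1)-itemsets, and any candidate with an infrequent subset is discarded before its support is counted. A simplified sketch under assumptions (made-up dataset, hypothetical function name `apriori`, no optimizations such as hash trees):

```python
from itertools import combinations


def apriori(dataset, minsup):
    """Level-wise frequent itemset search using the Apriori pruning rule."""
    def sup(itemset):
        return sum(1 for t in dataset if itemset <= t)

    items = sorted(set().union(*dataset))
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= minsup}
    frequent = {s: sup(s) for s in level}
    k = 2
    while level:
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: a candidate with any infrequent (k-1)-subset cannot be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        level = {c for c in candidates if sup(c) >= minsup}
        frequent.update({c: sup(c) for c in level})
        k += 1
    return frequent


# Hypothetical toy dataset (one set of items per transaction).
dataset = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

frequent = apriori(dataset, minsup=3)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

On this data the 3-itemset {beer, bread, diapers} is pruned without a support count because its subset {beer, bread} is infrequent.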
26
A rule is frequent if
the itemset XY is frequent, sup(XY) >= minsup
27
A rule is strong if
conf >= minconf
28
Rules are pruned using
confidence
29
conf(X -> Y)
sup(XY) / sup(X)
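
A small worked example of the confidence formula, assuming a made-up dataset and an antecedent that occurs at least once (otherwise the division is undefined). The helper names are illustrative:

```python
def sup(itemset, dataset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in dataset if itemset <= t)


def confidence(X, Y, dataset):
    """conf(X -> Y) = sup(XY) / sup(X); assumes sup(X) > 0."""
    return sup(X | Y, dataset) / sup(X, dataset)


# Hypothetical toy dataset (one set of items per transaction).
dataset = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

print(confidence({"beer"}, {"diapers"}, dataset))  # 3 / 3 = 1.0
print(confidence({"diapers"}, {"beer"}, dataset))  # 3 / 4 = 0.75
```

The two directions share the same support but differ in confidence, because the denominator is the support of the antecedent only.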
30
Unlike support, confidence does not exhibit
the monotone property
31
If a rule X -> Y\X does not satisfy the confidence threshold, then
any rule X' -> Y\X', where X' is a subset of X, does not satisfy the confidence threshold either
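
The reasoning behind this pruning rule can be written out in one line, using the card's notation (Y is the full frequent itemset, X and X' are antecedents drawn from it). Shrinking the antecedent can only raise its support, so the confidence can only drop:

```latex
\[
X' \subset X
\;\Rightarrow\;
\operatorname{sup}(X') \ge \operatorname{sup}(X)
\;\Rightarrow\;
\operatorname{conf}(X' \to Y \setminus X')
= \frac{\operatorname{sup}(Y)}{\operatorname{sup}(X')}
\le \frac{\operatorname{sup}(Y)}{\operatorname{sup}(X)}
= \operatorname{conf}(X \to Y \setminus X)
< \mathit{minconf}.
\]
```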
32
What happens if minsup is too high
We may miss interesting low-support items. Example: such items may correspond to expensive products that are rarely purchased by customers, but whose patterns are interesting to mine for the retailer
33
What happens if minsup is too low
We get information overload: too many frequent itemsets and too many spurious rules
34
How can some high confidence rules be misleading?
High confidence might not imply a meaningful relationship if the consequent is already common in the dataset, irrespective of the antecedent
35
Confidence measure ignores
the support of the itemset appearing in the rule consequent
36
Which metric accounts for the support of the consequent?
lift
37
lift
conf(X -> Y) / rsup(Y)
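
A worked example of why lift is needed alongside confidence. The sketch below assumes a made-up dataset in which the consequent is very common, so the rule has fairly high confidence even though the antecedent makes the consequent slightly less likely. The helper names are illustrative:

```python
def sup(itemset, dataset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in dataset if itemset <= t)


def confidence(X, Y, dataset):
    """conf(X -> Y) = sup(XY) / sup(X); assumes sup(X) > 0."""
    return sup(X | Y, dataset) / sup(X, dataset)


def lift(X, Y, dataset):
    """lift(X -> Y) = conf(X -> Y) / rsup(Y); assumes sup(Y) > 0."""
    rsup_y = sup(Y, dataset) / len(dataset)
    return confidence(X, Y, dataset) / rsup_y


# Hypothetical data: 10 transactions, coffee appears in 9 of them.
dataset = [{"tea", "coffee"}] * 3 + [{"tea"}] * 1 + [{"coffee"}] * 6

conf_tc = confidence({"tea"}, {"coffee"}, dataset)  # 3 / 4 = 0.75
lift_tc = lift({"tea"}, {"coffee"}, dataset)        # 0.75 / 0.9, below 1

print(conf_tc, lift_tc)
```

Here 75% of tea buyers also buy coffee, but 90% of all customers buy coffee, so the lift is below 1 and the rule is weaker than its confidence suggests.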
38
A value of lift close to 1 implies
that the support of the rule is about what would be expected given the supports of its components
39
Good lifts, bad lifts
>1 (items co-occur more often than expected), <<1 (far less often than expected)
40