Association FINAL Flashcards
(40 cards)
What association rules are interested in
Observing which objects occur together
Do association rules recommend items or show which items co-occur?
Seeing which items co-occur
Association Rule Mining
Given a set of transactions, find the rules that will predict the occurrence of an item based on the occurrences of other items in the transaction
Does implication mean causality?
No, it means co-occurrence
{} -> {}
Antecedent -> Consequent
3 types of database representations
Binary, Transaction, Vertical
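As a rough illustration of the three representations, the sketch below stores the same toy data in each form. The dataset and the variable names (`transactions`, `binary`, `vertical`) are hypothetical and chosen only for this example.

```python
# Toy data (hypothetical). Transaction database: tid -> itemset.
transactions = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

# The item universe I, in a fixed order so binary columns line up.
items = sorted(set().union(*transactions.values()))

# Binary database: one row per tid, one 0/1 column per item.
binary = {
    tid: [1 if item in itemset else 0 for item in items]
    for tid, itemset in transactions.items()
}

# Vertical database: item -> set of tids whose transaction contains it.
vertical = {
    item: {tid for tid, itemset in transactions.items() if item in itemset}
    for item in items
}

print(items)                  # ['A', 'B', 'C', 'D', 'E']
print(binary[2])              # [0, 1, 1, 0, 1]
print(sorted(vertical["D"]))  # [1, 3, 5, 6]
```

All three hold the same information; the vertical form makes support counting a matter of intersecting tid sets.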
Items
I = {x1, x2, …, xm}
A set X within the set of items
Itemset
An itemset of cardinality k
k-itemset
I^(k)
set of all k-itemsets
Transaction identifiers, tids
T = {t1, t2, …, tn}
A set of tids within T
tidset
Transaction
Tuple in the form (t, X) where t is a unique transaction identifier and X is an itemset
Support
The support of an itemset X in a dataset D denoted sup(X, D) is the number of transactions in D that contain X
Relative Support
The relative support of X is the fraction of transactions that contain X: rsup(X, D) = sup(X, D) / |D|
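A minimal sketch of both definitions, assuming the dataset D is held as a dict from tid to itemset. The toy data and the function names `sup` and `rsup` are illustrative assumptions.

```python
# Toy dataset D (hypothetical): tid -> itemset.
D = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

def sup(X, D):
    """Support: number of transactions in D that contain every item of X."""
    X = set(X)
    return sum(1 for itemset in D.values() if X <= itemset)

def rsup(X, D):
    """Relative support: fraction of transactions in D that contain X."""
    return sup(X, D) / len(D)

print(sup({"B", "E"}, D))   # 5
print(rsup({"A", "D"}, D))  # 0.5
```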
We use F to
denote the set of all frequent itemsets
We use F^(k)
to denote the set of frequent k-itemsets
Itemset mining problem
Given a minimum support threshold (minsup), find all itemsets X such that sup(X) >= minsup
Frequent itemsets
An itemset X is frequent if sup(X) >= minsup, where minsup is a user-specified minimum support threshold (if minsup is a fraction, then relative support is implied)
Total number of possible itemsets (subsets of I)
2^|I|
Naive approach to generate all frequent itemsets
For each itemset X that is a subset of I:
    compute sup(X)
    if sup(X) >= minsup:
        add X to the list of frequent itemsets
The brute force method
Explores the entire itemset search space, regardless of minsup
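The naive approach can be sketched as follows, assuming an absolute minsup and a dataset stored as tid -> itemset. The toy data and the names `brute_force` and `checked` are hypothetical. Note that the empty itemset is skipped here, so 2^|I| - 1 candidates are counted.

```python
from itertools import combinations

# Toy dataset D (hypothetical): tid -> itemset.
D = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

def brute_force(D, minsup):
    """Enumerate every non-empty itemset over I and keep the frequent ones."""
    items = sorted(set().union(*D.values()))
    frequent = {}
    checked = 0
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            X = frozenset(combo)
            checked += 1
            # Support is counted for every candidate, whatever minsup is.
            support = sum(1 for itemset in D.values() if X <= itemset)
            if support >= minsup:
                frequent[X] = support
    return frequent, checked

frequent, checked = brute_force(D, minsup=3)
print(checked)        # 31 candidates, i.e. 2**5 - 1
print(len(frequent))  # 19
```

Raising minsup shrinks the result but not the work: the same 31 candidates are counted either way.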
Goal of Association Rule Mining
Given a set of transactions T, find all the rules having:
support >= minsup
confidence >= minconf
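A small sketch of checking one rule against both thresholds. It uses the standard definition of confidence, conf(X -> Y) = sup(X ∪ Y) / sup(X), which is not spelled out on the cards above. The toy data and the helper names `conf` and `is_strong` are assumptions for illustration.

```python
# Toy dataset D (hypothetical): tid -> itemset.
D = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

def sup(X, D):
    """Number of transactions in D that contain every item of X."""
    X = set(X)
    return sum(1 for itemset in D.values() if X <= itemset)

def conf(X, Y, D):
    """Confidence of X -> Y: sup(X union Y) / sup(X); 0.0 if X never occurs."""
    sx = sup(X, D)
    return sup(set(X) | set(Y), D) / sx if sx else 0.0

def is_strong(X, Y, D, minsup, minconf):
    """True if rule X -> Y meets both the support and confidence thresholds."""
    return sup(set(X) | set(Y), D) >= minsup and conf(X, Y, D) >= minconf

print(conf({"E"}, {"B"}, D))                 # 1.0
print(is_strong({"E"}, {"B"}, D, 3, 0.9))    # True
print(is_strong({"B"}, {"E"}, D, 3, 0.9))    # False
```

Both rules share the same support, 5, but only E -> B clears the confidence threshold, since B also appears in a transaction without E.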
Apriori principle
If an itemset is frequent, then all of its subsets must be frequent as well
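The principle is usually applied in its contrapositive form: if any subset of a candidate is infrequent, the candidate cannot be frequent, so its support never needs to be counted. Below is a simplified level-wise sketch of that pruning idea, not an optimized Apriori implementation. The toy data and the names `apriori` and `checked` are hypothetical.

```python
from itertools import combinations

# Toy dataset D (hypothetical): tid -> itemset.
D = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

def apriori(D, minsup):
    """Level-wise search that only counts candidates whose subsets are frequent."""
    def sup(X):
        return sum(1 for itemset in D.values() if X <= itemset)

    items = sorted(set().union(*D.values()))
    singletons = [frozenset([i]) for i in items]
    checked = len(singletons)  # number of candidates whose support was counted
    frequent = {X: sup(X) for X in singletons if sup(X) >= minsup}
    level = set(frequent)
    k = 1
    while level:
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates that have any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        checked += len(candidates)
        level = set()
        for c in candidates:
            support = sup(c)
            if support >= minsup:
                frequent[c] = support
                level.add(c)
    return frequent, checked

frequent, checked = apriori(D, minsup=3)
print(len(frequent))  # 19
print(checked)        # 21
```

On this toy data the pruned search counts support for 21 candidates, compared with the 31 non-empty itemsets a full enumeration would visit.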