exam1 Flashcards
(16 cards)
KDD
knowledge discovery in databases
business intelligence main definition
extract INFORMATION from large amounts of data
correlation vs causation
just because the curves fit doesn't mean one causes the other
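
A minimal sketch of the card above, assuming made-up numbers: two series that both rise over the same five months produce a perfect correlation coefficient, yet nothing in the calculation says one causes the other. The `pearson` helper and both data series are illustrative, not from the course material.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented numbers: both series simply grow over the same five months.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [3, 5, 7, 9, 11]

r = pearson(ice_cream_sales, sunburn_cases)
print(round(r, 3))  # the curves fit perfectly, but neither causes the other
```

A shared third factor (here, presumably summer weather) can drive both curves, which is why a high coefficient alone is not evidence of causation.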
patterns
probabilities associated with a given fact. summarization of the data
association rules
identify what goes with what
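
A small sketch of "what goes with what", assuming a toy list of shopping baskets. Support and confidence are the usual measures for scoring a rule; they are not named on the card, and the baskets and function names below are invented for illustration.

```python
# Invented shopping baskets; each set is one transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# Rule under test: {bread} -> {butter}
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(rule_support, round(rule_confidence, 2))
```

Here bread and butter appear together in 2 of 4 baskets, and 2 of the 3 bread baskets also contain butter.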
classification
like clustering… separate people into groups. difference is the groups are known in advance and the goal is PREDICTIVE: assign new instances to one of them
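
A minimal sketch of classification, assuming a 1-nearest-neighbor rule (one of many possible classifiers, chosen here only because it is short). The training rows, labels, and the `classify` helper are invented for illustration.

```python
# Labeled training data: (age, income in $1000s) -> known group. Invented values.
training = [
    ((25, 30), "student"),
    ((22, 20), "student"),
    ((45, 90), "professional"),
    ((50, 110), "professional"),
]

def classify(point, training):
    """Return the label of the closest training example (1-nearest neighbor)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, nearest_label = min(training, key=lambda ex: sq_dist(ex[0], point))
    return nearest_label

# Predict the group of a person the model has not seen before.
predicted = classify((24, 25), training)
print(predicted)
```

The key point is that the labels exist before the model runs; the model only predicts which label a new instance gets.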
clustering
separate data instances into groups that are not known in advance
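
A rough sketch of clustering, assuming a tiny 1-D k-means (one common clustering method, not necessarily the one covered in class). The data, starting centers, and `kmeans_1d` are illustrative; note that no labels are supplied.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means for 1-D data; `centers` are the starting guesses."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        # Assignment step: put each value in the group of its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

values = [1, 2, 3, 20, 21, 22]
groups, centers = kmeans_1d(values, centers=[0, 10])
print(groups, centers)
```

The groups come out of the data itself: the algorithm is never told which values belong together.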
collaborative filtering
Netflix recommender. look for similar users and predict based on what they liked
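
A sketch of user-based collaborative filtering, assuming cosine similarity and a similarity-weighted average (a common textbook approach, not a description of Netflix's actual system). All users, titles, and ratings are invented.

```python
from math import sqrt

# Invented ratings (1-5); a missing key means the user has not rated that title.
ratings = {
    "ana":  {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben":  {"film_a": 5, "film_b": 5, "film_c": 1, "film_d": 5},
    "cara": {"film_a": 1, "film_b": 2, "film_c": 5, "film_d": 1},
}

def similarity(a, b):
    """Cosine similarity over the titles both users rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][t] * ratings[b][t] for t in shared)
    norm_a = sqrt(sum(ratings[a][t] ** 2 for t in shared))
    norm_b = sqrt(sum(ratings[b][t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def predict(user, title):
    """Similarity-weighted average of other users' ratings for `title`."""
    neighbors = [(similarity(user, other), r[title])
                 for other, r in ratings.items()
                 if other != user and title in r]
    total_weight = sum(w for w, _ in neighbors)
    if total_weight == 0:
        return None  # nobody comparable has rated this title
    return sum(w * rating for w, rating in neighbors) / total_weight

prediction = predict("ana", "film_d")
print(round(prediction, 2))
```

Because ana's ratings look much more like ben's than cara's, ben's rating of `film_d` dominates the prediction.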
ensemble
use many different models fused together to get better results than any single model
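
A minimal sketch of an ensemble, assuming majority voting as the way to fuse models (averaging and stacking are other options). The three spam rules are deliberately crude, hypothetical stand-ins for real models.

```python
from collections import Counter

# Three deliberately simple, hypothetical spam rules; each is wrong sometimes.
def model_a(msg):
    return "spam" if "free" in msg.lower() else "ham"

def model_b(msg):
    return "spam" if "!" in msg else "ham"

def model_c(msg):
    return "spam" if len(msg) < 15 else "ham"

def ensemble_predict(msg, models=(model_a, model_b, model_c)):
    """Fuse the models by majority vote."""
    votes = Counter(model(msg) for model in models)
    return votes.most_common(1)[0][0]

label = ensemble_predict("FREE prize, click now!")
print(label)
```

For "See you at lunch!", `model_b` alone would say spam because of the exclamation mark, but the other two outvote it.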
reports
summarization or visualization of data
KDD process steps
DATA -> selection -> preprocessing -> data mining -> interpretation -> KNOWLEDGE
data preprocessing
cleaning, exploration, data reduction/transformation
data cleaning
detect and fix/remove bad data
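
A small sketch of "detect and fix/remove", assuming a list of records with an `age` field stored as text. The rows, the 0 to 120 validity range, and the `clean` helper are assumptions made for the example.

```python
# Invented raw records with typical problems.
raw_rows = [
    {"name": "Ann", "age": "34"},
    {"name": "Bob", "age": ""},       # missing value -> remove
    {"name": "Cy",  "age": "-5"},     # impossible value -> remove
    {"name": "Dee", "age": " 41 "},   # stray whitespace -> fix
]

def clean(rows):
    """Keep rows whose age parses to an integer in a plausible range."""
    cleaned = []
    for row in rows:
        try:
            age = int(row["age"].strip())  # fix: strip whitespace, parse text
        except ValueError:
            continue                       # detect: not a number at all
        if not 0 <= age <= 120:
            continue                       # detect: out of the assumed range
        cleaned.append({"name": row["name"], "age": age})
    return cleaned

clean_rows = clean(raw_rows)
print(clean_rows)
```

Whether a bad row is repaired or dropped is a judgment call; this sketch repairs only the formatting problem and drops the rest.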
data transformation
convert data format
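
Two illustrative format conversions, assuming US-style date strings and a numeric column to rescale. Both examples and the `min_max_scale` helper are invented; min-max scaling is one common transformation among many.

```python
from datetime import datetime

# Convert a US-style date string into ISO 8601 format.
us_date = "03/15/2024"
iso_date = datetime.strptime(us_date, "%m/%d/%Y").date().isoformat()

def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 30])
print(iso_date, scaled)
```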
data mining
searching for patterns of interest
data mining cycle
data -> information -> action -> VALUE