Exam 1 Flashcards
(7 cards)
Column Filter Node
Used to select columns for analysis.
o Right-click -> Configure -> choose the columns you want
Statistics Node
Shows univariate statistics and some limited diagrams.
Missing Value Node
Handles missing values; configured here to remove rows that contain missing values.
o Missing Value -> Configure -> Column Settings -> Remove Row
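For comparison outside KNIME, the same "Remove Row" behavior can be sketched in pandas. This is only an illustrative analogue, not what the node runs internally; the table and its column names (Name, Position, Age) are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical player table; the column names are assumptions for illustration.
players = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Position": ["Catcher", "Pitcher", None, "Outfielder"],
    "Age": [27, np.nan, 31, 24],
})

# Drop every row that has a missing value in any column.
complete_rows = players.dropna()

print(complete_rows)
```

To remove rows based on only some columns, dropna also accepts a subset argument, for example players.dropna(subset=["Age"]).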
Row Filter Node
o Filter to show only one type (Positions -> Catcher)
o Filter to show anything that contains "Pitch". * is a wildcard character; you must check the "contains wild cards" box.
o Filter to retrieve a range
- Catchers between 25-35 need 2 separate nodes (the two conditions can't be combined in one node)
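The three kinds of row filter above can be sketched in pandas as a rough analogue. The table is made up, and the Age column is an assumption: the card does not say which column the 25-35 range applies to. The wildcard filter is approximated with a plain substring match.

```python
import pandas as pd

# Hypothetical player table; "Age" is an assumed column for the range filter.
players = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Position": ["Catcher", "Relief Pitcher", "Catcher", "Pitcher", "Outfielder"],
    "Age": [27, 30, 38, 22, 33],
})

# 1) Exact match: keep only one type.
catchers = players[players["Position"] == "Catcher"]

# 2) Wildcard-style match: keep anything whose position contains "Pitch".
pitchers = players[players["Position"].str.contains("Pitch", regex=False)]

# 3) Range: keep rows with Age from 25 to 35 (both ends included).
age_25_35 = players[players["Age"].between(25, 35)]

# Two filters applied one after the other, like two chained Row Filter nodes.
catchers_25_35 = catchers[catchers["Age"].between(25, 35)]

print(catchers_25_35)
```

Chaining the position filter and the range filter mirrors the two-node setup on the card: the second filter only sees the rows that passed the first.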
Nominal Value Row Filter Node
Lets you pick one nominal column and filter by selected categories in that variable.
- Ex: Positions -> Catcher, Outfielder, Shortstop
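As a rough pandas analogue of picking several categories from one nominal column, a membership test works. The table below is hypothetical and only illustrates the idea.

```python
import pandas as pd

# Hypothetical player table for illustration.
players = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Position": ["Catcher", "Pitcher", "Shortstop", "Outfielder", "First Base"],
})

# Categories to keep, matching the example on the card.
selected = ["Catcher", "Outfielder", "Shortstop"]

# Keep rows whose Position is one of the selected categories.
subset = players[players["Position"].isin(selected)]

print(subset)
```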
Types of Missing Values
Nominal Categorical
1) Missing completely at random (MCAR)
2) Missing not at random (MNAR)
How do you handle missing data?
1) Study the reason the data is missing
The data could be missing not at random (MNAR).
2) Ignore it (still keep the record in the data set)
This may not be a wise decision because some data mining (DM) techniques are very sensitive to missing data.
3) Pairwise deletion of rows: an observation with a missing value for variable X is removed from statistics (such as a correlation matrix) that involve X. It is not removed from statistics that do not involve X.
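Pairwise deletion can be illustrated with a small pandas sketch. The data is made up, with one missing value in X. pandas' DataFrame.corr excludes missing values pair by pair, so it serves here as an example of pairwise deletion; dropping incomplete rows first is shown only for contrast.

```python
import numpy as np
import pandas as pd

# Made-up data: X has one missing value (third row).
data = pd.DataFrame({
    "X": [1.0, 2.0, np.nan, 4.0, 5.0],
    "Y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "Z": [5.0, 3.0, 4.0, 1.0, 2.0],
})

# Number of rows available for each pair of variables.
present = data.notna().astype(int)
pair_counts = present.T @ present

# Pairwise deletion: each correlation uses every row where both variables exist.
pairwise_corr = data.corr()

# Contrast: remove every incomplete row first, then correlate.
complete_case_corr = data.dropna().corr()

print(pair_counts)
print(pairwise_corr)
print(complete_case_corr)
```

In this sketch the row with missing X is left out of the X-Y and X-Z correlations but still contributes to the Y-Z correlation, which is why the Y-Z value differs between the two matrices.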