Exploring Data and R Flashcards
a preliminary exploration of the data to better understand its characteristics.
Data Exploration
are numbers that summarize properties of the data.
Summary Statistics
is the percentage of the time an attribute value occurs in the data set.
Frequency
is the most frequent attribute value.
Mode
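A minimal sketch of how frequency and mode could be computed for a categorical attribute. The `colors` list is made-up illustrative data, not something from the flashcards.

```python
from collections import Counter

# Hypothetical attribute values (illustrative only).
colors = ["red", "blue", "red", "green", "red", "blue"]

counts = Counter(colors)
n = len(colors)

# Frequency: percentage of observations that take each value.
frequency = {value: 100 * count / n for value, count in counts.items()}

# Mode: the value with the highest count.
mode, mode_count = counts.most_common(1)[0]

for value, pct in frequency.items():
    print(f"{value}: {pct:.1f}%")
print("mode:", mode)
```

If several values tie for the highest count, `most_common(1)` returns only one of them, so a data set can have more than one mode that this sketch does not report.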
2 MEASURES OF LOCATION
- Mean
- Median
is the most common measure of the location of a set of points.
Mean
is a common alternative to the mean, because the mean is very sensitive to outliers.
Median
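The sensitivity of the mean to outliers can be illustrated with a small sketch. The numbers below are invented for the example; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical measurements (illustrative only).
values = [10, 12, 11, 13, 12]
with_outlier = values + [500]  # add a single extreme value

mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(values)
median_after = statistics.median(with_outlier)

# The mean jumps when the outlier is added; the median does not.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

Here one extreme value moves the mean from about 11.6 to 93, while the median stays at 12.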
2 WAYS TO MEASURE SPREAD
- Range
- Variance or Standard Deviation
is the difference between max and min.
Range
is the most common measure of the spread of a set of points.
Variance or Standard Deviation
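A short sketch of the spread measures, using invented data. The flashcards do not say whether the population or the sample formula is meant, so both are shown; that choice is an assumption of this example.

```python
import statistics

# Hypothetical data (illustrative only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: difference between the maximum and the minimum.
value_range = max(values) - min(values)

# Population variance and standard deviation (divide by n).
pop_variance = statistics.pvariance(values)
pop_std = statistics.pstdev(values)

# Sample variance (divide by n - 1), often used for estimates from a sample.
sample_variance = statistics.variance(values)

print("range:", value_range)
print("population variance:", pop_variance, "std:", pop_std)
print("sample variance:", round(sample_variance, 3))
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the data.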
is the conversion of data into a visual or tabular format so that the characteristics of the data and the relationships among data items or attributes can be analyzed or reported.
Visualization
12 VISUALIZATION TECHNIQUES / METHODS
- Representation
- Arrangement
- Selection
- Histogram
- Box Plots
- Two Dimensional Histograms
- Scatter Plots
- Contour Plots
- Matrix Plots
- Parallel Coordinates
- Star Plot
- Chernoff Faces
is the mapping of information to a visual format.
Representation
is the placement of visual elements within a display.
Arrangement
is the elimination or the deemphasis of certain objects and attributes.
Selection
usually shows the distribution of values of a single variable.
Histogram
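A histogram rests on counting how many values fall into each bin. The sketch below shows that counting step with equal-width bins; `histogram_counts` is a hypothetical helper written for this example, and the data are invented.

```python
def histogram_counts(values, num_bins):
    """Count values in num_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    if width == 0:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical single-variable data (illustrative only).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts = histogram_counts(data, 4)

# Crude text rendering: one '#' per observation in each bin.
for i, c in enumerate(counts):
    print(f"bin {i}: {'#' * c}")
```

The shape of the result depends on the number of bins chosen, which is why the same data can look different under different binnings.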
are a simplified version of a PDF (probability density function) or histogram, summarizing a distribution by its percentiles.
Box Plots
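The values a box plot typically draws can be sketched as a five-number summary. This assumes the common convention of minimum, lower quartile, median, upper quartile, and maximum; quartile definitions vary between tools, and the data are invented.

```python
import statistics

# Hypothetical data (illustrative only).
data = [7, 1, 11, 4, 9, 2, 6, 3, 10, 5, 8]

# statistics.quantiles with n=4 returns the three quartile cut points
# (Python 3.8+, default "exclusive" method).
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = {
    "min": min(data),
    "q1": q1,
    "median": q2,
    "q3": q3,
    "max": max(data),
}
iqr = q3 - q1  # interquartile range: the height of the box

print(five_number_summary)
print("IQR:", iqr)
```

The box spans the lower to the upper quartile, with a line at the median, which is why a box plot conveys the rough shape of a distribution in far fewer marks than a histogram.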
shows the joint distribution of the values of two attributes.
Two Dimensional Histograms
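A two-dimensional histogram counts how often pairs of values fall into each cell of a grid. The sketch below assumes fixed, known ranges for both attributes; `bin_index`, the ranges, and the data are all hypothetical choices for this example.

```python
from collections import Counter


def bin_index(value, lo, width, num_bins):
    """Map a value to an equal-width bin index, clamping the top edge."""
    return min(int((value - lo) / width), num_bins - 1)


# Hypothetical paired attribute values (illustrative only).
xs = [1.0, 1.5, 2.5, 3.5, 3.9, 0.2, 3.0]
ys = [10, 12, 25, 38, 39, 11, 5]

# Assumed ranges: x in [0, 4], y in [0, 40], two bins per attribute.
joint = Counter(
    (bin_index(x, 0.0, 2.0, 2), bin_index(y, 0.0, 20.0, 2))
    for x, y in zip(xs, ys)
)

# Each key is an (x_bin, y_bin) cell; each value is the joint count.
for cell in sorted(joint):
    print(cell, joint[cell])
```

Cells along the diagonal hold most of the counts here, which is how a two-dimensional histogram hints that the two attributes tend to rise together.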
attribute values determine the position of each point.
Scatter Plots
useful when a continuous attribute is measured on a spatial grid. They partition the plane into regions of similar values.
Contour Plots
can plot a data matrix.
Matrix Plots
used to plot the attribute values of high-dimensional data.
Parallel Coordinates
similar approach to parallel coordinates, but the axes radiate from a central point.
Star Plot
an approach that associates each attribute with a characteristic of a face.
Chernoff Faces