Organizing, Describing and Visualizing Data Flashcards

1
Q

Values that can be counted or measured are called _____ data.

A

Numerical (or Quantitative)

2
Q

Discrete and Continuous data are types of _____ data.

A

Numerical (or Quantitative)

3
Q

Data that is countable, such as the months, days, or hours in a year, is called _____ data.

A

Discrete

4
Q

Data that can take any fractional value is called _____ data.

A

Continuous

5
Q

Data that consist of labels that can be used to classify a set of data into groups is called _____ data.

A

Categorical (or Qualitative)

6
Q

Nominal and ordinal data are types of _____ data.

A

Categorical (or Qualitative)

7
Q

Labels that cannot be placed in a logical order are called _____ data.

A

Nominal

8
Q

Data that can be ranked in a logical order is called _____ data.

A

Ordinal

9
Q

A set of observations taken periodically, most often at equal intervals over time, is called a _____.

A

Time series

10
Q

A set of comparable observations all taken at one specific point in time is called _____.

A

Cross-sectional data

11
Q

The combination of time series and cross-sectional data, often presented in tables, is called _____.

A

Panel data

12
Q

Time series, cross-sectional, and panel data, organized in a defined way, are examples of _____ data.

A

Structured data (ex: market data, fundamental data, etc.)

13
Q

Information that is presented in a form with no defined structure is referred to as _____ data.

A

Unstructured data (ex: management commentaries; must be transformed into structured data before it can be analyzed)

14
Q

A time series is an example of a _____ array.

A

One-dimensional

15
Q

For any frequency distribution, the interval with the greatest frequency is referred to as the _____ interval.

A

Modal

16
Q

The _____ frequency is the percentage of total observations falling within each interval.

A

Relative

17
Q

The _____ frequency is the number of observations falling within an interval.

A

Absolute
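
The modal interval, relative frequency, and absolute frequency from cards 15 to 17 can be seen together in one small frequency table. This is only a sketch: the function name `frequency_table`, the interval edges, and the return data are all hypothetical, and relative frequency is kept as a fraction (multiply by 100 for a percentage).

```python
from collections import Counter

def frequency_table(values, edges):
    """Build rows of (interval, absolute frequency, relative frequency).

    Intervals are [edges[i], edges[i+1]); the last interval also includes
    its upper edge. This binning convention is an assumption.
    """
    counts = Counter()
    last = len(edges) - 2
    for v in values:
        for i in range(len(edges) - 1):
            lo, hi = edges[i], edges[i + 1]
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    n = sum(counts.values())
    return [
        ((edges[i], edges[i + 1]), counts[i], counts[i] / n)
        for i in range(len(edges) - 1)
    ]

# Hypothetical returns, in percent
returns = [-4.2, -1.5, 0.3, 0.8, 1.1, 1.9, 2.4, 3.7, 4.0, 6.5]
table = frequency_table(returns, edges=[-5, 0, 5, 10])

# The modal interval is the one with the greatest absolute frequency
modal_interval = max(table, key=lambda row: row[1])[0]

for interval, absolute, relative in table:
    print(interval, absolute, f"{relative:.0%}")
print("Modal interval:", modal_interval)
```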

18
Q

A _____ is a two-dimensional array with which we can analyze two variables at the same time.

A

Contingency table (ex: Accidents by intersection and day of week)

19
Q

One kind of contingency table is a 2-by-2 array called a _____.

A

Confusion matrix
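
As an illustration, a confusion matrix can be built by cross-tabulating actual outcomes against predicted outcomes. The labels below are made up for the example (1 = event occurred, 0 = it did not), and the row/column layout is one common convention, not the only one.

```python
from collections import Counter

# Hypothetical actual and predicted class labels
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each (actual, predicted) pair: this is a 2-by-2 contingency table
counts = Counter(zip(actual, predicted))

true_positive  = counts[(1, 1)]
false_negative = counts[(1, 0)]
false_positive = counts[(0, 1)]
true_negative  = counts[(0, 0)]

# Rows: actual 1, actual 0. Columns: predicted 1, predicted 0.
confusion_matrix = [
    [true_positive, false_negative],
    [false_positive, true_negative],
]
print(confusion_matrix)
```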

20
Q

To analyze three variables at the same time, an analyst can create a _____.

A

Scatter plot matrix

21
Q

The most effective chart types for visualizing RELATIONSHIPS are _____.

A

Scatter plots, scatter plot matrices, and heat maps

22
Q

The most effective chart types for COMPARING CATEGORIES are _____.

A

Bar charts, tree maps, and heat maps

23
Q

The most effective chart types for COMPARING OVER TIME are _____.

A

Line charts, dual-scale line charts, and bubble line charts

24
Q

The most effective chart types for visualizing DISTRIBUTIONS of NUMERICAL DATA are _____.

A

Histograms, frequency polygons, and cumulative distribution charts

25
The most effective chart types for visualizing DISTRIBUTIONS of CATEGORICAL DATA are _____.
Bar charts, tree maps, and heat maps
26
The mean that excludes a stated percentage of the most extreme observations (ex: discard the lowest 0.5% and the highest 0.5% of the observations) is called the _____ mean.
Trimmed
27
The mean that substitutes a value for the highest and lowest observations is called the _____ mean.
Winsorized
28
The trimmed and winsorized means are used to control for _____.
Outliers
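
A rough sketch of the difference between the two: the trimmed mean drops the extreme observations, while the winsorized mean replaces them with the nearest remaining value. The function names and data are hypothetical, and `pct` is assumed to be the fraction removed or replaced in each tail (and less than 0.5).

```python
def trimmed_mean(values, pct):
    """Average after discarding the lowest and highest pct of observations."""
    data = sorted(values)
    k = int(len(data) * pct)          # observations affected in each tail
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)

def winsorized_mean(values, pct):
    """Average after replacing each tail with the nearest retained value."""
    data = sorted(values)
    k = int(len(data) * pct)
    if k:
        low, high = data[k], data[-k - 1]
        data = [low] * k + data[k:len(data) - k] + [high] * k
    return sum(data) / len(data)

# Hypothetical data with one extreme value in each tail
data = [-50, 1, 2, 3, 4, 5, 6, 7, 9, 90]

print(sum(data) / len(data))        # plain arithmetic mean, pulled by outliers
print(trimmed_mean(data, 0.10))     # drops -50 and 90
print(winsorized_mean(data, 0.10))  # replaces -50 with 1 and 90 with 9
```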
29
The midpoint of a data set when the data is arranged in ascending or descending order is called the _____.
Median
30
The value that occurs most frequently in a data set is called the _____.
Mode
31
The mean to use for estimating the next observation, or the expected value of a distribution, is the _____ mean.
Arithmetic
32
The mean to use for finding the compound rate of return over multiple periods is the _____ mean.
Geometric
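
For example, the geometric mean return can be sketched as the n-th root of the compounded growth factors, minus one. The helper name `geometric_mean_return` and the period returns are hypothetical.

```python
import math

def geometric_mean_return(returns):
    """Compound rate of return per period from a list of period returns."""
    growth = math.prod(1 + r for r in returns)   # total growth factor
    return growth ** (1 / len(returns)) - 1

# Hypothetical returns for three periods
returns = [0.10, -0.05, 0.20]

geometric = geometric_mean_return(returns)
arithmetic = sum(returns) / len(returns)

# Compounding at the geometric mean reproduces the actual ending value
print(f"geometric: {geometric:.4%}, arithmetic: {arithmetic:.4%}")
```

When returns vary from period to period, the geometric mean comes out below the arithmetic mean.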
33
The mean to use for estimating the mean without the effects of a given percentage of outliers is the _____ mean.
Trimmed
34
The mean to use for estimating the mean while decreasing the effects of a given percentage of outliers is the _____ mean.
Winsorized
35
The mean to use to calculate the average share cost from periodic purchases in a fixed dollar amount is the _____ mean.
Harmonic
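
A quick check of why the harmonic mean applies here: investing the same dollar amount at each price buys more shares when the price is low, so the average cost per share equals the harmonic mean of the prices. The prices and the $1,000 amount are made-up illustration values.

```python
import statistics

# Hypothetical share prices at three purchase dates
prices = [8.0, 10.0, 12.5]
amount_per_purchase = 1000.0

# Direct calculation: total dollars spent divided by total shares bought
shares_bought = sum(amount_per_purchase / p for p in prices)
average_cost = amount_per_purchase * len(prices) / shares_bought

# Same figure from the harmonic mean of the prices
harmonic = statistics.harmonic_mean(prices)

print(round(average_cost, 4), round(harmonic, 4))
```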
36
The difference between the third quartile (75th percentile) and the first quartile (25th percentile) is known as the _____.
Interquartile range
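
A minimal sketch of the calculation using Python's standard library. Quartile conventions differ between textbooks and software; the `"inclusive"` interpolation method used here is an assumption, and the data are hypothetical.

```python
import statistics

# Hypothetical data set
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

These are the same values a box and whisker plot draws: the box spans Q1 to Q3, with the median marked inside it.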
37
To visualize a data set based on quantiles, we can create a _____ plot.
Box and whisker
38
The _____ is the distance between the largest and the smallest value in the data set.
Range
39
The average of the absolute values of the deviations of individual observations from the arithmetic mean is called the _____.
Mean absolute deviation (MAD)
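
As a small worked example (the function name and data are hypothetical):

```python
def mean_absolute_deviation(values):
    """Average absolute deviation of each observation from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [2, 4, 6, 8, 10]   # mean is 6; absolute deviations are 4, 2, 0, 2, 4
mad = mean_absolute_deviation(data)
print(mad)
```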
40
The coefficient of variation (CV) is computed as the _____ of X divided by the _____ of X.
Standard deviation, Average value
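
A short sketch of the ratio, assuming the sample standard deviation (n − 1 in the denominator) and hypothetical data:

```python
import math
import statistics

data = [2, 4, 6, 8, 10]   # hypothetical observations

# CV = standard deviation of X / mean of X (a unit-free measure of dispersion)
cv = statistics.stdev(data) / statistics.mean(data)
print(round(cv, 4))
```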
41
One measure of downside risk that involves choosing a target value against which to measure each outcome, and including only the deviations that fall below that target, is called _____.
Target downside deviation (or Target semideviation)
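
One way to sketch this: square only the deviations of observations at or below the target, sum them, divide by n − 1, and take the square root. The n − 1 divisor (with n the full sample size) is an assumption that follows a common textbook convention; the function name, returns, and target are hypothetical.

```python
import math

def target_downside_deviation(values, target):
    """Square root of the sum of squared shortfalls below target over (n - 1)."""
    shortfalls = [(x - target) ** 2 for x in values if x <= target]
    return math.sqrt(sum(shortfalls) / (len(values) - 1))

# Hypothetical period returns and a target return of 0%
returns = [0.05, -0.02, 0.01, -0.04, 0.03]
tdd = target_downside_deviation(returns, target=0.0)
print(round(tdd, 5))
```

Returns above the target contribute nothing, so the measure ignores upside variability.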
42
_____ refers to the extent to which a distribution is not symmetrical.
Skewness (or Skew)
43
For a _____ distribution, the mean, median, and mode are equal.
Symmetrical
44
For a positively skewed, unimodal distribution, the _____ is less than the _____, which is less than the _____.
Mode, median, mean
45
Among median, mean, and mode, the _____ is the most affected by skewness.
Mean
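
The ordering for a positively skewed distribution can be checked on a small made-up data set with a long right tail. The skewness figure below uses the simple moment ratio (third central moment over the cubed population standard deviation), which is an assumption; sample skewness formulas with small-sample adjustments differ slightly.

```python
import statistics

# Hypothetical right-skewed data: a few large values stretch the upper tail
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode = statistics.mode(data)
median = statistics.median(data)
mean = statistics.mean(data)

# Moment-based skewness: positive when the right tail is longer
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
skewness = m3 / m2 ** 1.5

print(mode, median, mean, round(skewness, 3))
```

The two largest observations pull the mean well above the median, while the mode is unaffected.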
46
_____ is a measure of the degree to which a distribution is more or less peaked than a normal distribution.
Kurtosis
47
LEPTOKURTIC describes a distribution that is _____ peaked than a normal distribution, whereas PLATYKURTIC refers to a distribution that is _____ peaked than a normal distribution. (Fill the blanks with "more" or "less")
More, less
48
A LEPTOKURTIC return distribution will have _____ returns clustered around the mean and _____ returns with large deviations from the mean. (Fill the blanks with "more" or "less")
More, more
49
A distribution is said to exhibit _____ if it has either more or less kurtosis than the normal distribution.
Excess kurtosis
50
Excess kurtosis = Sample kurtosis − X. Find "X".
3
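
A minimal sketch of the subtraction, using the simple moment ratio (fourth central moment over the squared variance) as the kurtosis figure. That choice is an assumption; sample kurtosis formulas with small-sample adjustments give slightly different numbers. Both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based kurtosis minus 3 (the kurtosis of a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                            # evenly spread values
fat_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]   # clustered, with extreme values

print(round(excess_kurtosis(flat), 3))        # negative: platykurtic
print(round(excess_kurtosis(fat_tailed), 3))  # positive: leptokurtic
```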
51
_____ is a measure of HOW two variables move together.
Covariance
52
_____ measures the STRENGTH of the linear relationship between two random variables.
Correlation
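
The link between the two measures can be sketched as follows: correlation is covariance divided by the product of the two standard deviations, which scales it to lie between −1 and +1. The sample (n − 1) versions are assumed, and the function names and data are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Average product of paired deviations from the means, with n - 1 divisor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance scaled by both standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))   # cov(X, X) is the variance
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
corr_xy = correlation(x, y)
print(cov_xy, round(corr_xy, 4))
```

Covariance depends on the units of both variables, so its size is hard to interpret on its own; correlation removes the units.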
53
_____ correlation refers to correlation that is either the result of chance or due to changes in both variables over time that are caused by their association with a third variable.
Spurious