Organizing, Visualizing, and Describing Data Flashcards

1
Q

Data that is measured or counted

A

Numerical

2
Q

2 types of numerical data

A

Continuous and discrete

3
Q

Data that can be measured and can take on any value in a range of values

A

Continuous numerical data - FV of an investment

4
Q

Numerical data that result from a counting process

A

Discrete numerical data - the frequency of discrete compounding

5
Q

2 types of data

A

Numerical and Categorical

6
Q

Data that describe a characteristic or quality

A

categorical data

7
Q

Other names for Numerical and Categorical data

A

Quantitative and Qualitative data

8
Q

Categorical data not amenable to a logical order

A

nominal - stock sectors

9
Q

Categorical data able to be logically ordered

A

ordinal data - ratings for investment funds

10
Q

science of dealing with collection, analysis, interpretation, and presentation of numerical data

A

statistics

11
Q
  1. the study of how large datasets can be effectively summarized
  2. studies of central tendency and variation of data
A

descriptive statistics

12
Q

making extrapolations, estimates, forecasts about a large group from a smaller group

A

statistical inference

13
Q

the complete group (objects, persons, items of interest) being studied

A

population

14
Q

a portion of the group being studied

A

sample

15
Q

parameter vs statistic

A

a descriptive measure of a population vs a sample, respectively (p&s)

16
Q

even distance between (consecutive) numbers

comment on zero

A

interval
zero is arbitrary

17
Q
A
18
Q

multiple data units at a given time

A

cross-sectional data

19
Q

one data unit observed across multiple time periods

A

time-series data

20
Q

data that is patterned vs unpatterned

A

structured vs unstructured data

21
Q

examples of structured data

A

market data - stock prices

fundamental data - financial statement data

analytics - cash flows

22
Q

examples of unstructured data

A

produced by individuals - social media, posts, web searches

23
Q

rank measures from more useful to less useful (interval, nominal, ordinal, ratio)

A

ratio, interval, ordinal, nominal

24
Q

format for representing one variable

A

one-dimensional array

25
format for representing more than one variable via rows/columns
two-dimensional array
26
What is another name for a two-dimensional array?
data table
27
another name for a frequency distribution
one-way table
28
tool for summarizing data into groups or bins for display
frequency distribution
29
GICS stands for
Global Industry Classification Standard
30
real or actual frequency
absolute frequency
31
frequency as a percentage of the total number of observations
relative frequency
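
A minimal sketch of absolute versus relative frequency, using only the Python standard library. The sector labels are made-up data for illustration, not taken from the cards.

```python
from collections import Counter

# Hypothetical sector labels for ten stocks (invented for illustration).
sectors = ["Tech", "Energy", "Tech", "Health", "Tech",
           "Energy", "Health", "Tech", "Energy", "Tech"]

absolute = Counter(sectors)          # absolute frequency: raw count per category
n = len(sectors)                     # total number of observations
relative = {name: count / n for name, count in absolute.items()}

for name in sorted(absolute):
    print(f"{name:<7} absolute={absolute[name]}  relative={relative[name]:.0%}")
```

The relative frequencies always sum to 1 (100%), which is a quick sanity check on any frequency table.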
32
interval data with a true (absolute) zero point
ratio
33
raw data or non-summarized data
ungrouped data
34
data in a frequency distribution
grouped data
35
depiction of frequency distribution
histogram
36
also known as a 2 way table
contingency table
37
displays 2 or more categorical variables
contingency table
38
frequency at the intersection of a particular row and column
joint frequency
39
sum of the joint frequencies across a row or down a column
marginal frequency
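
A small sketch of joint and marginal frequencies in a contingency table. The (sector, rating) pairs are hypothetical, and storing the table in `Counter` objects is just one convenient choice.

```python
from collections import Counter

# Hypothetical (sector, rating) observations, invented for illustration.
observations = [("Tech", "Buy"), ("Tech", "Hold"), ("Tech", "Buy"),
                ("Energy", "Hold"), ("Energy", "Sell"), ("Energy", "Hold")]

joint = Counter(observations)                               # one count per (row, column) cell
row_marginal = Counter(sector for sector, _ in observations)  # row totals
col_marginal = Counter(rating for _, rating in observations)  # column totals

print("joint (Tech, Buy):", joint[("Tech", "Buy")])
print("marginal Tech:", row_marginal["Tech"])
print("marginal Hold:", col_marginal["Hold"])
```

Each marginal frequency equals the sum of the joint frequencies in its row or column.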
40
2x2 contingency table in matrix form comparing actual classes with predicted classes (true/false positives and negatives)
confusion matrix
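
A sketch of how the four cells of a 2x2 confusion matrix can be tallied. The labels are hypothetical (1 = positive class, 0 = negative class), and the row/column layout shown is one common convention, not the only one.

```python
# Hypothetical actual and predicted class labels (invented for illustration).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(a == 1 and p == 1 for a, p in pairs)  # true positives
fn = sum(a == 1 and p == 0 for a, p in pairs)  # false negatives
fp = sum(a == 0 and p == 1 for a, p in pairs)  # false positives
tn = sum(a == 0 and p == 0 for a, p in pairs)  # true negatives

# Rows = actual class, columns = predicted class (assumed layout).
confusion = [[tp, fn],
             [fp, tn]]
print(confusion)
```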
41
line graph that connects the midpoints of the histogram's bins, plotted against each bin's frequency
frequency polygon
42
frequency polygon with cumulative frequencies
ogive
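
The cumulative frequencies behind an ogive are just running totals of the bin counts. A minimal sketch, assuming hypothetical bin counts:

```python
from itertools import accumulate

bin_counts = [2, 5, 8, 4, 1]                # hypothetical absolute frequency per bin

cumulative = list(accumulate(bin_counts))   # running total across bins
total = cumulative[-1]                      # last running total = number of observations
cumulative_relative = [c / total for c in cumulative]

print(cumulative)
print(cumulative_relative)
```

Plotting `cumulative` (or `cumulative_relative`) against the upper limit of each bin gives the ogive.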
43
circular depiction of data as a percent
pie chart
44
steps of creating a frequency distribution
  1. sort the data into ascending order
  2. calculate the range (maximum - minimum)
  3. choose the number of bins (k)
  4. calculate the bin width = range / k
  5. place each observation in its bin
  6. construct a table of the bins, listed from smallest to largest
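
The steps can be sketched in a few lines of Python. The return figures and the choice of k = 4 are assumptions made for illustration; putting the maximum value into the last bin is one common way to handle the upper edge.

```python
# Hypothetical annual returns in percent (invented for illustration).
returns = [5, -1, 2, 8, 0, 3, -4, 2, 4, 1]

data = sorted(returns)               # 1. ascending order
data_range = data[-1] - data[0]      # 2. range = max - min
k = 4                                # 3. number of bins (assumed choice)
width = data_range / k               # 4. bin width = range / k

# 5. place each observation in a bin; the maximum goes in the last bin
counts = [0] * k
for x in data:
    index = min(int((x - data[0]) / width), k - 1)
    counts[index] += 1

# 6. table of bins from smallest to largest
for i, count in enumerate(counts):
    lower = data[0] + i * width
    print(f"{lower:5.1f} to {lower + width:5.1f}: {count}")
```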
45
test of association between 2 categorical variables
chi-square test
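
A hand-rolled sketch of the chi-square statistic for a contingency table, where each expected count under independence is (row total x column total) / n. The observed counts are hypothetical.

```python
# Hypothetical 2x2 table of observed counts (invented for illustration).
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n   # count expected under independence
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(chi_square, degrees_of_freedom)
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom (about 3.841 at the 5% level with 1 degree of freedom).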
46
arranges data by left digit and right digit to present data concentrations
stem and leaf plot
47
used in quality control to tally qualitative issues
pareto chart
48
2-variable numeric chart used to show correlation
scatter plot
49
measures of where data tends to cluster
measures of central tendency
50
mathematical average influenced by outliers
mean
51
middle value in an array, not affected by the magnitude of extreme values
median
52
most frequent value (a dataset may have 2 or more such values)
mode
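
A short sketch comparing the three measures with Python's `statistics` module. The data are hypothetical and include one outlier (50) to show how it pulls the mean but not the median.

```python
import statistics

data = [2, 3, 3, 5, 7, 7, 50]            # hypothetical values with one outlier

mean = statistics.mean(data)             # pulled upward by the outlier
median = statistics.median(data)         # middle value of the sorted data
modes = statistics.multimode(data)       # every value tied for most frequent

print(mean, median, modes)
```

Here the data are bimodal, so `multimode` returns two values.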
53
central tendency measurement commonly used with ordinal data
median
54
central tendency measurement commonly used with interval/ratio data
mean
55
central tendency measurement commonly used with nominal data
mode
56
measures of how spread out data is
measures of dispersion
57
average of the absolute values of the differences between each observation and the sample mean
mean absolute deviation (MAD)
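
A minimal sketch of the MAD calculation, assuming the usual definition in which the summed absolute deviations are divided by the number of observations n. The data are hypothetical.

```python
data = [2, 4, 6, 8, 10]                  # hypothetical observations

mean = sum(data) / len(data)
# Average distance of the observations from the mean, ignoring sign.
mad = sum(abs(x - mean) for x in data) / len(data)

print(mean, mad)
```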
58
sum of the squared differences between each observation and the mean, divided by n - 1 for a sample (or by N for a population)
variance
59
measures variability of the dataset
variance
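
A sketch of sample versus population variance using the standard `statistics` module, where `variance` divides by n - 1 and `pvariance` divides by N. The data are hypothetical.

```python
import statistics

data = [2, 4, 6, 8, 10]                       # hypothetical observations

sample_var = statistics.variance(data)        # squared deviations / (n - 1)
population_var = statistics.pvariance(data)   # squared deviations / N
sample_sd = statistics.stdev(data)            # square root of the sample variance

print(sample_var, population_var, sample_sd)
```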
60
standard deviation expressed as a percentage of the mean
coefficient of variation
61
if data are roughly normally distributed, then about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean
empirical rule
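
The percentages in the empirical rule can be checked against the standard normal distribution. A minimal sketch using `statistics.NormalDist` (available in Python 3.8 and later):

```python
from statistics import NormalDist

z = NormalDist()    # standard normal: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.2%}")
```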
62
describes the degree to which a distribution is asymmetric (not symmetric around its mean)
skewness
63
describes relationship of tails of a distribution to its center
kurtosis
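
A rough sketch of skewness and excess kurtosis built from central moments. This uses the simple moment formulas without the small-sample adjustments that some textbooks apply, so treat it as an illustration of the idea rather than the exact formula from any curriculum. The helper names are made up for this example.

```python
def central_moment(xs, order):
    """Average of (x - mean) raised to the given power."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)

def skewness(xs):
    # Third central moment scaled by the standard deviation cubed.
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    # Fourth central moment scaled by variance squared, minus 3 (the normal benchmark).
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]        # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]    # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

Positive excess kurtosis indicates fatter tails than a normal distribution; negative excess kurtosis indicates thinner tails.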
64
describe the mean and median in a normal distribution
they are equal
65
tall and skinny distribution
leptokurtic
66
wide and flat distribution
platykurtic
67
range between the first and third quartiles (the middle 50% of the distribution)
interquartile range
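
A sketch of the interquartile range using `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the exact cut points here reflect Python's default ("exclusive") method, and the data are hypothetical.

```python
import statistics

data = [3, 5, 7, 8, 9, 11, 13, 15, 18, 21, 40]   # hypothetical observations

q1, q2, q3 = statistics.quantiles(data, n=4)     # three cut points = quartiles
iqr = q3 - q1                                    # spread of the middle 50%

print(q1, q2, q3, iqr)
```

The outlier (40) has no effect on the IQR, which is why it is often preferred to the plain range.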
68
what is coefficient of variation (CV) used for?
to compare datasets with different scales
69
what is population vs sample coefficient of variation
population CV = σ / μ; sample CV = s / x̄
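
A sketch showing why CV is useful for comparing datasets on different scales. The two hypothetical series below have the same pattern, with one at ten times the scale of the other, so their standard deviations differ but their CVs match. The function name `sample_cv` is made up for this example.

```python
import statistics

fund_a = [4, 6, 8, 10, 12]          # hypothetical values
fund_b = [40, 60, 80, 100, 120]     # same pattern at 10x the scale

def sample_cv(xs):
    """Sample CV = s / x-bar (sample standard deviation over sample mean)."""
    return statistics.stdev(xs) / statistics.mean(xs)

print(statistics.stdev(fund_a), statistics.stdev(fund_b))   # very different
print(sample_cv(fund_a), sample_cv(fund_b))                 # the same
```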
70