Organizing, Visualizing Data Flashcards

1
Q

Numerical Data (quantitative data)

A

▪️ Continuous data: can take on any numerical value in a specified range of values.

▪️ Discrete data: numerical values that result from a counting process, so the data are limited to a finite number of values.

2
Q

Categorical Data (qualitative data)

A

▪️ Nominal data: Categorical values that are not amenable to being organized in a logical order.

▪️ Ordinal data: Categorical values that can be logically ordered or ranked.

3
Q

Structured Data

A

are highly organized in a pre-defined manner, usually with repeating patterns.
For example:

▪️ Daily closing stock price
▪️ EPS, P/E, dividend yield, ROE

4
Q

Unstructured Data

A

are data that do not follow any conventionally organized form.
For example:

▪️ Text, social media post
▪️ Corporate regulatory filings

5
Q

One-dimensional array

A

is the simplest format for representing a collection of data of the same data type, so it is suitable for representing a single variable.

6
Q

Two-dimensional rectangular array

A

also called a “data table”; similar in structure to an Excel spreadsheet.

7
Q

Tree-map

A

It consists of a set of colored rectangles to represent distinct groups, and the area of each rectangle is proportional to the value of the corresponding group.

8
Q

Heat-map

A

A type of graphic that organizes and summarizes data in a tabular format and represents them using a color spectrum. It is often used to display frequency distributions and to visualize relationships, such as the degree of correlation, among variables.

9
Q

Trimmed mean

A

the mean computed after excluding a stated small percentage of the lowest and highest values.
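
The definition above can be sketched in a few lines of Python. This is only an illustration under one assumed convention: the stated percentage is the total share of observations dropped, split equally between the two tails. The function name `trimmed_mean` and the sample values are hypothetical.

```python
def trimmed_mean(values, trim_pct):
    """Mean after dropping trim_pct percent of the observations in total,
    half from the low end and half from the high end (assumed convention)."""
    data = sorted(values)
    n = len(data)
    k = int(n * trim_pct / 100 / 2)  # observations dropped from each tail
    kept = data[k:n - k]
    return sum(kept) / len(kept)

# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
plain = sum(sample) / len(sample)
trimmed = trimmed_mean(sample, 20)  # drops the lowest and the highest value
```

With the extreme value removed, the trimmed mean sits much closer to the bulk of the data than the plain arithmetic mean does.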

10
Q

Winsorized mean

A

computed by setting a stated percentage of the lowest values equal to one specified low value, and a stated percentage of the highest values equal to one specified high value, and then averaging all the values.
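
A minimal sketch of this idea, assuming the stated percentage is split equally between the tails and that the "specified" low and high values are simply the nearest observations that are kept. The name `winsorized_mean` and the sample are hypothetical; other conventions (for example, using interpolated percentiles as the replacement values) are equally possible.

```python
def winsorized_mean(values, pct):
    """Mean after replacing the lowest and highest pct/2 percent of the
    observations with the nearest retained values (assumed convention)."""
    data = sorted(values)
    n = len(data)
    k = int(n * pct / 100 / 2)       # observations replaced in each tail
    low, high = data[k], data[n - k - 1]
    clipped = [min(max(x, low), high) for x in data]
    return sum(clipped) / n

# Hypothetical sample: 1 is pulled up to 5 and 50 is pulled down to 12.
sample = [1, 5, 6, 7, 8, 9, 10, 11, 12, 50]
result = winsorized_mean(sample, 20)
```

Unlike trimming, no observation is discarded: the sample size stays the same and only the extreme values are replaced.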

11
Q

Mode

A

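The most frequently occurring value in a distribution.

▪️ unimodal: one mode
▪️ bimodal: two modes
▪️ trimodal: three modes

With continuous data, individual values rarely repeat, so a mode may not exist. When such data are grouped into bins, however, we often find an interval with the highest frequency (the modal interval).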
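
As a rough illustration, the sketch below finds every mode of a small sample and the modal interval of binned data. The helper names `modes` and `modal_interval`, the fixed bin width, and the sample values are all assumptions made for the example.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def modal_interval(values, bin_width):
    """Group values into bins [k*w, (k+1)*w) and return the fullest bin.
    If several bins tie, the first one encountered is returned."""
    bins = Counter(int(x // bin_width) for x in values)
    index, _ = max(bins.items(), key=lambda kv: kv[1])
    return (index * bin_width, (index + 1) * bin_width)

bimodal = modes([1, 2, 2, 3, 3, 4])                           # two modes
interval = modal_interval([0.5, 1.2, 1.7, 1.9, 2.4, 3.1], 1)  # no value repeats
```

The second call shows why binning helps: none of the continuous values repeats, yet one interval clearly holds the most observations.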

12
Q

Percentile

A

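▪️ quartiles: divide the distribution into 4 parts
▪️ quintiles: 5 parts
▪️ deciles: 10 parts
▪️ percentiles: 100 parts

L_y = (n + 1) × y / 100
where L_y is the location (position) of the yth percentile in the data sorted in ascending order, n is the number of observations, and y is the percentage point.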
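
The location formula can be turned into a small function. The sketch below assumes that a non-integer location is resolved by linear interpolation between the two neighboring observations, and that locations falling outside the data are clamped to the smallest or largest value; the name `percentile` and the sample are hypothetical.

```python
def percentile(values, y):
    """y-th percentile using the location L = (n + 1) * y / 100,
    with linear interpolation between adjacent sorted observations."""
    data = sorted(values)
    n = len(data)
    loc = (n + 1) * y / 100      # 1-based position in the sorted data
    lower = int(loc)             # whole part of the position
    frac = loc - lower           # fractional part used for interpolation
    if lower < 1:                # position before the first observation
        return data[0]
    if lower >= n:               # position at or beyond the last observation
        return data[-1]
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])

sample = [10, 20, 30, 40, 50, 60, 70, 80, 90]
q1 = percentile(sample, 25)      # location 2.5: halfway between 20 and 30
median = percentile(sample, 50)  # location 5.0: the fifth observation
q3 = percentile(sample, 75)      # location 7.5: halfway between 70 and 80
```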

13
Q

Interquartile Range

A

IQR = Q3 - Q1

14
Q

Upper / Lower fence of the Box plot

A

▪️ Upper fence = Q3 + (1.5 × IQR)
▪️ Lower fence = Q1 - (1.5 × IQR)
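
A short sketch of how the fences might be computed and used to flag outliers. It relies on `statistics.quantiles` from the Python standard library (Python 3.8+), whose default "exclusive" method is based on (n + 1) positions; the helper name `box_plot_fences` and the sample data are assumptions.

```python
import statistics

def box_plot_fences(values):
    """Return (lower_fence, upper_fence) using Q1, Q3 and 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartiles
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Hypothetical sample with one unusually large observation.
sample = [10, 20, 30, 40, 50, 60, 70, 80, 200]
lower, upper = box_plot_fences(sample)
outliers = [x for x in sample if x < lower or x > upper]
```

Observations outside the two fences are the ones usually drawn as individual points on a box plot.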

15
Q

Coefficient of variation

A

CV = s / X̄
where s is the sample standard deviation and X̄ is the sample mean.
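
For illustration, the ratio can be computed with the standard library, assuming s is the sample standard deviation (n - 1 in the denominator) and the mean is not zero. The function name and the sample values are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical observations with a mean of 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(sample)
```

Because the result is a pure ratio, it allows dispersion to be compared across data sets measured in different units or with very different means.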

16
Q

Skewness

A

▪️ positively skewed (long right tail)
mode < median < mean

▪️ negatively skewed (long left tail)
mean < median < mode
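
The ordering can be checked on a small right-skewed sample. The sketch below is illustrative only: the ordering is typical for unimodal distributions rather than guaranteed, and `moment_skewness` is a hypothetical helper that uses the simple population moment ratio, not a bias-adjusted sample formula.

```python
import statistics

def moment_skewness(values):
    """Third central moment divided by the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical sample with a long right tail.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
mode = statistics.mode(sample)
median = statistics.median(sample)
mean = statistics.mean(sample)
skew = moment_skewness(sample)  # positive for a right-skewed sample
```

Mirroring the sample (multiplying every value by -1) reverses both the ordering and the sign of the skewness.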

17
Q

Kurtosis

A

▪️ leptokurtic (fat-tailed): kurtosis > 3, the value for a normal distribution
▪️ platykurtic (thin-tailed): kurtosis < 3
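
A rough sketch of the classification, assuming kurtosis is measured as the fourth central moment divided by the squared second central moment (the population form, without small-sample adjustments). The function names and both samples are hypothetical.

```python
def moment_kurtosis(values):
    """Fourth central moment divided by the square of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2

def classify(values):
    """Label a sample relative to the normal benchmark of 3."""
    k = moment_kurtosis(values)
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"

fat_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]   # a few extreme values
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # no extreme values
labels = (classify(fat_tailed), classify(evenly_spread))
```

Subtracting 3 from the result gives excess kurtosis, which is positive for fat-tailed and negative for thin-tailed samples.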