Vol. 1 LM2 Data Types Flashcards

1
Q

Concept

are values that represent measured or counted quantities as a number

p. 61

A

numerical data
OR
quantitative data

2
Q

Describe

numerical data

p. 61

A

are values that represent measured or counted quantities as a number

3
Q

Concept

are data that can be measured and can take on any numerical value in a specified range of values

A

continuous data

4
Q

Describe

continuous data

p. 61

A

are data that can be measured and can take on any numerical value in a specified range of values

5
Q

Concept

are numerical values that result from a counting process.

p. 61

A

discrete data

6
Q

Describe

discrete data

p. 61

A

are numerical values that result from a counting process.

7
Q

Concept

are categorical values that are not amenable to being organized in a logical order

p. 61

A

nominal data

8
Q

Describe

nominal data

p. 61

A

are categorical values that are not amenable to being organized in a logical order

9
Q

Concept

are categorical values that can be logically ordered or ranked

p. 62

A

ordinal data

10
Q

Identify data type

Cash dividends per share paid by a public company. Note that cash dividends are a distribution paid to shareholders based on the number of shares owned.

p. 63

A

Cash dividends per share are continuous data since they can take on any non-negative value.

11
Q

Identify data type

Credit ratings for corporate bond issues. As background, credit ratings gauge the bond issuer’s ability to meet the promised payments on the bond. Bond rating agencies typically assign bond issues to discrete categories that are in descending order of credit quality (i.e., increasing probability of non-payment or default).

p. 63

A

Credit ratings are ordinal data: the rating categories can be ranked by credit quality.

12
Q

Identify data type

Hedge fund classification types. Note that hedge funds are investment vehicles that are relatively unconstrained in their use of debt, derivatives, and long and short investment strategies. Hedge fund classification types group hedge funds by the kind of investment strategy they pursue.

p. 63

A

Hedge fund classification types are nominal data. Each type groups together hedge funds with similar investment strategies. In contrast to credit ratings for bonds, however, hedge fund classification schemes do not involve a ranking. Thus, such classification schemes are not ordinal data.

13
Q

Another data classification standard is based on how data are collected, and it categorizes data into three types

p. 63

A
  1. cross-sectional data
  2. time-series data
  3. panel data
14
Q

Concept

is a characteristic or quantity that can be measured, counted, or categorized and is subject to change.

p. 63

A

variable

15
Q

Describe

variable

p. 63

A

is a characteristic or quantity that can be measured, counted, or categorized and is subject to change

16
Q

Concept

are a sequence of observations for a single observational unit of a specific variable collected over time and at discrete and typically equally spaced intervals of time

p. 64

A

time-series data

17
Q

Describe

time-series data

p. 64

A

are a sequence of observations for a single observational unit of a specific variable collected over time and at discrete and typically equally spaced intervals of time

18
Q

Concept

are a list of observations of a specific variable from multiple observational units

p. 64

A

cross-sectional data

19
Q

Describe

cross-sectional data

p. 64

A

are a list of observations of a specific variable from multiple observational units

20
Q

Concept

  • are a mix of time-series and cross-sectional data that are frequently used in financial analysis and modeling.
  • These data consist of observations through time on one or more variables for multiple observational units

p. 64

A

panel data

21
Q

Concept

the observational data in this data type are usually organized in a matrix format called a data table

p. 64

A

panel data
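
Example (a minimal sketch, not from the textbook): the snippet below lays out a small data table and pulls a time series and a cross-section out of it. The company names, quarters, and EPS values are hypothetical.

```python
# Hypothetical quarterly EPS figures; company names and values are made up.
# Each outer key is a time period, each inner key an observational unit.
panel = {
    "Q1": {"Company A": 1.10, "Company B": 0.85, "Company C": 2.30},
    "Q2": {"Company A": 1.25, "Company B": 0.80, "Company C": 2.45},
    "Q3": {"Company A": 1.05, "Company B": 0.95, "Company C": 2.60},
}

# Time series: one observational unit, one variable, followed through time.
time_series = [panel[quarter]["Company A"] for quarter in panel]

# Cross-section: one variable, many observational units, one point in time.
cross_section = panel["Q2"]

print(time_series)
print(cross_section)
```

Reading across one unit gives time-series data, reading across one period gives cross-sectional data, and the whole table is the panel.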

22
Q

Concept

are highly organized in a pre-defined manner, usually with repeating patterns

p. 64

A

structured data

23
Q

Describe

structured data

p. 64

A

are highly organized in a pre-defined manner, usually with repeating patterns

24
Q

Concept

typical format of this type of data is a one-dimensional array or a two-dimensional table or matrix

p. 64

A

structured data

25
Q

Concept

are data that do not follow any conventionally organized forms, such as financial news or company filings.

p. 65

A

unstructured data

26
Q

Describe

unstructured data

p. 65

A
  • are data that do not follow any conventionally organized forms
  • DAGs are a format for dealing with unstructured data
  • JSON is a format for semi-structured data
27
Q
Which of the following is most likely to be structured data?

A. Social media posts where consumers are commenting on what they think of a company’s new product.
B. Daily closing prices during the past month for all companies listed on Japan’s Nikkei 225 stock index.
C. Audio and video of a CFO explaining her company’s latest earnings announcement to securities analysts.

p. 67

A

B. Daily closing prices represent structured time-series data

28
Q

Which of the following statements describing panel data is most accurate?

A. It is a sequence of observations for a single observational unit of a specific variable collected over time at discrete and equally spaced intervals.
B. It is a list of observations of a specific variable from multiple observational units at a given point in time.
C. It is a mix of time-series and cross-sectional data that are frequently used in financial analysis and modeling.

p. 67

A

C. It is a mix of time-series and cross-sectional data

29
Q

Which of the following data series is least likely to be sortable by values?

A. Daily trading volumes for stocks listed on the Shanghai Stock Exchange.
B. EPS for a given year for technology companies included in the S&P 500 Index.
C. Dates of first default on bond payments for a group of bankrupt European manufacturing companies.

A

C. Dates are ordinal data that can be sorted chronologically, but not by value

30
Q

Which of the following best describes a time series?

A. Daily stock prices of the XYZ stock over a 60-month period.
B. Returns on four-star rated Morningstar investment funds at the end of the most recent month.
C. Stock prices for all stocks in the FTSE100 on 31 December of the most recent calendar year.

p. 67

A

A. A time series is a sequence of observations of a specific variable collected over time (here, 60 months)

31
Q

Concept

data available in their original format, typically not suitable for humans or computers to process directly

p. 67

A

raw data

32
Q

Concept

the simplest format for representing a collection of data of the same data type, which is suitable for a single variable

p. 68

A

one-dimensional array
ex. vectors

33
Q

Concept

summarize the central tendency and spread (variation) of the data’s distribution

p. 68

A

descriptive statistics

34
Q

Describe

descriptive statistics

p. 68

A

summarize the central tendency and spread (variation) of the data’s distribution

35
Q

Concept

is a tabular display of data constructed either by counting the observations of a variable by distinct values or groups or by tallying the values

p. 71

A

frequency distribution

36
Q

steps

Constructing a frequency distribution of a categorical variable

p. 71

A
  1. count the number of observations for each unique value of the variable
  2. construct a table listing each unique value and the corresponding counts, and then sort the records
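
Example (a minimal sketch of the two steps above, assuming a hypothetical list of sector labels; the data and the choice to sort by descending count are illustrative, not from the textbook):

```python
from collections import Counter

# Hypothetical sample: sector labels for ten stocks (illustrative data only).
sectors = ["Tech", "Energy", "Tech", "Health", "Energy",
           "Tech", "Health", "Tech", "Utilities", "Energy"]

# Step 1: count the number of observations for each unique value.
counts = Counter(sectors)

# Step 2: build the table and sort the records, here by count (largest
# first) with ties broken alphabetically.
frequency_table = sorted(counts.items(), key=lambda row: (-row[1], row[0]))

for value, absolute_frequency in frequency_table:
    print(f"{value:<10} {absolute_frequency}")
```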
37
Q

Concept

the raw frequency that is the actual number of observations counted for each unique value

p. 71

A

absolute frequency

38
Q

Describe

absolute frequency

A

the raw frequency that is the actual number of observations counted for each unique value

39
Q

Concept

is calculated as the absolute frequency of each unique value of the variable divided by the total number of observations

A

relative frequency

40
Q

Describe

relative frequency

A

is calculated as the absolute frequency of each unique value of the variable divided by the total number of observations
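
Example (a minimal sketch, assuming a hypothetical list of credit ratings; it divides each absolute frequency by the total number of observations, as described above):

```python
from collections import Counter

# Hypothetical sample of bond ratings (illustrative data only).
ratings = ["AA", "A", "BBB", "A", "AA", "A", "BBB", "A"]

# Absolute frequency: raw count for each unique value.
absolute = Counter(ratings)
total = sum(absolute.values())

# Relative frequency: absolute frequency divided by the total count.
relative = {value: count / total for value, count in absolute.items()}

for value, share in relative.items():
    print(f"{value:<4} {absolute[value]:>2} {share:.2%}")
```

The relative frequencies always sum to 1, which makes a handy sanity check.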

41
Q

pitfalls

binning data and constructing intervals

p. 74

A
  • if we use too few bins, we will summarize too much and may lose pertinent characteristics
  • if we use too many bins, we may not summarize enough, and potentially introduce noise into the data
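
Example (a rough sketch of the trade-off, assuming equal-width bins over a hypothetical set of returns; `bin_counts` is an illustrative helper, not a function from the textbook):

```python
def bin_counts(values, k):
    """Count observations in k equal-width bins spanning min..max.

    Assumes the values are not all identical (otherwise the width is zero).
    """
    low, high = min(values), max(values)
    width = (high - low) / k
    counts = [0] * k
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - low) / width), k - 1)
        counts[index] += 1
    return counts

# Hypothetical monthly returns in percent (illustrative data only).
returns = [-4.1, -2.5, -1.8, -0.9, -0.3, 0.2, 0.4, 1.1, 1.6, 2.2, 3.0, 5.9]

few = bin_counts(returns, 2)    # very coarse: the shape of the data is hidden
many = bin_counts(returns, 12)  # very fine: mostly bins with 0 or 1 observation

print(few)
print(many)
```

With two bins almost all detail is lost; with twelve bins for twelve observations the table is nearly as long as the raw data and several bins are empty.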
42
Q

Concept

adds up the absolute frequencies as we move from the first bin to the last bin

p. 74

A

cumulative absolute frequency

43
Q

Describe

cumulative absolute frequency

p. 74

A
  • adds up the absolute frequencies as we move from the first bin to the last bin
  • for the last bin, the cumulative absolute frequency will equal the number of observations in the dataset
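
Example (a minimal sketch, assuming hypothetical absolute frequencies for five bins listed from first to last):

```python
from itertools import accumulate

# Hypothetical absolute frequencies for five bins, first bin to last bin.
absolute = [3, 7, 12, 6, 2]

# Running total of the absolute frequencies across the bins.
cumulative_absolute = list(accumulate(absolute))

print(cumulative_absolute)
```

The final entry equals the number of observations in the dataset, matching the second bullet above.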
44
Q

Concept

is a sequence of partial sums of the relative frequencies

p. 74

A

cumulative relative frequency

45
Q

Describe

cumulative relative frequency

A

is a sequence of partial sums of the relative frequencies
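
Example (a minimal sketch, assuming hypothetical absolute frequencies for five bins; the relative frequencies are computed first and then summed bin by bin):

```python
from itertools import accumulate

# Hypothetical absolute frequencies for five bins, first bin to last bin.
absolute = [3, 7, 12, 6, 2]
total = sum(absolute)

# Relative frequency of each bin.
relative = [count / total for count in absolute]

# Partial sums of the relative frequencies.
cumulative_relative = list(accumulate(relative))

for share in cumulative_relative:
    print(f"{share:.1%}")
```

Up to floating-point rounding, the last partial sum is 1 (100% of the observations).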

46
Q

Concept

is a tabular format that displays the frequency distributions of two or more categorical variables simultaneously and is used for finding patterns between the variables

p. 77

A

contingency table

47
Q

Concept

a contingency table for two categorical variables

p. 77

A

two-way table

48
Q

Concept

A contingency table having R levels of one variable in rows and C levels of the other variable in columns

p. 77

A

R x C table

49
Q

name the data representation

p. 78

A

5 x 3 contingency table

50
Q

Name the data type

p. 78

A

joint frequencies

51
Q

Name the data type

p. 78

A

marginal frequencies
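
Example (a minimal sketch of joint and marginal frequencies, assuming a hypothetical list of funds tagged by style and risk level; in a contingency table the joint frequencies are the cell counts and the marginal frequencies are the row and column totals):

```python
from collections import Counter

# Hypothetical observations: (fund style, risk level) for each fund.
funds = [
    ("Growth", "High"), ("Growth", "Low"), ("Value", "Low"), ("Value", "Low"),
    ("Growth", "High"), ("Value", "High"), ("Growth", "High"), ("Value", "Low"),
]

# Joint frequencies: count of each (style, risk) combination, i.e. each cell.
joint = Counter(funds)

# Marginal frequencies: totals for each row (style) and each column (risk).
row_totals = Counter(style for style, _ in funds)
col_totals = Counter(risk for _, risk in funds)

for style in sorted(row_totals):
    cells = [joint[(style, risk)] for risk in sorted(col_totals)]
    print(style, cells, row_totals[style])
print("Total", [col_totals[risk] for risk in sorted(col_totals)], len(funds))
```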

52
Q

Name the table

p. 80

A

Confusion Matrix for Bond Default Prediction Model

53
Q

Describe

chi-square test of independence

p. 80

A
  • A way to test for a potential association between categorical variables
  • the procedure involves constructing a contingency table
  • the actual values and expected values are used to derive the chi-square test statistic
54
Q

Concept

the actual values and expected values from a contingency table are used to derive this value

p. 80

A

chi-square test statistic
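
Example (a rough sketch using the standard Pearson formula, which is an assumption here since the card does not spell it out: the expected count for each cell is row total × column total ÷ overall total, and the statistic sums (observed − expected)² ÷ expected over all cells; the 2 x 2 counts are hypothetical and far too small for a real test):

```python
# Hypothetical 2 x 2 contingency table of observed counts.
# Rows: fund style (Growth, Value); columns: risk level (High, Low).
observed = [[3, 1],
            [1, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: (row total * column total) / total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an R x C table: (R - 1) * (C - 1).
degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

print(chi_square, degrees_of_freedom)
```

The statistic would then be compared with a chi-square critical value for the given degrees of freedom to decide whether to reject independence.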

55
Q

Describe how the contingency table is used to set up a test for independence between fund style and risk level.

p. 81

A