Statistics/Probability Flashcards

(88 cards)

1
Q

sample space

A

the set of all possible sample points (outcomes) for an experiment, e.g. S = {HH, HT, TH, TT} for two coin flips

2
Q

dependent events regarding probability

A

events where the outcome of one changes the probability of the other, e.g. picking marbles out of a bag without replacement

3
Q

Covariance

A
  • When calculated between two variables, X and Y, it indicates how much the two variables change together.
  • Cov(X,Y)=E[(X−EX)(Y−EY)] = E[XY]−(EX)(EY)
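The sample analogue of the formula above can be computed directly; a minimal sketch with made-up data:

```python
def covariance(xs, ys):
    # sample analogue of Cov(X,Y) = E[(X-EX)(Y-EY)] (population form, divide by n)
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]      # Y moves with X, so the covariance is positive
print(covariance(xs, ys))  # 4.0
```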
4
Q

P–P plot

A

probability–probability plot or percent–percent plot or P value plot: probability plot for assessing how closely two data sets agree, or for assessing how closely a dataset fits a particular model.

It works by plotting the two cumulative distribution functions against each other; if they are similar, the data will appear to be nearly a straight line.

For input z the output is the pair of numbers giving what percentage of f and what percentage of g fall at or below z.

5
Q

Q–Q plot

A

quantile–quantile plot: for comparing two probability distributions by plotting their quantiles against each other. A point (x, y) on the plot corresponds to one of the quantiles of the second distribution (y-coordinate) plotted against the same quantile of the first distribution (x-coordinate).

6
Q

PMF (Probability Mass Function)

A

A probability mass function (PMF) is a mathematical function that calculates the probability that a discrete random variable will be a specific value. It assigns a particular probability to every possible value of the variable.

Table: With each row an outcome + probability
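Such a table can be built explicitly; a sketch for the PMF of "number of heads" in two fair coin flips, using exact fractions:

```python
from fractions import Fraction
from itertools import product

# PMF of "number of heads" in two fair coin flips
pmf = {}
for flips in product("HT", repeat=2):
    heads = flips.count("H")
    pmf[heads] = pmf.get(heads, Fraction(0)) + Fraction(1, 4)

for outcome, prob in sorted(pmf.items()):
    print(outcome, prob)   # 0 1/4, 1 1/2, 2 1/4
print(sum(pmf.values()))   # probabilities sum to 1
```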

7
Q

the conditional probability of a cancellation given snow

A

P(Cancel∣Snow), the ∣ is short for ‘given’

8
Q

An event happens independently of a condition if

A

P(event∣condition)=P(event)
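This definition can be checked by simulation; a sketch with two independent simulated coin events:

```python
import random

random.seed(0)  # reproducible simulation

# Two independent events A and B, each with probability 0.5
trials = [(random.random() < 0.5, random.random() < 0.5) for _ in range(100_000)]

p_a = sum(a for a, b in trials) / len(trials)
given_b = [a for a, b in trials if b]
p_a_given_b = sum(given_b) / len(given_b)

# Independence: P(A|B) should match P(A), up to sampling noise
print(round(p_a, 2), round(p_a_given_b, 2))
```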

9
Q

Kolmogorov-Smirnov (K-S) Test

A

non-parametric test that compares the empirical distribution of the data with a theoretical distribution.

It helps determine how well the theoretical distribution fits the data.

10
Q

K-S Statistic

A

The K-S statistic measures the maximum distance between the empirical cumulative distribution function (ECDF) of your data and the cumulative distribution function (CDF) of the theoretical distribution.

In simpler terms, it quantifies the biggest difference between what you observed (your data) and what you would expect if the data followed the theoretical distribution.

The K-S statistic ranges from 0 to 1:
A smaller K-S statistic indicates that the empirical distribution is very close to the theoretical distribution.
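The D statistic can be computed without any stats library; a sketch comparing simulated data against a standard-normal CDF built from math.erf:

```python
import math
import random

def normal_cdf(z):
    # standard-normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def ks_statistic(sample, cdf):
    # D = sup_x |ECDF(x) - CDF(x)|; check both sides of each ECDF jump
    xs = sorted(sample)
    n = len(xs)
    return max(max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
               for i, x in enumerate(xs))

random.seed(1)
sample = [random.gauss(0, 1) for _ in range(1000)]
print(ks_statistic(sample, normal_cdf))  # small, since the model fits
```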

11
Q

outcome = model + error → what are the parts called?

A

model = systematic part, error = unsystematic part

12
Q

Descriptive Statistics

A

methods for collecting, organizing, displaying, and analyzing data

13
Q

Inferential Statistics

A
  • Predict and forecast values of population
    parameters
  • Test hypotheses and draw conclusions about values
    of population parameters
  • Make decisions
14
Q

Central Tendency

A

1st moment - mean, median, mode

15
Q

Spread

A

2nd moment - MAD, Variance, SD, coefficient of variation (CV = SD/mean), range, IQR

16
Q

Skewness

A

3rd moment - measure of asymmetry; positive skew (tail pointing to high values, body of the distribution to the left), negative skew (tail pointing to low values)

17
Q

Kurtosis

A

4th moment - Measure of heaviness of the tails, leptokurtic (heavy tails), platykurtic (light tails)
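Both the 3rd and 4th standardized moments can be estimated directly from data; a sketch checking that a normal sample has skewness near 0 and kurtosis near 3:

```python
import math
import random

def skew_kurtosis(xs):
    # standardized 3rd and 4th central moments
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    sd = math.sqrt(var)
    skew = sum((x - m) ** 3 for x in xs) / n / sd ** 3
    kurt = sum((x - m) ** 4 for x in xs) / n / var ** 2
    return skew, kurt

random.seed(2)
skew, kurt = skew_kurtosis([random.gauss(0, 1) for _ in range(100_000)])
print(round(skew, 1), round(kurt, 1))  # near 0 and 3 (mesokurtic)
```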

18
Q

Which kurtosis has a normal distribution?

A

3 (mesokurtic)

19
Q

statistical test on prices vs returns:

A

prices are non-stationary and not predictable; returns are “stationary”, so their statistical properties are stable enough to model

20
Q

Standard Error calculation & meaning

A
  • SE = SD / √n
  • Standard deviation measures the amount of variation or dispersion of the data around the mean. The standard error can be thought of as the dispersion of the sample mean estimates around the true population mean
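A quick check of the formula with made-up data:

```python
import math
import statistics

data = [12, 15, 11, 14, 13, 16, 10, 14, 13, 12]  # made-up sample
sd = statistics.stdev(data)          # sample SD (n-1 denominator)
se = sd / math.sqrt(len(data))       # SE = SD / sqrt(n)
print(round(se, 3))                  # 0.577
```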
21
Q

Sample standard deviation

A

𝑠

21
Q

Population standard deviation

A

𝜎 (sigma)

22
Q

Central Limit Theorem

A

states that the distribution of the sample mean, X̄, approaches a Normal distribution as the sample size n increases (rule of thumb: n ≥ 30)
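The theorem is easy to see by simulation: means of n = 30 draws from a decidedly non-normal (uniform) distribution cluster tightly and symmetrically around the population mean. A sketch with made-up parameters:

```python
import random
import statistics

random.seed(3)

# 10,000 sample means, each from n = 30 uniform(0, 1) draws
means = [statistics.mean(random.uniform(0, 1) for _ in range(30))
         for _ in range(10_000)]

print(round(statistics.mean(means), 2))   # near 0.5, the population mean
print(round(statistics.stdev(means), 3))  # near (1/sqrt(12))/sqrt(30) ~ 0.053
```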

23
Q

Sample variance - do you use n or n-1?

A

n-1
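Python's statistics module exposes both denominators, which makes the distinction easy to check:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.pvariance(data))  # population variance, divide by n    -> 4
print(statistics.variance(data))   # sample variance, divide by n-1 (32/7)
```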

24
Random variable:
𝑋
25
Cumulative Distribution Function of Standard Normal:
Φ (z)
26
Pivotal distribution
N(0,1)
27
Population mean - greek letter:
μ (mu)
28
Confidence interval
x̄ ± z-value · σ/√n (use s when σ is unknown); σ/√n is the standard error (SE)
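A worked example with made-up data, using s as the estimate of σ and the 95% z-critical value:

```python
import math
import statistics

data = [98, 102, 100, 97, 103, 99, 101, 100]  # made-up sample
n = len(data)
xbar = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)    # s / sqrt(n)
z = 1.96                                      # 95% z-critical value

lo, hi = xbar - z * se, xbar + z * se
print(round(lo, 2), round(hi, 2))             # 98.61 101.39
```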
29
In the sample, you approximate mu and sigma with...
x̄ (sample mean) and s (sample standard deviation)
30
Population standard deviation
𝜎
31
Particular observation of a Standard Normal (also known as ‘z-critical value’)
z
32
Parameter of 𝒕-distribution (also known as ‘degrees of freedom’):
ν (nu)
33
t-critical value
t
34
Important: are you given sigma or s?
If n < 30 but you are given σ, you can still use σ (with the z-distribution); if only s is available, use the t-distribution
35
t-distribution
  • Has thicker tails than the Normal (i.e. larger chance of extreme events).
  • Its shape depends on a single parameter “nu”, ν = n − 1, where n is the number of observations.
  • Assumption: the t-distribution assumes that the data originate from a Normal distribution.
36
3 main types of distribution
Gaussian, Poisson, Chi-square
37
Statistical stationarity:
A stationary time series is one whose statistical properties such as mean, variance, autocorrelation, etc. are all constant over time.
38
Measures the amount of variability within a single dataset - how do the population and sample SD calculations compare?
Population: σ = √( Σ(xᵢ − μ)² / N ). Sample: s = √( Σ(xᵢ − x̄)² / (n − 1) ).
39
What is variance
the expected value of the squared deviation from the mean of a random variable
40
Null Hypothesis vs Sample mean
  • Null Hypothesis H₀: belief about the true population parameter value → the null hypothesis will be rejected if the difference between sample means is bigger than would be expected by chance
  • Alternative Hypothesis H₁
40
Significance Level - letter
alpha --> probability of rejecting the null hypothesis when it is true
41
Critical value / Cutoff point
z-critical value or t-critical value → ±z/t-values which act as cutoff points beyond which the null hypothesis should be rejected
42
p-value: 𝑝
probability of obtaining a value of the test statistic as extreme as, or more extreme than, the actual value obtained, when the null hypothesis is true
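For a z test statistic, this definition can be evaluated with nothing but math.erf; a sketch for a two-sided test:

```python
import math

def two_sided_p(z):
    # P(|Z| >= |z|) under H0, via the standard-normal CDF
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

print(round(two_sided_p(1.96), 3))  # 0.05: borderline at alpha = 0.05
print(round(two_sided_p(3.0), 4))   # 0.0027: strong evidence against H0
```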
43
How to report results for statistical hypothesis testing:
say “we cannot reject the null hypothesis” (or “we reject the null hypothesis”); avoid “we accept the null hypothesis as truth”
43
Hypothesis:
is a statement of assertion about the true value of an unknown population parameter (e.g. 𝜇 = 100)
43
Null Hypothesis Test - performed using a test statistic, which is the standardised value derived from sample data, e.g. the standardized value of the sample mean.
When should you use the z-statistic vs. the t-statistic in hypothesis testing?
43
Types of Statistical Errors
Type I: rejecting H₀ when it is true (false positive, probability α). Type II: failing to reject H₀ when it is false (false negative, probability β).
43
Conventions in Your Industry regarding alpha
43
Calculating CI Cutoff Points
Use t-dist when n < 30 and σ is unknown, otherwise use z. In practice, we can use t for all cases.
R-functions for SD Distribution and t-Distribution
43
3 equivalent ways of testing hypothesis:
43
Equivalent Approaches for hypothesis testing
44
Does correlation reflect nonlinear relationships?
No
45
True Dependent Variable and Estimated Dependent Variable
46
True Coefficient and Estimated Coefficient
47
Residual Error and Residual Standard Error
47
Number of observations and number of independent variables
47
Coefficient of Determination / R Squared and Adjusted R Squared
48
Coefficient 𝒊’s Standard Error
48
What does regression diagnostics look for?
Testing for “significant” relationships.
48
assumed true model vs fitted model - how do you write the coefficients down for the regression equations?
49
Different names for Y and x
49
Fitted Model: Time-Series With Lagged Variables and Fitted Model: Autoregression - examples how they could look like:
50
OLS minimizes...
the Sum of Squared Errors (SSE) with respect to regression coefficients 𝛽0, 𝛽1
51
Residual Standard Error (RSE) - calculation/formula
  • RSE = √( SSE / (n − p − 1) ), roughly the square root of the average squared residual
  • where n is the number of observations and p the number of independent variables
52
What does 𝑅2 show?
The proportion of total variation of Y that is explained by the model (i.e. by the independent variable(s))
53
𝑅2 - calculation
54
Adjusted R2 - calculation/meaning
  • If n is very large and p is very small, the adjustment ratio is close to zero, and R² ≈ R²adj
  • As the number of inputs (x’s) increases, R² typically increases regardless of whether the variables are useful for prediction
  • Adjusted R² will only increase if the new x variable improves the model more than would be expected by chance
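Both quantities can be computed by hand for a one-predictor fit; a sketch with made-up data, using the closed-form OLS solution for the single-x case:

```python
import statistics

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 4.0, 6.2, 7.9, 10.1, 12.0]  # made-up, nearly linear

# closed-form OLS for one predictor
mx, my = statistics.mean(x), statistics.mean(y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
     sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
yhat = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
sst = sum((yi - my) ** 2 for yi in y)
r2 = 1 - sse / sst                              # explained fraction of variation
n, p = len(x), 1
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # penalizes extra predictors
print(round(r2, 4), round(r2_adj, 4))           # adjusted value is slightly lower
```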
55
Assumptions for errors in linear regression models:
Errors are independent, normally distributed, have mean zero, and have constant (homoscedastic) variance.
56
How is constant, time independent variance in linear regression models called?
homoscedasticity
57
What can be conclusions of nonnormal residuals?
nonlinearity present, interactions between independent variables, outliers
58
Possible Reasons For Systematic Errors in linear regression models:
  • Nonlinearity: systematic pattern in the residuals
  • Heteroscedasticity: variance of errors changes across levels of the independent variable
  • Autocorrelation: errors in one period are correlated with errors in another period
59
Common Nonlinear Transformations
  • 1/x relationship
  • √x relationship
  • x² relationship
  • exponential x^b relationship
60
We could say that the regression line has reduced our uncertainty, as measured by variances, from ... to ...
  • from the unconditional variance sy²
  • to the conditional variance se²
  • That is, a reduction of sy² − se²
  • The reduction expressed as a fraction is called R-squared or R²
61
A regression output usually reports the ratios between â and its standard deviation, and between b̂ and its standard deviation, which are referred to as "t-ratios", i.e. ...
62
If the residual distribution has very "fat tails", i.e. many more extreme values than you would like to see, it may be appropriate to think of using an alternative estimation technique, such as:
  • Least Absolute Value rather than Least Squares
  • This approach weights extreme values less, although a different set of diagnostics will then have to be used
63
Common Nonlinear Transformations
  • x² relationship - Example: return from an investment (Y) increases quadratically with an increasing investment (x). This could happen due to aggressive reinvestment and compounding returns.
  • √x relationship - Example: stock volatility (Y) increases with volume (x) at a decreasing rate, i.e. y = √x.
64
What is an Interaction Term?
  • an independent variable in a regression model that is the product of two other independent variables
  • Sometimes the partial effect of the dependent variable with respect to one independent variable depends on the magnitude of another independent variable
65
Hierarchy Principle - Interaction Term
66
What is Multicollinearity? Effects?
  • appears when independent variables used inside the regression equation are highly correlated
  • Effects: the fit is not improved much; additional variables add little information; two or more variables may become insignificant, even though significance may be high when one of them is dropped; parameter estimates are unreliable
67
Dummy Variable:
  • A variable that takes on a value of 0 or 1
  • Example: 1 = war year, 0 = no war
  • When a categorical variable has k categories (we call them ‘levels’), we include only k−1 dummy variables in the regression model.
  • The category that is left out is usually the one with the most frequent observations; it acts as the reference group.
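The k−1 encoding can be sketched in a few lines, using a hypothetical 'season' variable with the most frequent level as the reference:

```python
# Hypothetical categorical variable with k = 3 levels
seasons = ["winter", "summer", "summer", "spring", "winter", "summer"]
levels = ["summer", "winter", "spring"]  # "summer" (most frequent) is the reference

# k-1 = 2 dummy columns; the reference level gets all zeros
dummies = [[1 if s == lvl else 0 for lvl in levels[1:]] for s in seasons]
for s, row in zip(seasons, dummies):
    print(s, row)   # e.g. summer [0, 0], winter [1, 0], spring [0, 1]
```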
67
Distributed lag model
A model for time series data in which a regression equation is used to predict current values of a dependent variable based on both the current values of an independent variable and/or its lagged (past period) values.
68
Nonparametric statistics
statistical method in which the data are not assumed to come from prescribed models that are determined by a small number of parameters, such as the normal distribution model and the linear regression model
69
RISK analysis toolbox in Excel - overview
70
RISK build in correlation
71
What is linear programming?
Linear programming is a special type of optimization model that sets up constraints as linear equations and solves them simultaneously while optimizing an objective function.
72
How to calculate portfolio variance for 3 stocks given weights, correlations, and SDs? How does the matrix multiplication look like?
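The card leaves the answer blank; a sketch with hypothetical weights, SDs, and correlations — the covariance matrix is built as cov[i][j] = corr[i][j]·sd[i]·sd[j], and the portfolio variance is the quadratic form wᵀΣw:

```python
import math

# Hypothetical inputs for 3 stocks
w = [0.5, 0.3, 0.2]               # portfolio weights
sd = [0.20, 0.15, 0.10]           # standard deviations
corr = [[1.0, 0.3, 0.1],
        [0.3, 1.0, 0.4],
        [0.1, 0.4, 1.0]]          # correlation matrix

# Covariance matrix: cov[i][j] = corr[i][j] * sd[i] * sd[j]
cov = [[corr[i][j] * sd[i] * sd[j] for j in range(3)] for i in range(3)]

# Portfolio variance: w^T * Cov * w, expanded as a double sum
port_var = sum(w[i] * cov[i][j] * w[j] for i in range(3) for j in range(3))
print(round(port_var, 6), round(math.sqrt(port_var), 4))  # variance, volatility
```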
73
What is the shadow price?
The amount of profit that an additional unit of available resources would yield