midterm review Flashcards
(36 cards)
Define Simple Linear Regression
A dependent variable (e.g., Y) is predicted from one independent variable (e.g., X) based on a linear relationship
Define Regression
The relationship (dependency) between two variables: how one variable changes as the other changes
SLR equation
y = α + βx, where α is the intercept and β is the slope
Define Residuals
The differences between the observed data values & the values predicted by the regression line (residual = observed y - predicted y)
Goal with SSR
To find the one line that minimizes the SSR (sum of squared residuals)
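A minimal Python sketch of this goal, reading SSR as the sum of squared residuals. The function names `fit_line` and `ssr` and the sample data are made up for illustration; the slope and intercept use the standard least-squares formulas.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


def ssr(x, y, intercept, slope):
    """Sum of squared residuals (observed y minus predicted y)."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x, y)
best = ssr(x, y, a, b)
nudged = ssr(x, y, a, b + 0.1)  # any other line gives a larger SSR
```

Comparing `best` with `nudged` shows the point of the card: moving the slope away from the fitted value only increases the SSR.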
Define Stock “Beta”
Beta is a risk measure of a stock investment, calculated as the slope coefficient on the market return when the stock's return is regressed on the market return
When |β| > 1…
Stock is riskier & its returns swing more than market returns (greater volatility)
When |β| < 1…
Stock is less risky & its returns swing less than market returns
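As a rough Python illustration, beta can be computed as the regression slope of stock returns on market returns, which equals covariance(stock, market) / variance(market). The function name `beta` and the return figures are invented for the example, not real market data.

```python
def beta(stock_returns, market_returns):
    """Slope from regressing stock returns on market returns."""
    n = len(market_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns))
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var_m


# Made-up monthly returns: this stock always moves twice as far as the market
market = [0.01, -0.02, 0.03, 0.00]
stock = [0.02, -0.04, 0.06, 0.00]

b = beta(stock, market)  # about 2.0
label = "riskier than the market" if abs(b) > 1 else "less risky than the market"
```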
Monthly return formula
Monthly return = (Current month-end price - Last month-end price)/Last month-end price
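A short Python sketch of the formula applied to a list of month-end prices. The name `monthly_returns` and the prices are hypothetical.

```python
def monthly_returns(prices):
    """Return (current - previous) / previous for each consecutive pair."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


# Illustrative month-end prices
prices = [100.0, 110.0, 99.0]
returns = monthly_returns(prices)  # a 10% gain, then a 10% loss
```

Note that n prices give n - 1 returns, because the first month has no earlier price to compare against.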
Basic set-up for lm() function
regression_analysis_result_name <- lm(Y ~ X, data = dataset_name)
How to calculate SD manually
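- subtract the mean from each value: (y - µ)
- square each result: (y - µ)^2
- add up the squares & divide by the count (divide by count - 1 for a sample SD)
- take the square root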
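A sketch of the full manual calculation in Python, including the final square root. The function name `sd_manual` and its `sample` flag are made up for this example; the results are checked against Python's `statistics` module.

```python
import math
import statistics


def sd_manual(values, sample=False):
    """Standard deviation computed step by step."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(v - mean) ** 2 for v in values]  # (y - mean)^2
    divisor = n - 1 if sample else n                  # sample vs population
    return math.sqrt(sum(squared_devs) / divisor)


data = [2, 4, 4, 4, 5, 5, 7, 9]
population_sd = sd_manual(data)            # 2.0
sample_sd = sd_manual(data, sample=True)   # slightly larger than 2.0
```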
How to manually calculate Q1 & Q3
- sort the dataset, then split it into 2 halves at the median
- find the median of each half (lower half → Q1, upper half → Q3)
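A minimal Python sketch of the median-of-halves method. It assumes one common convention: when the count is odd, the middle value is left out of both halves. Software defaults can use other quartile rules and give slightly different numbers. The name `quartiles` is hypothetical.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Needs at least 2 values. For an odd count, the middle value is
    excluded from both halves (an assumed convention).
    """
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[-half:]
    return statistics.median(lower), statistics.median(upper)


data = [7, 1, 3, 9, 5, 11, 13]  # sorted: 1 3 5 | 7 | 9 11 13
q1, q3 = quartiles(data)        # Q1 = 3, Q3 = 11
```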
How to find IQR
Q3-Q1
How to calculate Lower & Upper Whisker
LW = Q1 - 1.5 × IQR
UW = Q3 + 1.5 × IQR
How to calculate Extreme Lower & Upper Whisker
eLW = Q1 - 3 × IQR
eUW = Q3 + 3 × IQR
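The IQR and whisker formulas can be sketched in Python as follows. The function name `whisker_limits` and the dictionary keys are made up; the quartile values are illustrative inputs.

```python
def whisker_limits(q1, q3):
    """Apply the 1.5 x IQR and 3 x IQR rules to a pair of quartiles."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "lower": q1 - 1.5 * iqr,
        "upper": q3 + 1.5 * iqr,
        "extreme_lower": q1 - 3 * iqr,
        "extreme_upper": q3 + 3 * iqr,
    }


# Illustrative quartiles: Q1 = 3, Q3 = 11, so IQR = 8
limits = whisker_limits(3, 11)
```

With these inputs the limits are -9 and 23, and the extreme limits are -21 and 35.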
Difference in using summarize() & mutate()
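summarize() collapses the data into summary values (one row, or one row per group); mutate() adds a new variable & keeps every row. Must use <- when mutating to save the new variable into the dataset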
When do you use $?
When you are referring to a variable within a specific dataset (dataset$variable)
Define Probability Density Curve
A density curve visualizes the probability distribution → how probabilities are distributed over the values of a random variable
Advantages of Probability Density Curve
- A more refined (smoother) representation of data
- Facilitates probability calculations (even if data is absent)
Describe Skewed distribution
Mean > Median → Right-skewed
Mean < Median → Left-skewed
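A small Python sketch of this mean-versus-median check. It is a rule of thumb rather than a formal skewness test; the function name `skew_direction` and both datasets are invented for illustration.

```python
import statistics


def skew_direction(values):
    """Classify skew by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "roughly symmetric"


# One large value drags the mean above the median
right = skew_direction([1, 2, 2, 3, 20])
# One small value drags the mean below the median
left = skew_direction([1, 18, 19, 19, 20])
```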
Density Curve Properties
- A density curve must lie on or above the horizontal axis
- The area under the density curve is always equal to 1 (100%)
> Its height (y-value) can never be negative, so it never goes below the x-axis
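Both properties can be checked numerically. The Python sketch below uses the standard normal curve as an example density and a simple trapezoid rule for the area; the helper name `area_under` is made up, and the range -8 to 8 is an assumption that captures essentially all of the area.

```python
from statistics import NormalDist

curve = NormalDist(mu=0, sigma=1)  # example density: standard normal


def area_under(pdf, lo, hi, steps=10_000):
    """Approximate the area under pdf between lo and hi (trapezoid rule)."""
    width = (hi - lo) / steps
    total = 0.5 * (pdf(lo) + pdf(hi))
    for i in range(1, steps):
        total += pdf(lo + i * width)
    return total * width


total_area = area_under(curve.pdf, -8, 8)  # very close to 1
# Heights sampled from -8 to 8 in steps of 0.1: none is negative
lowest_height = min(curve.pdf(x / 10) for x in range(-80, 81))
```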
Relationship between Probability Density & Probability
Probability Density ≠ Probability
- For a continuous variable, the probability of taking one specific value is not meaningful because it always equals zero
Meaning of PD & Probability
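Probability = area under the curve over a range of values
Likelihood = density = height of the curve at a single value (a straight vertical line, which has no area)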
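A quick Python illustration of why density is not probability, using a deliberately narrow normal curve (the parameters are chosen only for the example). The height at the center is greater than 1, which no probability can be, while areas stay between 0 and 1.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=0.1)

# Density = height of the curve at one value; it can exceed 1
height_at_zero = narrow.pdf(0)

# Probability = area over a range, found from the cumulative distribution
prob_interval = narrow.cdf(0.1) - narrow.cdf(-0.1)

# A single value is a range of zero width, so its area is zero
prob_single_value = narrow.cdf(0) - narrow.cdf(0)
```

Here `height_at_zero` is roughly 4, `prob_interval` is roughly 0.68, and `prob_single_value` is exactly 0.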
Representation of Normal Distribution
If a random variable follows a normal distribution, it is presented as:
X ~ N(µ, σ)
Mean (µ) → center of the curve
Stdev (σ) → width (spread) of the curve
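A brief Python sketch of how the two parameters shape the curve, using `statistics.NormalDist`, which takes the mean and the standard deviation. The specific values are arbitrary examples.

```python
from statistics import NormalDist

narrow = NormalDist(mu=10, sigma=1)
wide = NormalDist(mu=10, sigma=3)

# Both curves are centered (and peak) at the mean, 10
peak_narrow = narrow.pdf(10)
peak_wide = wide.pdf(10)  # lower peak: the same total area is spread wider

# The share of area within one standard deviation of the mean is the same
within_one_sd_narrow = narrow.cdf(11) - narrow.cdf(9)
within_one_sd_wide = wide.cdf(13) - wide.cdf(7)
```

A larger standard deviation flattens and widens the curve, but about 68% of the area still falls within one standard deviation of the mean.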