Theory Cards: Econometrics, Statistics, Causal Inference Flashcards
(47 cards)
What is the purpose of A/B testing in experiments?
A/B Testing
A/B testing aims to determine if changes to a variable (e.g., a webpage design) lead to a statistically significant difference in a key metric by comparing control and treatment groups.
What is a null hypothesis in A/B testing?
A/B Testing
The null hypothesis states that there is no effect or difference between the control and treatment groups.
What is a p-value in hypothesis testing?
A/B Testing
A p-value represents the probability of observing results as extreme as those in the experiment if the null hypothesis is true. If p < 0.05, results are considered statistically significant.
What is a Type I error in A/B testing?
A/B Testing
A Type I error (false positive) occurs when the null hypothesis is incorrectly rejected, suggesting an effect exists when it doesn’t.
What is a Type II error in A/B testing?
A/B Testing
A Type II error (false negative) happens when the null hypothesis is not rejected despite there being an actual effect, failing to detect a real difference.
What is power analysis in the context of A/B testing?
A/B Testing
Statistical power is the probability of correctly rejecting a false null hypothesis. Power analysis uses a target power (commonly 0.8) to determine the sample size needed to detect a given effect, reducing the chance of Type II errors.
Why is sample size important in A/B testing?
A/B Testing
A sufficient sample size is needed to detect a statistically significant difference. It depends on the expected effect size, statistical power (usually 0.8), and significance level (often 0.05).
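The sample-size card above can be sketched with the standard normal-approximation formula for a two-sided, two-sample test of means. This assumes scipy is available; the effect size is Cohen's d, and the numbers are illustrative:

```python
from scipy.stats import norm

def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-sample test of means,
    using the normal approximation. effect_size = (mu1 - mu2) / sigma."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for significance level
    z_beta = norm.ppf(power)            # critical value for desired power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(n) + 1                   # round up to be conservative

# Detecting a small effect (d = 0.2) at alpha = 0.05 and power = 0.8
print(sample_size_per_group(0.2))  # 393 per group
```

Note how strongly the required n grows as the expected effect shrinks: halving the effect size quadruples the sample size.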
What is a multi-variant test?
A/B Testing
A multi-variant test compares more than two variations simultaneously, rather than just a control and treatment.
Name a common pitfall in A/B testing and its solution.
A/B Testing
Sample contamination (when participants in one group are influenced by another) can skew results. One solution is to strictly separate groups or adjust analysis methods.
What statistical test would you use for comparing means in A/B testing?
A/B Testing
A t-test is commonly used to compare the means of two samples, such as conversion rates or average order values, assuming normally distributed data and equal variances.
When should a T-Test be used?
A/B Testing
For comparing the means of two samples (e.g. conversion rates or avg order values)
Assumes normally distributed data and equal variances in both groups
Example: comparing avg revenue per user between a control and a treatment group
Example: comparing avg time spent on a platform between two user groups: those who made a purchase and those who didn't
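A minimal sketch of the revenue-per-user example with scipy, using simulated data in place of real revenue figures:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
# Hypothetical avg revenue per user: treatment shifted up slightly
control = rng.normal(loc=50, scale=10, size=500)
treatment = rng.normal(loc=52, scale=10, size=500)

# equal_var=True is the classic two-sample t-test assumption;
# equal_var=False gives Welch's t-test, which drops it
t_stat, p_value = ttest_ind(control, treatment, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

If p is below the chosen significance level (often 0.05), the difference in means is considered statistically significant.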
When should a Chi-Square Test be used?
A/B Testing
For categorical data (e.g. converted vs not converted) between 2 groups
Assumes that each observation is independent and expected frequencies in each cell are adequate
Example: checking if there is an association between a user's gender and whether they purchased a product
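The association example above maps directly onto a contingency table. A sketch with scipy and hypothetical counts:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = group, columns = purchased / did not
table = [[120, 380],
         [ 90, 410]]

# chi2_contingency computes expected frequencies under independence
# and the chi-square statistic against them
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
```

A low p-value suggests the two categorical variables are associated; the `expected` array is worth inspecting to confirm each cell's expected frequency is adequate (a common rule of thumb is at least 5).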
When should a Z-Test be used?
A/B Testing
Similar to a t-test but for large samples or when the population variance is known
Assumes normally distributed data with large sample sizes
Example: comparing click-through rates when you have very large samples
Example: if the historical avg # of listings per user is known, we can compare it to the avg # of listings per user from a new sample after a new feature is introduced, to assess whether the difference is statistically significant
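The click-through-rate example is a two-proportion z-test. A minimal sketch (the counts are hypothetical; scipy is assumed for the normal CDF):

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions,
    appropriate for large samples such as click-through rates."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)              # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * norm.sf(abs(z))               # two-sided p-value
    return z, p_value

# Hypothetical CTRs: 5.0% of 20,000 users vs 5.5% of 20,000 users
z, p = two_proportion_ztest(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With samples this large, even a half-percentage-point difference in CTR can reach significance.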
When should non-parametric tests be used?
A/B Testing
When the data doesn't meet the assumption of normality (e.g. revenue data often has outliers and is not normally distributed)
Example: for skewed data, like transaction amounts between 2 groups
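A sketch of the skewed-transaction-amounts example using the Mann-Whitney U test from scipy, with lognormal data standing in for real transactions:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# Skewed (lognormal) transaction amounts, like revenue data with outliers
group_a = rng.lognormal(mean=3.0, sigma=1.0, size=300)
group_b = rng.lognormal(mean=3.2, sigma=1.0, size=300)

# Mann-Whitney U compares the two distributions via ranks,
# without assuming normality
u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.0f}, p = {p_value:.4f}")
```

Because it works on ranks, the test is robust to the extreme values that would distort a t-test on this kind of data.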
When should a Bayesian Test be used?
A/B Testing
To estimate the probability of one version being better than the other. Provides a probability rather than a p-value
Example: estimate how likely the new version is to be better by a specific margin.
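A common way to run this for conversion rates is a Beta-Binomial model: sample from each group's posterior and count how often B beats A. A sketch with numpy and hypothetical conversion counts:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical conversions: A = 120/2400, B = 140/2400
a_conv, a_n = 120, 2400
b_conv, b_n = 140, 2400

# With a uniform Beta(1, 1) prior, the posterior is
# Beta(conversions + 1, non-conversions + 1)
a_samples = rng.beta(a_conv + 1, a_n - a_conv + 1, size=100_000)
b_samples = rng.beta(b_conv + 1, b_n - b_conv + 1, size=100_000)

prob_b_better = (b_samples > a_samples).mean()
print(f"P(B > A) = {prob_b_better:.3f}")

# Probability that B beats A by at least a 0.5 percentage-point margin
prob_margin = (b_samples - a_samples > 0.005).mean()
print(f"P(B - A > 0.5pp) = {prob_margin:.3f}")
```

The output is a direct probability statement about the versions, which is often easier to act on than a p-value.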
What is the difference between correlation and causation?
Causal Inference
Correlation means two variables move together, but it doesn't imply that one causes the other. Causation implies a direct effect of one variable on another.
What is a confounding variable in causal inference?
Causal Inference
A confounding variable is an external factor that influences both variables being studied, potentially creating a false impression of causation.
What are counterfactuals in causal inference?
Causal Inference
Counterfactuals represent “what could have happened” under a different scenario, such as considering if a recovery would still happen without taking medicine.
What is the Difference-in-Differences method used for?
Causal Inference
It’s used in policy impact analysis where randomized trials are not feasible, comparing changes over time between a treatment and control group.
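The core DiD arithmetic is simple enough to show directly. A sketch with hypothetical group averages (e.g. weekly sales before and after a policy change):

```python
# Hypothetical average outcomes before and after the intervention
treat_pre, treat_post = 100.0, 130.0   # treatment group
ctrl_pre, ctrl_post = 95.0, 105.0      # control group

# DiD subtracts the control group's trend from the treatment
# group's change, removing shared time effects
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(f"Estimated treatment effect: {did:.1f}")  # 30 - 10 = 20.0
```

The key identifying assumption is parallel trends: absent the intervention, both groups would have changed by the same amount.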
What is an instrumental variable (IV) in causal inference?
Causal Inference
An IV is a variable affecting the treatment but not directly influencing the outcome except through the treatment, used to address unobserved confounding.
Example: Imagine studying the effect of education on income, but family background (unobserved) affects both. If we use “distance to the nearest college” as an instrument, it affects the likelihood of attending college (treatment) but likely doesn’t directly influence income.
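A simulated sketch of why the IV estimator recovers the true effect when OLS doesn't. The data-generating process is invented purely to illustrate: `u` plays the role of unobserved family background, `z` the instrument, and the true effect of `x` on `y` is set to 2:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5000
z = rng.normal(size=n)                        # instrument (e.g. distance to college)
u = rng.normal(size=n)                        # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)          # treatment, driven by z and u
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # outcome; true effect of x is 2

# Naive OLS slope is biased upward because u affects both x and y
ols = np.cov(x, y)[0, 1] / np.var(x)

# IV (Wald) estimator: cov(z, y) / cov(z, x) isolates the variation
# in x that comes only through the instrument
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(f"OLS estimate: {ols:.2f} (biased), IV estimate: {iv:.2f} (true = 2)")
```

The IV estimate lands near 2 while OLS overshoots, because OLS absorbs the confounder's contribution.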
Quasi-Experimental Methods:
How does Propensity Score Matching (PSM) work?
Causal Inference
PSM matches individuals in treatment and control groups based on observed characteristics to simulate randomization, though it only controls for observed variables.
Example: if we want to study the impact of a job training program, we could match participants (treatment) with non-participants (control) based on characteristics like age, education, and prior job experience.
Limitation: PSM can only control for observed variables.
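A minimal matching sketch for the job-training example. The propensity scores here are assumed to have been estimated elsewhere (e.g. by a logistic regression of participation on age, education, and experience); all numbers are invented:

```python
import numpy as np

# Hypothetical propensity scores (the fitted model itself is omitted)
treated_scores = np.array([0.62, 0.45, 0.80])
control_scores = np.array([0.30, 0.48, 0.61, 0.77, 0.90])
treated_outcomes = np.array([52.0, 47.0, 58.0])    # e.g. post-program earnings
control_outcomes = np.array([40.0, 45.0, 50.0, 55.0, 60.0])

# 1-nearest-neighbor matching: pair each treated unit with the
# control unit whose propensity score is closest
matches = [np.argmin(np.abs(control_scores - s)) for s in treated_scores]

# Average treatment effect on the treated (ATT) from the matched pairs
att = (treated_outcomes - control_outcomes[matches]).mean()
print(f"Matched controls: {matches}, ATT estimate: {att:.2f}")
```

Real applications use a fitted propensity model, calipers to reject poor matches, and balance checks on the matched covariates.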
Quasi-Experimental Methods:
What is the use case of Quasi-Experimental Methods? What are some of these methods?
Causal Inference
These methods help us make causal inferences when randomized experiments aren't feasible
e.g.
- Difference-in-Differences
- Instrumental Variables
- Propensity Score Matching
- Regression Discontinuity Design (RDD)
What is Regression Discontinuity Design (RDD)?
Causal Inference
Used when treatment is assigned based on a cutoff score (e.g. test scores, age)
Compares individuals just above and just below the cutoff, assuming they are otherwise similar.
Example: suppose scholarships are given only to students with GPA >3.0. RDD would compare students just above (eligible) and just below (ineligible) the cutoff to estimate the effect of scholarships on academic success.
Requires a clear cutoff.
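A simulated sketch of the scholarship example. The data-generating process is invented: income depends smoothly on GPA plus a jump of 4 at the cutoff, and the estimate compares local linear fits on each side of the cutoff:

```python
import numpy as np

rng = np.random.default_rng(7)
# GPA is the running variable, income the outcome;
# the true effect of the scholarship is a jump of 4 at GPA 3.0
gpa = rng.uniform(2.0, 4.0, size=2000)
scholarship = gpa > 3.0                             # sharp cutoff
income = 30 + 5 * gpa + 4 * scholarship + rng.normal(0, 2, size=2000)

# Fit a local linear regression within a bandwidth on each side,
# then compare the two fitted values at the cutoff itself
bandwidth = 0.2
left = (gpa > 3.0 - bandwidth) & (gpa <= 3.0)
right = (gpa > 3.0) & (gpa < 3.0 + bandwidth)
left_fit = np.polyfit(gpa[left], income[left], 1)
right_fit = np.polyfit(gpa[right], income[right], 1)
effect = np.polyval(right_fit, 3.0) - np.polyval(left_fit, 3.0)
print(f"RDD estimate at the cutoff: {effect:.2f}")
```

Fitting a line on each side, rather than naively differencing the means near the cutoff, removes the bias from income trending with GPA inside the bandwidth.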
What are the key assumptions of regression analysis?
Econometrics
- Linearity: linear relationship between predictors and outcome variable
- Independence: observations are independent of each other
- Homoscedasticity: Constant variance of residuals across values of independent variables
- No multicollinearity: independent variables arent highly correlated
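Two of these assumptions can be checked from the residuals of a fitted model. A sketch with numpy on simulated data that satisfies the assumptions by construction:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=500)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=500)   # true line plus i.i.d. noise

# Fit y = b0 + b1*x by ordinary least squares
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Crude diagnostics:
# - linearity / zero-mean errors: residuals should average ~0
# - homoscedasticity: |residuals| should not trend with x
print(f"coefficients: {beta.round(2)}")
print(f"mean residual: {residuals.mean():.4f}")
print(f"corr(x, |resid|): {np.corrcoef(x, np.abs(residuals))[0, 1]:.3f}")
```

A strong correlation between x and the residual magnitudes would hint at heteroscedasticity; in practice residual-vs-fitted plots and formal tests (e.g. Breusch-Pagan) are used alongside checks like these.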