Quantitative Methods Flashcards

Question

Cook's Distance

Answer 1

Metric for identifying influential data points; measures how much the estimated values of the regression change after deleting an observation
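
A minimal sketch of the calculation for a simple (one-variable) regression, using the standard leverage-based formula. The function name and the sample data are hypothetical and chosen only for illustration.

```python
def cooks_distance(x, y):
    """Cook's distance for each point in the regression y = b0 + b1 * x."""
    n = len(x)
    p = 2  # number of estimated coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        distances.append((e ** 2 / (p * mse)) * leverage / (1 - leverage) ** 2)
    return distances

# Hypothetical data: the last point sits far from the others in x and off the trend.
x = [1, 2, 3, 4, 5, 10]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
d = cooks_distance(x, y)
most_influential = d.index(max(d))  # index of the observation with the largest distance
```

A large distance relative to the other observations flags a point whose removal would noticeably change the fitted values.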

Answer 2

Transforms Qualitative Dependent Variable into a Linear relationship with the independent variables
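
Assuming this card describes the logit (log odds) transformation, which the answer text does not name, the mapping and its inverse can be sketched as follows. The function names are illustrative.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to log odds on the whole real line."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map log odds back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

log_odds = logit(0.8)              # log(0.8 / 0.2) = log(4)
recovered = inverse_logit(log_odds)
```

The log odds can then be modeled as a linear function of the independent variables, while the inverse keeps predicted probabilities between 0 and 1.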

Answer 3

Method to assess the fit of Logistic Regression models | Higher (less negative) values, closer to 0, indicate a better fit

Answer 4

Time Series with Linear Trend

Answer 5

Commonly used with time series that have exponential growth
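
Assuming this card refers to a log-linear trend model, ln(y_t) = b0 + b1 * t (the notation is an assumption, not taken from the card), a minimal least-squares fit might look like this. The series is made up for illustration.

```python
import math

def fit_log_linear_trend(values):
    """Fit ln(y_t) = b0 + b1 * t by ordinary least squares, with t = 1..T."""
    logs = [math.log(v) for v in values]
    periods = list(range(1, len(values) + 1))
    n = len(values)
    t_bar = sum(periods) / n
    log_bar = sum(logs) / n
    sxx = sum((t - t_bar) ** 2 for t in periods)
    sxy = sum((t - t_bar) * (y - log_bar) for t, y in zip(periods, logs))
    b1 = sxy / sxx
    b0 = log_bar - b1 * t_bar
    return b0, b1

# Hypothetical series growing 5% per period from a base of 100.
series = [100 * 1.05 ** t for t in range(1, 11)]
b0, b1 = fit_log_linear_trend(series)
growth_rate = math.exp(b1) - 1  # implied constant growth rate per period
```

Taking logs turns constant-rate growth into a straight line, which is why this form suits exponentially growing series.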

Answer 6

Time series model regressed on its own past values

Answer 7

1. The expected value of the time series must be constant and finite in all periods 2. The variance of the time series must be constant and finite in all periods 3. The covariance of the time series with itself, for a fixed number of periods in the past or future, must be constant and finite in all periods

Answer 8

The value of the time series in one period equals its value in the previous period plus a random error term. ## Footnote Use the Dickey-Fuller test
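
A minimal simulation of the process described on this card, x_t = x_{t-1} + e_t. The normally distributed shocks, the seed, and the function name are assumptions made for the sketch.

```python
import random

def simulate_random_walk(n_periods, start=0.0, seed=0):
    """Return the simulated path and the shocks that generated it."""
    rng = random.Random(seed)
    path = [start]
    shocks = []
    for _ in range(n_periods):
        shock = rng.gauss(0.0, 1.0)      # error term for this period
        shocks.append(shock)
        path.append(path[-1] + shock)    # previous value plus the error
    return path, shocks

path, shocks = simulate_random_walk(50)
```

Each period's change is just that period's shock, so the level of the series has no tendency to return to a fixed mean.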

Answer 9

Used to test for a unit root; if there is a unit root, then the time series is a random walk. | Tests the null hypothesis g = 0, where g = b1 - 1
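
A rough sketch of the regression behind the test, x_t - x_{t-1} = b0 + g * x_{t-1} + error, where b0 and the function name are assumed notation. It only estimates g; a real test compares the t-statistic on g with Dickey-Fuller critical values rather than standard t-values, which this sketch does not attempt.

```python
def dickey_fuller_slope(series):
    """Estimate g by regressing the first difference on the lagged level."""
    lagged = series[:-1]
    diffs = [curr - prev for prev, curr in zip(series[:-1], series[1:])]
    n = len(lagged)
    lag_bar = sum(lagged) / n
    diff_bar = sum(diffs) / n
    sxx = sum((v - lag_bar) ** 2 for v in lagged)
    sxy = sum((v - lag_bar) * (d - diff_bar) for v, d in zip(lagged, diffs))
    return sxy / sxx

# Hypothetical noise-free AR(1) series with b1 = 0.5, so g = b1 - 1 = -0.5.
series = [10.0]
for _ in range(20):
    series.append(1.0 + 0.5 * series[-1])
g = dickey_fuller_slope(series)
```

An estimate of g close to 0 is consistent with a unit root, while a clearly negative g points toward a mean-reverting series.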

Answer 10

Used to smooth out period to period fluctuations in time series models
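
As an illustration of the smoothing idea, a simple n-period moving average can be sketched as below. The function name and data are hypothetical.

```python
def simple_moving_average(values, window):
    """Average each run of `window` consecutive observations."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]

smoothed = simple_moving_average([1, 2, 3, 4, 5], window=3)
```

The output is shorter than the input by `window - 1` observations because the first full window ends at the third data point.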

Answer 11

Combines Autoregressive and Moving Average Time Series ## Footnote Can be very unstable

Answer 12

Way of testing whether an AR model has heteroskedasticity

Answer 13

A long-term financial or economic relationship exists between two time series, so they do not diverge in the long run

Answer 14

Infers patterns between inputs (features) and Outputs (targets); uses labeled data

Answer 15

Seeks to identify structure in unlabeled data; used in 1. Dimension Reduction (reducing the number of features) 2. Clustering

Answer 16

Does not generalize well to new data

Answer 17

Degree to which the model fits the training data; a high value produces underfitting and in-sample errors

Answer 18

How much the model's results change in response to new data; Causes overfitting and out-of-sample errors

Answer 19

Due to Randomness of Data

Answer 20

Method of reducing overfitting

Answer 21

Used to randomize the data into training and validation samples

Answer 22

A type of Penalized Regression in which a penalty applies as features are added to the regression

Answer 23

Parameter selected by the researcher before learning begins

Answer 24

Optimally separates the data into two sets

Answer 25

Supervised learning technique used mostly for classification and sometimes for regression

Answer 26

Supervised learning technique used in both classification and regression. Commonly applied to binary classification or regression

Answer 27

Combining the predictions from a collection of models

Answer 28

Technique where the original dataset is used to create *n* new datasets
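
Assuming this card describes bootstrap aggregating (bagging), where each new dataset is drawn from the original with replacement, the resampling step might be sketched as follows. The function name, seed, and data are illustrative.

```python
import random

def bootstrap_datasets(data, n_datasets, seed=0):
    """Draw n_datasets resamples, each the same size as data, with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_datasets)]

original = [3, 7, 11, 15, 19]
resamples = bootstrap_datasets(original, n_datasets=4)
```

A separate model would then be trained on each resample and the predictions combined, for example by majority vote or averaging.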

Answer 29

Large number of decision trees trained via a bagging method

Answer 30

Transforms many highly correlated features of data into a smaller number of uncorrelated composite variables

Answer 31

Mutually uncorrelated composite variables that are linear combinations of the original features | Represents a direction

Answer 32

Represents the proportion of the total variance explained by each eigenvector
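
To make the proportion concrete, here is a small sketch that turns a set of eigenvalues into shares of total variance. The eigenvalues are made up, and the function name is an assumption.

```python
def explained_variance_ratio(eigenvalues):
    """Share of total variance attributed to each eigenvalue."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]

# Hypothetical eigenvalues from a three-feature dataset.
ratios = explained_variance_ratio([3.0, 1.5, 0.5])
```

With these numbers the first component accounts for 60% of the total variance, so keeping only the first two retains 90%.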

Answer 33

A form of Unsupervised learning

Answer 34

A form of unsupervised learning

Answer 35

1. Conceptualization of the Modeling Task 2. Data Collection 3. Data Preparation and Wrangling 4. Data Exploration 5. Model Training

Answer 36

1. Text Problem Formulation 2. Data (Text) Curation 3. Text Preparation and Wrangling 4. Text Exploration

Answer 37

When extreme values and outliers are removed from the dataset | Also called truncation

Answer 38

When extreme values or outliers are replaced by the maximum (minimum) values that are not outliers
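
A minimal sketch of this replacement step, assuming the caller already knows the smallest and largest values that are not outliers. How those bounds are chosen is outside the sketch, and the names and data are hypothetical.

```python
def winsorize(values, lower, upper):
    """Replace values below lower with lower and values above upper with upper."""
    return [min(max(v, lower), upper) for v in values]

capped = winsorize([1, 50, 52, 55, 300], lower=50, upper=55)
```

Unlike removing the extreme observations, this keeps the sample size unchanged.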

Answer 39

Process of rescaling numeric variables in the range of [0,1]
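
One common way to do this rescaling is the min-max formula, (x - min) / (max - min), sketched below with a hypothetical function name.

```python
def normalize(values):
    """Rescale values to [0, 1]; assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scaled = normalize([10, 20, 30])
```

Because the result depends on the minimum and maximum, a single outlier can compress the rest of the values into a narrow band.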

Answer 40

Process of both centering and scaling the variables ## Footnote Data must be normally distributed to be effective
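
A minimal sketch of centering and scaling: subtract the mean, then divide by the standard deviation. Using the sample standard deviation is an assumption made here, and the function name is illustrative.

```python
import statistics

def standardize(values):
    """Center on the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(v - mean) / sd for v in values]

z_scores = standardize([2, 4, 6])
```

The standardized values have a mean of 0 and a standard deviation of 1, which puts variables measured in different units on a comparable scale.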