Quant Flashcards
5 Assumptions to use a multiple regression model
1) Linearity
2) Homoskedasticity
3) Independence of Errors
4) Normality
5) Independence of Independent Variables
Linearity Assumption
The relationship between the independent variable(s) and dependent variable needs to be linear
Homoskedasticity Assumption
The variance of the regression residuals should be the same for all observations
Independence of Errors Assumption
The regression residuals are independent of one another and uncorrelated across observations
Normality Assumption
The regression residuals are normally distributed
Independence of Independent Variables Assumption
Independent variables are not random, and there is no exact linear relationship between two or more of them
Adjusted R-Squared
Adjusted version of R-squared that penalizes additional independent variables; it increases only when a newly introduced variable improves the model by more than chance alone would predict
AIC v. BIC
AIC is for prediction
BIC is for goodness of fit
Lower values are better for both
F Statistic
[(SSE of restricted - SSE of unrestricted) / q] / [SSE of unrestricted / (n - k - 1)]
SSE is the sum of squared errors (residuals); q is the number of restrictions
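As a worked example of the formula above (all numbers hypothetical):

```python
# Joint F-test of q restrictions (hypothetical numbers).
# Restricted model drops q = 2 variables; n = 50 observations, k = 4 slopes.
sse_restricted, sse_unrestricted = 120.0, 100.0
n, k, q = 50, 4, 2

f_stat = ((sse_restricted - sse_unrestricted) / q) / (
    sse_unrestricted / (n - k - 1)
)
print(f_stat)  # 4.5 -> compare to critical F with q and n-k-1 df
```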
T Stat when only given coefficient and standard error, and what is null hypothesis
coefficient / standard error; null hypothesis is that the coefficient equals 0 (does not differ significantly from 0)
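A quick numeric sketch of that calculation (coefficient and standard error are hypothetical):

```python
# t-stat from a reported coefficient and its standard error
coef, std_err = 1.8, 0.75
t_stat = coef / std_err
# reject H0 (coefficient = 0) when |t| exceeds the critical value,
# roughly 1.96 for a large sample at the 5% two-tailed level
significant = abs(t_stat) > 1.96
print(t_stat, significant)  # 2.4 True
```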
Breusch Pagan Test (BP)
- What does it test for
- What is the formula
1) Conditional Heteroskedasticity - variance in residuals differs across observations
2) BP = n * R-squared, where R-squared comes from regressing the squared residuals on the independent variables (chi-square distributed with k degrees of freedom)
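A minimal NumPy sketch of the BP statistic, using simulated data whose error variance depends on the first regressor (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
X = rng.normal(size=(n, k))
# simulate conditional heteroskedasticity: error variance grows with X[:, 0]
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n) * np.exp(X[:, 0])

Xc = np.column_stack([np.ones(n), X])          # design matrix with intercept
beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)  # OLS fit
resid = y - Xc @ beta

# auxiliary regression: squared residuals on the same independent variables
u2 = resid ** 2
gamma, *_ = np.linalg.lstsq(Xc, u2, rcond=None)
r2_aux = 1 - np.sum((u2 - Xc @ gamma) ** 2) / np.sum((u2 - u2.mean()) ** 2)

bp_stat = n * r2_aux  # compare to chi-square critical value with k df
```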
2 Types of Heteroskedasticity
1) Conditional - error variance is correlated with independent variables (much bigger problem) - high probability of Type 1 errors
2) Unconditional - less problematic, no correlations
Durbin-Watson Test (DW)
A test for first-order serial correlation in time series model
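The DW statistic itself is the sum of squared changes in successive residuals over the sum of squared residuals; a quick sketch with a made-up residual series:

```python
import numpy as np

# Durbin-Watson statistic on a hypothetical residual series.
# DW near 2: no first-order serial correlation; DW < 2: positive; DW > 2: negative.
resid = np.array([0.5, -0.2, 0.3, -0.4, 0.1, 0.2, -0.3])
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(round(dw, 3))  # 2.559 -> alternating signs suggest negative autocorrelation
```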
Breusch-Godfrey Test (BG)
A test used to determine autocorrelation up to a predesignated order of the lagged residuals in a time series model
Multicollinearity
When two or more independent variables are correlated to each other
Test for multicollinearity
Variance inflation factor (VIF)
1 / (1 - R-squared), where R-squared comes from regressing that independent variable on the remaining independent variables
Any value over 5 warrants investigation
Any value over 10 means multicollinearity is likely
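A NumPy sketch of the VIF calculation on simulated data, where x2 is deliberately built to be nearly collinear with x1 (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)                     # unrelated to the others

def vif(target, others):
    """VIF for one regressor: 1 / (1 - R^2) from regressing it on the others."""
    X = np.column_stack([np.ones(len(target))] + others)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted = X @ beta
    r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
    return 1 / (1 - r2)

vif_x1 = vif(x1, [x2, x3])  # large: multicollinearity likely (> 10)
vif_x3 = vif(x3, [x1, x2])  # near 1: no collinearity concern
```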
Two types of observations that may influence regression results
1) High Leverage Point
2) Outlier
Difference between high leverage point and outlier
A high leverage point has an extreme x value; an outlier has an extreme y value. A point can be both high leverage and an outlier
How to calculate if a point is high leverage
Leverage
If leverage exceeds 3*(k+1)/n
k - independent variables
n - observations
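The leverage values are the diagonal of the hat matrix; a sketch on simulated data with one deliberately extreme x observation (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 40, 2
X = rng.normal(size=(n, k))
X[0] = [8.0, 8.0]                      # one extreme x observation

Xc = np.column_stack([np.ones(n), X])  # design matrix with intercept
# leverage = diagonal of the hat matrix H = X (X'X)^-1 X'
H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
leverage = np.diag(H)

threshold = 3 * (k + 1) / n            # 3*(k+1)/n rule of thumb
high_leverage = leverage > threshold   # flags observation 0
```

A useful sanity check: the leverage values always sum to k + 1, the number of estimated coefficients.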
When looking at regression, determine if independent variable is significantly different from 0
If |t stat| > critical t value (equivalently, if p-value < significance level), it is significantly different from 0
T stat if not given is coefficient / standard error
Method to identify if an observation is an outlier, and what is the formula
Studentized deleted residuals
t(i) = e(i) / s(e(i)), where e(i) is the residual of observation i from the regression fit with observation i deleted, and s(e(i)) is the standard error of that deleted residual
if greater than 3 or greater than the critical t stat with n-k-2 degrees of freedom, observation is an outlier
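A brute-force sketch of the deleted-residual idea: refit without observation i, predict it, and studentize. Data and the injected outlier are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 30, 2
X = rng.normal(size=(n, k))
Xc = np.column_stack([np.ones(n), X])
y = Xc @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
y[3] += 8.0                                 # inject one outlier in y

def studentized_deleted(i):
    """Refit without obs i, predict it, and studentize the deleted residual."""
    mask = np.arange(n) != i
    Xi, yi = Xc[mask], y[mask]
    beta, *_ = np.linalg.lstsq(Xi, yi, rcond=None)
    mse_i = np.sum((yi - Xi @ beta) ** 2) / (n - 1 - (k + 1))  # df = n-k-2
    xinv = np.linalg.inv(Xi.T @ Xi)
    var_pred = mse_i * (1 + Xc[i] @ xinv @ Xc[i])
    return (y[i] - Xc[i] @ beta) / np.sqrt(var_pred)

t3 = studentized_deleted(3)  # flagged: |t| > 3 (or critical t, df = n-k-2)
```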
When is an observation considered influential
If its exclusion from the sample causes substantial changes in the regression function
Cook’s D
Metric for identifying influential observations
Interpreting Cook’s D
If value is greater than 0.5, possibly influential
If value is greater than 1, likely influential
If value greater than 2*SqRt(k/n), likely influential
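A NumPy sketch computing Cook's D from the residuals and leverage values, with one observation deliberately made both high leverage and an outlier (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 40, 2
X = rng.normal(size=(n, k))
X[5] = [3.0, 3.0]                         # extreme x (high leverage)
Xc = np.column_stack([np.ones(n), X])
y = Xc @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
y[5] += 10.0                              # and an extreme y (outlier)

beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
resid = y - Xc @ beta
h = np.diag(Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T)   # leverage values
mse = resid @ resid / (n - k - 1)

# Cook's D: scaled change in all fitted values if observation i is removed
cooks_d = (resid ** 2 / ((k + 1) * mse)) * (h / (1 - h) ** 2)

rule_of_thumb = 2 * np.sqrt(k / n)        # observation 5 far exceeds this
```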