MFDS 2 endsem Flashcards

Question 1

Q

SSE, MSE and MAE

Answer

A

SSE - sum of squared residuals
- lower value = better fit (not size independent: grows with the number of observations)
MSE - mean of squared residuals (SSE divided by the number of observations)
- penalizes large errors more due to squaring
MAE - mean of absolute differences between actual and predicted values
- more robust to outliers
- weights all errors equally
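
A minimal sketch of the three metrics in plain Python. The function name `regression_errors` and the sample numbers are made up for illustration; the last prediction is deliberately far off to show how squaring inflates SSE and MSE relative to MAE.

```python
def regression_errors(actual, predicted):
    """Return (SSE, MSE, MAE) for two equal-length sequences."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    sse = sum(r ** 2 for r in residuals)       # sum of squared residuals
    mse = sse / n                              # mean of squared residuals
    mae = sum(abs(r) for r in residuals) / n   # mean of absolute residuals
    return sse, mse, mae


# Hypothetical data: residuals are 1, 0, -1 and -4.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 13.0]

sse, mse, mae = regression_errors(actual, predicted)
print(sse, mse, mae)  # prints: 18.0 4.5 1.5
```

The single residual of -4 contributes 16 of the 18 units of SSE but only 4 of the 6 units of total absolute error, which is why MAE is described as more robust to outliers.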

Question 2

Q

What is VIF and its importance?

Answer

A

VIF (Variance Inflation Factor) - a diagnostic tool in multiple linear regression used to detect multicollinearity
1 Detects multicollinearity
- multicollinearity makes coefficient estimates unstable and unreliable
2 Improves interpretability
- with correlated predictors it is hard to isolate the effect of a single predictor
3 Prevents overfitting
- multicollinearity increases complexity without adding new info
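
A rough sketch of how VIF is computed, assuming the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. To stay dependency-free it covers only the two-predictor case, where R_j^2 equals the squared Pearson correlation between the two predictors. The helper names and data are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)


# Made-up predictors: x2_collinear is almost 2 * x1, x2_unrelated is uncorrelated with x1.
x1 = [1, 2, 3, 4, 5]
x2_collinear = [2.1, 3.9, 6.2, 7.8, 10.1]
x2_unrelated = [2, 1, 5, 1, 2]

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(round(vif_high, 1), round(vif_low, 1))
```

A VIF of 1 means no inflation. A commonly quoted rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, though the exact cut-off is a convention rather than a fixed rule.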

Question 3

Q

Variable selection techniques

Answer

A

Help determine the most relevant predictors for a model
1 Filter
- rank variables with statistical tests, independent of any model (ANOVA, Chi-square test)
2 Wrapper
- use a predictive model to evaluate different subsets of variables (Forward selection, Backward elimination)
3 Embedded
- perform variable selection as part of model training (Lasso regression, Tree-based)
4 Dimensionality reduction
- transform variables into a reduced set (PCA)
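
As an illustration of the filter idea only, the sketch below scores each candidate variable by its absolute Pearson correlation with the target and keeps the top k. Correlation is a stand-in here for the ANOVA and Chi-square tests named in the card; the feature names and values are invented.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def filter_select(features, target, k):
    """Return the names of the k features most correlated (in absolute value) with the target."""
    scores = {name: abs(pearson_r(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical data set with three candidate predictors.
features = {
    "size": [1, 2, 3, 4, 5],
    "noise": [2, 1, 5, 1, 2],
    "age": [5, 3, 4, 1, 2],
}
target = [2.1, 3.9, 6.2, 7.8, 10.1]

selected = filter_select(features, target, k=2)
print(selected)  # prints: ['size', 'age']
```

No model is trained at any point, which is what separates a filter from wrapper and embedded methods.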

Question 4

Q

What are the types of time series?

Answer

A

1 Based on number of variables
Univariate (a single variable wrt time)
Multivariate (several variables wrt time, e.g. stock values)
2 Based on behaviour
Stationary (atmospheric temperature wrt time)
Non-stationary (house pricing wrt time)
3 Based on seasonality
Seasonal (winter clothing)
Trend (fashion, stock rise)
4 Based on type of data collection
Continuous (breathing)
Discrete (shopkeeper profit)
Irregular (earthquakes - random observations)

Question 5

Q

What are stationarity and autocorrelation?

Answer

A

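Stationarity - statistical properties (mean, variance, autocovariance) stay the same over time
Autocorrelation - correlation between a time series and lagged copies of itself
PACF (partial autocorrelation function) - correlation at a given lag after removing the effect of the shorter lags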
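
A small sketch of the sample autocorrelation at a given lag, using the usual estimator (lagged covariance divided by the lag-0 variance term). The function name and the repeating toy series are assumptions for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Toy series that repeats every 4 steps.
series = [1, 2, 3, 2] * 6

for lag in (0, 2, 4):
    print(lag, round(autocorrelation(series, lag), 4))
```

The value is strongly positive at lag 4 (the series lines up with itself one full cycle later) and strongly negative at lag 2 (peaks line up with troughs), which is the pattern a correlogram shows for seasonal data.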

Question 6

Q

Provide an example where seasonal decomposition is necessary

Answer

A

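Seasonal decomposition - the process of breaking down a time series into its distinct components (trend, seasonal, residual).
Example: sales of winter clothing, where the yearly winter peak has to be separated from the underlying trend before forecasting
Need: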
Improve forecast accuracy
Increase interpretability
Isolate seasonal effects
Understand structure of the data
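
A minimal sketch of classical additive decomposition (series = trend + seasonal + residual), assuming a known period. The trend is a centered moving average and the seasonal effect is the average detrended value at each position in the cycle. The function name and the quarterly numbers are invented; library implementations handle edge cases that this sketch ignores.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    n = len(series)
    half = period // 2
    trend = [None] * n  # undefined at the edges where the window does not fit
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight so the window stays centered.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period

    # Average the detrended values at each position in the cycle.
    seasonal_means = []
    for pos in range(period):
        vals = [series[t] - trend[t] for t in range(pos, n, period) if trend[t] is not None]
        seasonal_means.append(sum(vals) / len(vals))

    # Shift the seasonal effects so they sum to zero over one cycle.
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual


# Hypothetical quarterly sales: steady growth plus a peak in the first quarter of each year.
pattern = [5, -1, -3, -1]
series = [10 + 2 * t + pattern[t % 4] for t in range(12)]

trend, seasonal, residual = decompose_additive(series, period=4)
print(seasonal[:4])
print(trend[2:10])
```

Because the toy series was built as an exact linear trend plus a repeating pattern, the recovered seasonal component matches `pattern` and the residuals are zero; real data would leave noise in the residual.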

Question 7

Q

Difference between stationary and non-stationary time series

Answer

A

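Each point is listed as (stationary, non-stationary)
Mean (constant, changing over time)
Variance (constant, changing - often increasing)
Auto-covariance (depends only on lag, depends on time as well)
Trend (absent, present)
Seasonality (absent, present)
Forecasting (easier to model directly, usually needs differencing or detrending first)
Example (atmospheric temperature, house pricing)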

Question 8

Q

Optimization and its types

Answer

A

In ML and DS, optimization refers to the process of adjusting model parameters to minimize or maximize an objective function.
Convex (LR - minimize a convex function)
Gradient-based (Neural Network - uses the derivative of the loss function)
Gradient-free (Genetic Algorithm - when the derivative is not available)
Constrained (optimization under constraints - Lagrange multipliers, quadratic programming)
Unconstrained (no constraints - LR)
Discrete (variables take only discrete values, e.g. integers or 0/1 - Travelling salesman problem)
Continuous (variables take any real value - Neural Network, Gradient Descent)

Question 9

Q

Gradient descent method

Answer

A

Iteratively minimizes a differentiable function (here a quadratic), starting from some initial point and stepping against the derivative

x_new = x_old - alpha * f'(x_old)
alpha = learning rate

ex. f(x) = x^2+4x+4
5 iterations (0,5.00)
with alpha = 0.1
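
The update rule can be traced on the card's example with a short sketch. The starting point x0 = 5.0 is an assumption (one reading of the "(0,5.00)" note as iteration 0, x = 5.00); the function names are also illustrative.

```python
def gradient_descent(grad, x0, alpha, iterations):
    """Apply x_new = x_old - alpha * grad(x_old) and return every iterate."""
    x = x0
    history = [x]
    for _ in range(iterations):
        x = x - alpha * grad(x)
        history.append(x)
    return history


def f(x):
    return x ** 2 + 4 * x + 4


def f_prime(x):
    return 2 * x + 4


# x0 = 5.0 is assumed; alpha and the iteration count come from the card.
history = gradient_descent(f_prime, x0=5.0, alpha=0.1, iterations=5)
for i, x in enumerate(history):
    print(i, round(x, 5), round(f(x), 5))
```

Since f(x) = (x + 2)^2, the true minimizer is x = -2. With alpha = 0.1 each step shrinks the distance to -2 by a factor of 0.8, so five iterations move x from 5 to about 0.29376 and many more are needed to get close to -2.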

Question 10

Q

Constrained optimization and its techniques

Answer

A

Minimize or maximize an objective function subject to constraints (equality or inequality)
1 Lagrange multipliers (converts a constrained problem into an unconstrained one, for equality constraints)
2 Karush-Kuhn-Tucker (KKT) conditions (generalization of Lagrange multipliers, for nonlinear inequality constraints)
3 Penalty method (add penalty for constraint violation, large penalty -)
4 Barrier method (adds a barrier term to the objective function to keep the solution out of the infeasible region)
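
A sketch of the penalty method on a made-up problem: minimize (x - 3)^2 subject to x <= 1. The constraint is replaced by a quadratic penalty mu * max(0, x - 1)^2 and the unconstrained problem is solved by gradient descent for increasing mu. The problem, the step-size choice and all names are assumptions for illustration.

```python
def minimize_penalized(mu, x_start, steps=1000):
    """Gradient descent on (x - 3)^2 + mu * max(0, x - 1)^2."""
    x = x_start
    # Step size 1 / (largest curvature of the penalized objective) keeps the descent stable.
    step_size = 1.0 / (2.0 + 2.0 * mu)
    for _ in range(steps):
        violation = max(0.0, x - 1.0)  # how far x is outside the feasible region
        grad = 2.0 * (x - 3.0) + 2.0 * mu * violation
        x -= step_size * grad
    return x


x = 0.0
solutions = []
for mu in [1.0, 10.0, 100.0, 1000.0]:
    x = minimize_penalized(mu, x)  # warm start from the previous solution
    solutions.append(x)
    print(mu, round(x, 4))
```

For this problem the penalized minimizer is (3 + mu) / (1 + mu), so the iterates are always slightly infeasible (x > 1) and approach the constrained optimum x = 1 as mu grows. A barrier method would instead approach the boundary from inside the feasible region.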

Question 11

Q

Metaheuristic optimization

Answer

A

General-purpose, often nature-inspired search algorithms that look for good (not guaranteed optimal) solutions to hard optimization problems
1 Genetic Algorithm
(evolves a population of solutions, ex. feature selection)
2 Particle Swarm Optimization
(particles fly through the solution space, ex. clustering problems)
3 Simulated Annealing
(occasionally accepts a worse solution to escape local optima, ex. Travelling Salesman Problem)
4 Differential Evolution
(works with vectors, recombination and mutation, ex. parameter optimization)
5 Ant Colony Optimization
(builds solutions based on pheromone trails, ex. shortest distance on graphs)
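
To make the "accepts worse solution" idea concrete, here is a minimal simulated annealing sketch. It minimizes a one-dimensional function with two minima rather than the Travelling Salesman Problem named in the card, and every parameter value (temperature, cooling rate, step size, seed) is an arbitrary assumption.

```python
import math
import random


def simulated_annealing(f, x0, temp=10.0, cooling=0.95, steps=500, step_size=1.0, seed=0):
    """Minimize f starting from x0; return the best point found and its value."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        fc = f(candidate)
        delta = fc - fx
        # Always accept improvements; accept a worse point with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        temp *= cooling  # cool down: worse moves become less likely over time
    return best_x, best_fx


def objective(x):
    # Has a local minimum near x = 3.8 and a lower one near x = -1.3.
    return x ** 2 + 10 * math.sin(x)


best_x, best_fx = simulated_annealing(objective, x0=4.0)
print(round(best_x, 3), round(best_fx, 3))
```

Early on, the high temperature lets the search climb out of the basin around the starting point; as the temperature drops the rule becomes almost greedy. The result depends on the random seed and the cooling schedule, so a run is not guaranteed to reach the global minimum.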

(11 cards)