Lecture 9: Feature Engineering Flashcards
(5 cards)
Parameters
Parameter
Parameters are components of the mathematical construct that
generalizes the dataset in the form of a model or equation, e.g.,
the coefficients of a linear regression
Values of parameters are mathematically derived or algorithmically
learned from the dataset
Parameters are inherently internal to the model
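To make this concrete, a minimal sketch (the `fit_line` helper and its data are illustrative, not from the lecture): fitting a line by ordinary least squares derives the model's parameters, slope and intercept, purely from the dataset.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    # The returned values are *parameters*: derived from the data itself.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Data generated from y = 2x + 1: fitting recovers those coefficients
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```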
Hyper-Parameter
Hyper-parameter
Configuration variables whose values are not derived from the
dataset
Hyper-parameters are inherently external to the model
Hyper-parameter values must be decided prior to the training
process and are typically specified by the machine learning
engineer
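A hypothetical sketch of the distinction (the trainer and its names are illustrative): the hyper-parameter dictionary is fixed by the engineer before training, while the learned values come out of the data.

```python
# Hyper-parameters: chosen by the engineer, external to the model
hyper_params = {"max_depth": 5, "learning_rate": 0.1}

def train(data, hp):
    # Stand-in trainer: the learned parameter comes from the dataset,
    # while hp only configures *how* training behaves
    learned = {"mean": sum(data) / len(data)}
    return {"hyper": hp, "params": learned}

model = train([2, 4, 6], hyper_params)
```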
Hyper-Parameter Tuning
Tuning of Hyper-parameters
A model may have many hyper-parameters
e.g., for decision trees: resampling method, number of trees, maximum
depth, number of splits per node, maximum number of samples per leaf,
etc.
Each of these hyper-parameters can take many values
Resulting in numerous possible combinations (which can be
visualized as a grid)
There might exist ONE combination that results in the best-performing
model
The question is: how do we find that combination?
Trying out the combinations manually would be
overwhelming
Essentially, this is an optimization problem
Running the target algorithm over the possible combinations of
hyper-parameters is known as a parameter sweep
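The grid picture can be sketched directly: enumerating every hyper-parameter combination with `itertools.product` (the grid values below are illustrative).

```python
import itertools

# The grid of candidate hyper-parameter values (illustrative numbers)
grid = {
    "max_depth": [3, 5, 10],
    "n_trees": [50, 100],
    "min_samples_leaf": [1, 5],
}

# A parameter sweep enumerates every combination in the grid
combos = [dict(zip(grid, values))
          for values in itertools.product(*grid.values())]
# 3 * 2 * 2 = 12 candidate configurations to train and evaluate
```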
Hyper-parameter Tuning: Parameter Sweep Strategies
Entire Grid Sweep
This is the case where the engineer doesn't know the optimal
value of any of the hyper-parameters and needs to exhaustively
evaluate all possible options
Running the predictor algorithm with all possible hyper-parameter
value combinations
Most expensive option in terms of computational resources and run
time
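An entire grid sweep, sketched with a hypothetical `score` function standing in for "train the model and measure validation performance" (in practice a library routine such as scikit-learn's GridSearchCV automates this):

```python
import itertools

def score(hp):
    # Stand-in for: train the predictor with hp, return validation score
    # (peaks at max_depth=5, n_trees=100 by construction, for illustration)
    return -abs(hp["max_depth"] - 5) - abs(hp["n_trees"] - 100) / 100

grid = {"max_depth": [3, 5, 10], "n_trees": [50, 100, 200]}

# Exhaustively evaluate all 3 * 3 = 9 combinations, keep the best
combos = [dict(zip(grid, v)) for v in itertools.product(*grid.values())]
best = max(combos, key=score)
```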
Random Grid Sweep
Used when the optimal values of one or more hyper-parameters are
known and fixed manually
The unknown ones are left for the sweep, where values are picked
randomly by the algorithm
Can improve execution speed significantly
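A sketch of a random grid sweep, under the assumption that `max_depth` is already known: it is fixed manually, and only random samples from the remaining grid are evaluated (all names and values are illustrative).

```python
import itertools
import random

# max_depth is assumed known and fixed by the engineer
fixed = {"max_depth": 5}
open_grid = {"n_trees": [50, 100, 200], "min_samples_leaf": [1, 2, 5]}

rng = random.Random(0)  # seeded for reproducibility
candidates = [dict(zip(open_grid, v))
              for v in itertools.product(*open_grid.values())]

# Evaluate only 4 randomly chosen combinations instead of all 9
trials = [{**fixed, **rng.choice(candidates)} for _ in range(4)]
```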
Random Sweep
Useful when some or all of the hyper-parameters are continuous
variables
The algorithm picks random values within a range
Number of iterations can be controlled by another hyper-parameter
Used when one or a limited number of hyper-parameters are
targeted
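A sketch of a random sweep over one continuous hyper-parameter, with a hypothetical score function standing in for validation performance; note that the number of iterations is itself a hyper-parameter of the search.

```python
import random

def random_sweep(low, high, n_iter, score, seed=0):
    # Draw n_iter values uniformly from [low, high] and keep the best;
    # n_iter controls the cost (and thoroughness) of the search
    rng = random.Random(seed)
    trials = [rng.uniform(low, high) for _ in range(n_iter)]
    return max(trials, key=score)

# Hypothetical score function that peaks at a learning rate of 0.1
best_lr = random_sweep(0.0, 1.0, n_iter=50,
                       score=lambda lr: -abs(lr - 0.1))
```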
Overfitting / Underfitting: Other Techniques
K-Fold Cross Validation during Model Training
K-Fold Randomized Stratified Data Partitioning
Training & Test Data Split
For Tree-based Algorithms: Tree Pruning
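A sketch of how K-fold cross validation partitions the data, using indices and contiguous folds for brevity (shuffling and stratification, which the lecture's "randomized stratified" variant adds, are omitted; library routines such as scikit-learn's KFold/StratifiedKFold implement the full version).

```python
def kfold_indices(n, k):
    # Split indices 0..n-1 into k folds of near-equal size;
    # each fold is held out once as the test set while the
    # remaining k-1 folds train the model
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = kfold_indices(10, 3)  # fold sizes: 4, 3, 3
```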