Chapter 8 Flashcards Preview


Flashcards in Chapter 8 Deck (20)

3 broad categories of goals of time series analysis

time series analysis:

1. understanding periodicity and trends

2. forecasting

3. control

> change the course of a temporal pattern


3 components a time series can be decomposed into

1. periodic variations

> daily, weekly, monthly seasonality

2. trend

> how its mean evolves over time

3. irregular variations

> after removing periodic and trend

> residuals
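
A minimal sketch of such a decomposition, assuming an additive model, a linear trend, and a known period. The function name `decompose` and these assumptions are illustrative and not part of the deck.

```python
import numpy as np

def decompose(series, period):
    """Additive split: series = trend + periodic + residual (assumes a linear trend)."""
    series = np.asarray(series, dtype=float)
    t = np.arange(len(series))
    # Trend: how the mean evolves over time, here approximated by a straight line.
    slope, intercept = np.polyfit(t, series, deg=1)
    trend = slope * t + intercept
    # Periodic part: average of the detrended values at each phase of the period.
    detrended = series - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    periodic = phase_means[t % period]
    # Irregular part: whatever is left after removing trend and periodicity.
    residual = series - trend - periodic
    return trend, periodic, residual

# Hypothetical data: linear drift plus a weekly (period 7) pattern.
t = np.arange(70)
series = 0.5 * t + 3.0 * np.sin(2 * np.pi * t / 7)
trend, periodic, residual = decompose(series, period=7)
```

Real data rarely has a perfectly linear trend, so a moving average or a library routine may be a better fit in practice.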


what does the concept of stationarity mean in the context of time series analysis?


we call a time series stationary if, once its trends and periodic variations are removed, the statistical properties of the residuals are constant over time

> both the expected mean and variance (as well as the autocorrelation) of the time series are constant


time series analysis:

> what is autocorrelation?


autocorrelation with lag lambda measures the correlation between a time series and a shifted variant (by lambda steps) of itself
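
A small sketch of this definition, assuming the Pearson correlation between the series and its shifted copy. The function name `autocorrelation` is illustrative.

```python
import numpy as np

def autocorrelation(series, lag):
    """Correlation between a series and a copy of itself shifted by `lag` steps."""
    series = np.asarray(series, dtype=float)
    if lag == 0:
        return 1.0
    # Pair each value with the value `lag` steps later.
    return float(np.corrcoef(series[:-lag], series[lag:])[0, 1])

# Hypothetical signal with a period of 20 steps.
signal = np.sin(2 * np.pi * np.arange(200) / 20)
full_period = autocorrelation(signal, 20)   # shifted by one full period
half_period = autocorrelation(signal, 10)   # shifted by half a period
```

For a periodic signal the autocorrelation peaks at lags equal to the period and is strongly negative at half the period.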


why can we often not use ARMA models directly to model real time series?

> solution?

in real time series (e.g. the mood of Bruce) we often observe that they are not stationary

> they do not meet the assumptions of ARMA

>>> apply differencing to remove e.g. drift in the mean

> this is what the ARIMA model does
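
A minimal sketch of first-order differencing on a synthetic series with a linear drift. The data and variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
# Hypothetical non-stationary series: the mean drifts upward by 0.3 per step.
series = 0.3 * t + rng.normal(0.0, 1.0, size=t.size)

# First-order differencing: y_t - y_{t-1}. The drift turns into a constant offset.
differenced = np.diff(series)

# Compare the mean of the first and second half before and after differencing.
raw_gap = abs(series[:100].mean() - series[100:].mean())
diff_gap = abs(differenced[:100].mean() - differenced[100:].mean())
```

The mean of the differenced series no longer drifts. Note that for independent noise, differencing doubles the noise variance, since Var(e_t - e_{t-1}) = 2 * Var(e_t).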


what is often a disadvantage when removing the mean from the data?

it increases the variance


what's the problem with backpropagation in recurrent neural networks?

> solution?

standard backpropagation does not take time into account

> when including time in the setup, the new predictions depend not only on the inputs but also on the values of the neurons in the previous step


>>> solution: unfold the network through time: create an instance of the network for each previous time step we consider (backpropagation through time)


what is the underlying principle of reservoir computing?

reservoir computing:

> a huge reservoir of fully connected hidden neurons with randomly assigned weights

> weights from input to reservoir are also randomly assigned

> only the weights from the reservoir to the output layer are learned


why are echo state networks a special case of reservoir computing?

in an ESN the connections within the reservoir can be cyclic

> this allows the network to model temporal patterns


ESN: what is the "washout time"?

washout time:

> initialization period that is used in neither the training nor the test period

> the reservoir/network needs to stabilize first
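
A minimal echo state network sketch in NumPy, assuming a tanh reservoir, a ridge-regression readout, and a toy one-step-ahead sine prediction task. The sizes, scaling factors, and variable names are illustrative choices, not values from the deck.

```python
import numpy as np

rng = np.random.default_rng(42)
n_reservoir, washout, ridge = 100, 50, 1e-4

# Input and reservoir weights are random and never trained.
w_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale spectral radius to 0.9

def run_reservoir(inputs):
    """Drive the reservoir with the inputs and collect its states."""
    state = np.zeros(n_reservoir)
    states = []
    for x in inputs:
        state = np.tanh(w_in * x + W @ state)  # cyclic connections carry memory
        states.append(state)
    return np.array(states)

signal = np.sin(np.arange(600) * 0.2)
inputs, targets = signal[:-1], signal[1:]  # predict the next value
states = run_reservoir(inputs)

# The first `washout` states are discarded: used for neither training nor testing.
train, test = slice(washout, 400), slice(400, None)
S = states[train]
# Only the reservoir-to-output weights are learned (ridge regression).
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_reservoir), S.T @ targets[train])

predictions = states[test] @ W_out
test_mse = float(np.mean((predictions - targets[test]) ** 2))
```

Because the test states continue directly from the training run, the reservoir is already stabilized there and no second washout is needed in this sketch.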


explain the echo state property

echo state property:

> the effect of a previous state r_i and a previous input x_i on a future state r_{i+k} should vanish gradually as time passes (k -> infinity) and should not persist or even get amplified


explain dynamic systems models

> use them in which case?

dynamic systems models:

> knowledge based models

> represent temporal relationships between attributes and targets by means of differential equations

> assume only numerical states

>>> use those if we have some domain knowledge we want to use
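
A minimal sketch of a dynamic systems model, assuming a single numerical state x governed by the hypothetical differential equation dx/dt = -decay * x + gain * u and integrated with Euler steps. The equation, the parameter names, and the step size are illustrative and not taken from the deck.

```python
def simulate(decay, gain, inputs, x0=0.0, dt=0.1):
    """Euler integration of dx/dt = -decay * x + gain * u for a sequence of inputs u."""
    states = [x0]
    for u in inputs:
        dx = -decay * states[-1] + gain * u   # the differential equation
        states.append(states[-1] + dt * dx)   # one Euler step
    return states

# With a constant input of 1, the state settles at gain / decay.
states = simulate(decay=0.5, gain=1.0, inputs=[1.0] * 1000)
```

The parameters `decay` and `gain` encode the domain knowledge; their values are what a method such as simulated annealing or a genetic algorithm would tune.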


how does simulated annealing work?

> when to use it?

simulated annealing:

> used to tune the parameters of e.g. a dynamic systems model

> make random steps in the parameter space and see whether performance improves

> moves that do not result in better performance can however still be accepted

>>> this helps to explore the whole parameter space


simulated annealing:

> what does the "temperature" refer to?

simulated annealing: temperature

> moves in the parameter space that do not reduce the error are still accepted with a certain probability

> this probability decreases with running time

>>> "temperature" decreases with search time: the lower the temperature the less we explore the search space
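
A minimal simulated annealing sketch, assuming Gaussian random steps, a geometric cooling schedule, and the common acceptance probability exp(-delta / temperature). The function name, defaults, and the toy error function are illustrative.

```python
import math
import random

def simulated_annealing(error, start, n_steps=5000, step_size=0.5,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Minimize `error` over a list of parameters; returns (best_params, best_error)."""
    rng = random.Random(seed)
    current, current_err = list(start), error(start)
    best, best_err = list(current), current_err
    for i in range(n_steps):
        # Temperature decreases geometrically from t_start to t_end.
        temperature = t_start * (t_end / t_start) ** (i / (n_steps - 1))
        # Random step in the parameter space.
        candidate = [p + rng.gauss(0.0, step_size) for p in current]
        delta = error(candidate) - current_err
        # Better moves are always accepted; worse moves with a probability
        # that shrinks as the temperature drops.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = candidate, current_err + delta
            if current_err < best_err:
                best, best_err = list(current), current_err
    return best, best_err

# Hypothetical error surface with its minimum at (3, -1).
def toy_error(params):
    a, b = params
    return (a - 3.0) ** 2 + (b + 1.0) ** 2

best, best_err = simulated_annealing(toy_error, start=[0.0, 0.0])
```

Early on, the high temperature lets the search wander widely; near the end it behaves almost like pure hill climbing.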


genetic algorithms:

> what is the basic starting point of a genetic algorithm?

genetic algorithm:

a population of candidate solutions, e.g. parameter vectors

> represented by means of binary strings (genotype)


genetic algorithm:

how does parent selection work?

parent selection:

1. assign a fitness value to each of the individuals (e.g. error of that specific parameter vector)

2. assign probabilities of being selected to the individuals based on their fitness
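
A sketch of fitness-proportional parent selection, assuming fitness is derived from the error as 1 / (1 + error) so that lower error means higher fitness. That mapping and the function names are illustrative assumptions.

```python
import random

def selection_probabilities(errors):
    """Turn errors into selection probabilities (lower error -> higher probability)."""
    fitness = [1.0 / (1.0 + e) for e in errors]   # assumed fitness mapping
    total = sum(fitness)
    return [f / total for f in fitness]

def select_parents(population, errors, n_parents, rng):
    """Draw parents with replacement, weighted by their selection probability."""
    probs = selection_probabilities(errors)
    return rng.choices(population, weights=probs, k=n_parents)

# Hypothetical population of three bit strings with their errors.
population = ["0011", "0101", "1110"]
errors = [0.1, 1.0, 4.0]
probs = selection_probabilities(errors)
parents = select_parents(population, errors, n_parents=10, rng=random.Random(0))
```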


genetic algorithm: how does crossover work?


1. select pairs of individuals

2. perform crossover with probability p_c

> randomly select one point in the bit string and create two children

> take some part of individual 1 and the rest of individual 2

3. if we do not perform crossover, the two children are identical to mother and father
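
The steps above can be sketched as single-point crossover on bit strings. Representing individuals as Python strings of "0" and "1" is an assumption made for brevity.

```python
import random

def crossover(mother, father, p_c, rng):
    """Single-point crossover with probability p_c; otherwise copy the parents."""
    if rng.random() >= p_c:
        return mother, father                    # no crossover: children = parents
    point = rng.randint(1, len(mother) - 1)      # random cut point inside the string
    child_1 = mother[:point] + father[point:]    # head of mother, tail of father
    child_2 = father[:point] + mother[point:]    # head of father, tail of mother
    return child_1, child_2

rng = random.Random(1)
child_1, child_2 = crossover("00000000", "11111111", p_c=1.0, rng=rng)
```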


genetic algorithm: how does mutation work?


> for each individual resulting from crossover: flip each bit of the individual with probability p_m
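
A sketch of bit-flip mutation, again assuming individuals are strings of "0" and "1". The function name is illustrative.

```python
import random

def mutate(bits, p_m, rng):
    """Flip each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < p_m else b for b in bits)

rng = random.Random(0)
mutated = mutate("0101010101", p_m=0.1, rng=rng)
```

A small p_m keeps most of the parents' information while still introducing new variation.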


multi-criteria optimization problem:

> definition of Pareto dominance

a model instance a is dominated by another model instance b when the mean squared error obtained by model instance b is lower for at least one target and not higher for any of the other targets
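
A direct sketch of this definition, assuming each model instance is summarized by a tuple of mean squared errors, one per target. The function name is illustrative.

```python
def dominates(errors_b, errors_a):
    """True if b dominates a: b is no worse on every target and better on at least one."""
    no_worse = all(b <= a for a, b in zip(errors_a, errors_b))
    strictly_better = any(b < a for a, b in zip(errors_a, errors_b))
    return no_worse and strictly_better

# b = (1.0, 2.0) dominates a = (1.0, 3.0): equal on target 1, better on target 2.
b_dominates_a = dominates((1.0, 2.0), (1.0, 3.0))
```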


multi-criteria optimization problem:

> explain NSGA-II


1. run all of your models (parents and children) and find the Pareto fronts

2. form a new population by adding models, starting with the first Pareto front, then the second, etc., until the new population is full

3. generate offspring using crossover and mutation

4. repeat
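
A sketch of steps 1 and 2, assuming each model is summarized by a tuple of errors (lower is better). It sorts models into successive Pareto fronts and fills the new population front by front. The full NSGA-II additionally uses a crowding distance to decide which members of the last, partially fitting front to keep; that part is omitted here, and all names are illustrative.

```python
def dominates(b, a):
    """True if error tuple b dominates error tuple a."""
    return all(x <= y for x, y in zip(b, a)) and any(x < y for x, y in zip(b, a))

def pareto_fronts(errors):
    """Sort model indices into fronts: front 0 is non-dominated, front 1 is next, ..."""
    remaining = set(range(len(errors)))
    fronts = []
    while remaining:
        front = [i for i in sorted(remaining)
                 if not any(dominates(errors[j], errors[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

def next_population(errors, size):
    """Fill the new population front by front until it is full (no crowding distance)."""
    selected = []
    for front in pareto_fronts(errors):
        for idx in front:
            if len(selected) < size:
                selected.append(idx)
    return selected

# Hypothetical errors of five models on two targets.
errors = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
fronts = pareto_fronts(errors)
survivors = next_population(errors, size=4)
```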