Exam 3 Flashcards

(30 cards)

1
Q

What makes time series data different from what we have studied so far in this course?

A

Each observation is tied to a time or date, so the order of the observations matters

2
Q

What are the major components of a time-series signal?

A

Level: Average value of the series
Trend: Increasing, decreasing, or static
Seasonality: Repetition in the data
Noise

3
Q

What are the two models we can use to understand the major components?

A

multiplicative
additive
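
The two models are commonly written as below. This is a sketch of the usual textbook form; the symbols L (level), T (trend), S (seasonality), and N (noise) are notation assumed here, not taken from the cards.

```latex
y_t = L_t + T_t + S_t + N_t \quad \text{(additive)}
y_t = L_t \times T_t \times S_t \times N_t \quad \text{(multiplicative)}
```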

4
Q

How do we choose between multiplicative and additive models?

A

When to use multiplicative model: when the size of the repetition changes over time
When to use additive model: when the size of the repetition stays roughly constant over time

5
Q

If a signal does not have seasonality, what should we expect to see in the graphs for the two models?

A

Additive = seasonality would be close to 0
Multiplicative = seasonality would be close to 1
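
The neutral values 0 and 1 can be checked with a small sketch. It assumes a simplified decomposition: a least-squares straight line stands in for level plus trend, and the seasonal value for each position in the period is the average of what is left over. The function name `seasonal_indices` and the sample series are hypothetical, and library decompositions typically use a moving average instead of a fitted line.

```python
def seasonal_indices(series, period, model="additive"):
    """Average the detrended values at each position within the period."""
    n = len(series)
    xs = list(range(n))
    mean_x = sum(xs) / n
    mean_y = sum(series) / n
    # Least-squares straight line as a simple stand-in for level + trend.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]

    if model == "additive":
        # Additive: subtract the trend line, so "no seasonality" leaves 0.
        detrended = [y - f for y, f in zip(series, fitted)]
    else:
        # Multiplicative: divide by the trend line, so "no seasonality" leaves 1.
        detrended = [y / f for y, f in zip(series, fitted)]

    return [
        sum(detrended[p::period]) / len(detrended[p::period])
        for p in range(period)
    ]


# Hypothetical series with level and trend but no seasonality or noise.
no_season = [100 + 2 * t for t in range(24)]
additive = seasonal_indices(no_season, period=4, model="additive")
multiplicative = seasonal_indices(no_season, period=4, model="multiplicative")
print(additive)
print(multiplicative)
```

With real, noisy data the values would hover near 0 and 1 instead of matching them.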

6
Q

Which components are present in all signals, and which are not guaranteed?

A

All series have level and noise; trend and seasonality are not guaranteed

7
Q

Why is K-means considered unsupervised learning?

A

We do not know which group the data belongs to before clustering

8
Q

What are the steps for K-means clustering?

A

Step 1: Randomly select K observations as the initial cluster centroids (centers)
Step 2: Use a distance (similarity) metric to assign each observation to one of the K clusters
Step 3: Recalculate each cluster centroid (center)
Step 4: If any data points changed clusters in Step 2 AND we have not reached our max iterations, go back to Step 2
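
The four steps can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function name `kmeans`, its parameters, and the sample points are all hypothetical, and Euclidean distance is assumed as the metric.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of numeric tuples into k groups."""
    rng = random.Random(seed)
    # Step 1: randomly select K observations as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = [None] * len(points)

    for _ in range(max_iters):
        # Step 2: assign each observation to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once no observation changed clusters.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: recalculate each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

The cluster numbers themselves are arbitrary; only which points share a number is meaningful.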

9
Q

What is the difference between the K in KNN and the K in K-means?

A

K in k-means: number of clusters
K in KNN: number of neighbors to compare for class assignment

10
Q

What is convergence?

A

Convergence: no data points change clusters from one iteration to the next

11
Q

How do we understand the similarity of a datapoint to each cluster center?

A

Most similar = smallest distance
* Euclidean distance: because we are calculating a distance, the features must be numeric for K-means clustering
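
As a small illustration of "most similar = smallest distance", the sketch below picks the nearest center for one data point using `math.dist` from the Python standard library. The helper name `nearest_center` and the sample coordinates are hypothetical.

```python
import math


def nearest_center(point, centers):
    """Return the index of the center with the smallest Euclidean distance."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))


# Hypothetical numeric features; text categories would need encoding first.
centers = [(0.0, 0.0), (5.0, 5.0)]
point = (1.0, 1.0)
closest = nearest_center(point, centers)
print(closest)
```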

12
Q

What do we use for predictors when using linear regression on
time-series data?

A

trend and seasonality as predictors

13
Q

How do we use seasonality as a predictor?
* What do we need to know to create a sub-interval for one repetition?

A

The period: the size of one repetition sub-interval

14
Q

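How are the training and validation sets different for time-series datasets?

A

They must stay in chronological order: train on the earlier time steps and validate on the later ones, instead of splitting at random. The model also assumes the relationship between time steps is linear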

15
Q

How does the time step affect the linear regression models for forecasting? (give example)

A

If we use quarters to represent seasonality, we end up with 4 linear models
* If we choose to use months, we end up with 12 linear models
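
One way to picture "one linear model per sub-interval" is the sketch below, which fits a separate least-squares line to each position in the period. It is an assumption about how the idea could be implemented: the course may instead use a single regression with seasonal indicator columns. The function names and the quarterly sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def fit_seasonal_lines(series, period):
    """Fit one line per position in the period (4 for quarters, 12 for months)."""
    models = {}
    for season in range(period):
        xs = list(range(season, len(series), period))
        ys = [series[t] for t in xs]
        models[season] = fit_line(xs, ys)
    return models


def forecast(models, t, period):
    """Predict time step t with the line that belongs to its season."""
    slope, intercept = models[t % period]
    return slope * t + intercept


# Hypothetical quarterly series: 3 years, upward trend, fixed quarterly offsets.
offsets = [10, 0, -5, 3]
series = [50 + 2 * t + offsets[t % 4] for t in range(12)]
models = fit_seasonal_lines(series, period=4)
next_q1 = forecast(models, t=12, period=4)
print(len(models), next_q1)
```

Switching `period` from 4 to 12 on monthly data would produce 12 lines instead of 4.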

16
Q

How does linear regression differ for time-series data from traditional linear regression?
* Hint: How many lines are used to estimate the target?

A

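Traditional linear regression fits one line. For time-series data, the number of lines depends on the time step used to represent seasonality (months, days, etc.), with one line per sub-interval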

17
Q

If seasonality does not exist, what does the seasonality graph look like?

A

A straight-line graph

18
Q

What is the significant difference between the multiplicative and additive models?

A

Level and trend will be the same in both models. The difference to look for is in seasonality and noise (a y-axis difference)

19
Q

Which components are always guaranteed, and which are not?

A

Guaranteed = noise and level
Not guaranteed = trend and seasonality

20
Q

How do time-series models differ from traditional linear regression?

A

Time (seasonality) is accounted for, and we fit multiple lines instead of one to estimate the repetition. Training and validation sets must also be in chronological order
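
A chronological split can be sketched as follows. The helper name `chronological_split`, its `n_valid` parameter, and the monthly sample data are hypothetical; the point is only that the validation set is the most recent slice and no rows are shuffled.

```python
def chronological_split(series, n_valid):
    """Hold out the last n_valid time steps for validation, in order."""
    if not 0 < n_valid < len(series):
        raise ValueError("n_valid must be between 1 and len(series) - 1")
    return series[:-n_valid], series[-n_valid:]


# Hypothetical monthly demand for 3 years (values stand in for real data).
monthly_demand = list(range(1, 37))
train, valid = chronological_split(monthly_demand, n_valid=12)
print(len(train), len(valid))
```

Here the first two years train the model and the third year checks it, which mirrors how the model would be used to forecast a future year.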

21
Q

KMeans is used to classify datapoints. (True or False)

22
Q

In linear regression, to determine the repetition sub-interval, we only need to know the time-step of the dataset.

23
Q

KMeans always converges to an ideal grouping.

24
Q

If linear regression is applied to a time-series dataset without seasonality, it will produce the same results as regular linear regression (ie MLR)

25
Q

Noise graphs from both the additive and multiplicative models can be used to choose which decomposition model to use.

A

True

26
Q

In K-means clustering, we group data based on an expected target.

A

False

27
Q

Ideally, in clustering, the distance between centroids is minimized.

A

False

28
Q

Your goal is to understand sales and demographic data from eight different store locations and identify the differences between high-performing and low-performing stores. Which model would be best?

A

KMeans

29
Q

You run a landscaping company and have tracked the last 3 years of demand for your services. What models can you use to help predict the demand for year 4?

A

Linear Regression

30
Q

Select everything we need to know in order to use multiple linear regression with a time-series signal.

A

Trend, Seasonality, Time-step