Past paper Flashcards

(19 cards)

1
Q

What are supervised tasks?

A

Supervised tasks learn from data that includes labels/targets.

2
Q

What are unsupervised tasks?

A

Unsupervised tasks do not use labels/targets.

3
Q

What is a silhouette coefficient?

A

It is a performance evaluation metric for clustering, used when we don't have labelled data.

4
Q

How is the silhouette coefficient interpreted?

A

Values range from -1 to +1.
-1 indicates incorrect clustering; +1 indicates dense, well-separated clusters.
Values near 0 indicate overlapping clusters.
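
A minimal sketch of how the score can be computed, assuming Euclidean distance and the usual per-point definition s = (b - a) / max(a, b), where a is the mean distance to the other points in the same cluster and b is the smallest mean distance to the points of any other cluster. The function name `silhouette_scores` and the toy points are illustrative, not from the cards.

```python
from math import dist


def silhouette_scores(points, labels):
    """Return one silhouette value per point (assumes at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for a single-point cluster
            continue
        # a: mean distance to the other points in the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Two obvious groups: one near x = 0, one near x = 10 (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_scores(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_scores(points, [0, 1, 0, 1])   # groups mixed up
mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
```

With the sensible grouping the mean score is close to +1; with the mixed-up grouping it is negative.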

5
Q

What is the name of the phenomenon of a model becoming less accurate over time?

A

Data/concept drift
Performance decay

6
Q

What are the pros and cons of reducing data set dimensionality?

A

+ Removes noisy information
+ Lower dimensionality allows some algorithms to run faster
- Removing useful information can reduce model performance

7
Q

What is normalisation?

A

The process of rescaling features without changing the shape of their distribution.
It scales each feature to the range 0 to 1.
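
A minimal sketch of 0-to-1 scaling, assuming the min-max formula (x - min) / (max - min). The function name `min_max_scale` and the sample heights are illustrative; mapping a constant feature to 0 is an assumption made to avoid dividing by zero.

```python
def min_max_scale(values):
    """Rescale one feature so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no spread, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


heights_cm = [150.0, 160.0, 175.0, 200.0]  # made-up feature values
scaled = min_max_scale(heights_cm)
print(scaled)  # [0.0, 0.2, 0.5, 1.0]
```

The relative spacing between values is preserved; only the range changes.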

8
Q

What is a sample?

A

A single observation (one row of the data set)

9
Q

What is a feature?

A

A single measured attribute of a sample

10
Q

What is top-down development?

A

Applies existing knowledge via predefined rules/choices

11
Q

What is bottom-up development?

A

Builds knowledge from observed data; the model learns the rules

12
Q

What are the first 2 steps of PPDAC?

A

Problem and Plan

13
Q

What are the final 3 steps of PPDAC?

A

Data
Analysis
Conclusion

14
Q

What is a non-directional hypothesis?

A

Predicts that the independent variable affects the dependent variable, but does not state the direction of the effect
“If a car is red, its speed is affected”

15
Q

What is a null hypothesis?

A

States that there is no relationship between the independent and dependent variables

16
Q

What distribution pattern is symmetric, with the mean and median close together?

A

Normal distribution
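
A small sketch that checks this property on simulated data, using only the standard library. The parameters (mean 50, standard deviation 10), the sample size, and the exponential sample used as a skewed contrast are assumptions chosen for illustration.

```python
import random
from statistics import mean, median

rng = random.Random(0)  # fixed seed so the run is repeatable

# Symmetric case: draws from a normal distribution.
normal_sample = [rng.gauss(50, 10) for _ in range(10_000)]
# Skewed contrast: draws from an exponential distribution with mean 50.
skewed_sample = [rng.expovariate(1 / 50) for _ in range(10_000)]

normal_gap = abs(mean(normal_sample) - median(normal_sample))
skewed_gap = abs(mean(skewed_sample) - median(skewed_sample))

print(f"normal: |mean - median| = {normal_gap:.2f}")
print(f"skewed: |mean - median| = {skewed_gap:.2f}")
```

For the normal sample the gap is small relative to the spread of the data; for the skewed sample the mean is pulled well away from the median.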

17
Q

What is optimization?

A

Finding the best option from a set of alternatives, often by searching for the global optimum
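
A minimal sketch of the difference between a global and a local optimum. The function `f`, the grid of candidates, and the helper `local_descent` are all illustrative assumptions; `f` was picked because it has two dips of different depth.

```python
def f(x):
    """Toy objective with a shallow dip near x = 1.13 and a deeper one near x = -1.30."""
    return x**4 - 3 * x**2 + x


# The set of alternatives: every x from -2.00 to 2.00 in steps of 0.01.
candidates = [i / 100 for i in range(-200, 201)]

# Exhaustive search compares every alternative, so it finds the global minimum.
best_x = min(candidates, key=f)


def local_descent(start_index):
    """Step to the lower neighbour until no neighbour improves (may get stuck)."""
    i = start_index
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(candidates)]
        best = min(neighbours, key=lambda j: f(candidates[j]))
        if f(candidates[best]) >= f(candidates[i]):
            return candidates[i]
        i = best


# Starting from the right-hand edge, the downhill walk stops in the shallow dip.
local_x = local_descent(len(candidates) - 1)
```

Here `best_x` lands in the deeper dip, while `local_x` stops in the shallower one: a local optimum that is not the best option overall.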

18
Q

What are the steps of the K-means clustering algorithm?

A

1. Scale the data
2. Choose a K
3. Initialise centroids
4. Associate each data point with its nearest centroid
5. Update each centroid by calculating the mean of its allocated points
6. Repeat association and update until the convergence rule is met
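
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: min-max scaling is used for the scaling step, centroids are initialised by picking K random data points, the convergence rule is "assignments stopped changing", and the names `scale_columns` and `k_means` are hypothetical.

```python
import random
from math import dist


def scale_columns(points):
    """Scale the data: min-max scale each feature (column) to 0..1."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid /0
    return [
        tuple((v - lo) / span for v, lo, span in zip(p, lows, spans))
        for p in points
    ]


def k_means(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise centroids from the data
    labels = None
    for _ in range(max_iters):
        # Association: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # convergence rule: nothing changed
            break
        labels = new_labels
        # Update: move each centroid to the mean of its allocated points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Made-up data with two visible groups; K = 2 is chosen by eye here.
raw = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, labels = k_means(scale_columns(raw), k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster gets label 0 depends on the random initialisation.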