
IFN580 Week 7 Unsupervised Learning Flashcards

(16 cards)

1
Q

What is the ‘curse of dimensionality’?

A

When dimensionality increases, the data becomes sparse, so more data is required to learn properly.

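The sparsity behind the curse of dimensionality can be illustrated with a small NumPy sketch (my illustration, not from the course materials): as the number of dimensions grows, the nearest and farthest neighbours of a point end up almost equally far away, so distance carries less information.

```python
import numpy as np

def distance_contrast(n_points, n_dims, seed=0):
    """Ratio of nearest to farthest distance from a reference point.
    Values near 1 mean all points look roughly equally far away."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_points, n_dims))        # uniform points in the unit cube
    d = np.linalg.norm(X[1:] - X[0], axis=1)  # distances to the first point
    return d.min() / d.max()

low_d = distance_contrast(500, 2)      # strong contrast in 2 dimensions
high_d = distance_contrast(500, 1000)  # distances concentrate in 1000 dims
print(low_d, high_d)
```

In 2 dimensions the ratio is tiny (some points are very close, others far); in 1000 dimensions it approaches 1, which is why high-dimensional data needs far more samples to cover the space.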
2
Q

Both supervised and unsupervised require at least one:

A

input attribute

3
Q

Supervised learning differs from unsupervised learning in that supervised learning
requires:

A

at least one output (target) attribute

4
Q

When selecting features, are the attribute values changed in any way?

A

No, feature selection is simply selecting/excluding features without any change.

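The point that feature selection leaves values untouched can be shown in a few lines of NumPy (a minimal sketch of my own, assuming a toy array): selecting features is just keeping some columns and dropping others.

```python
import numpy as np

# Toy dataset: 5 samples, 4 features (columns).
X = np.array([[1.0, 10.0, 0.5, 3.0],
              [2.0, 11.0, 0.4, 2.0],
              [3.0, 12.0, 0.6, 1.0],
              [4.0, 13.0, 0.5, 0.0],
              [5.0, 14.0, 0.7, 4.0]])

keep = [0, 2]            # indices of the features we decide to keep
X_selected = X[:, keep]  # columns pass through unchanged

# The surviving attribute values are identical to the originals.
print(np.array_equal(X_selected, X[:, [0, 2]]))  # True
```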
5
Q

Both feature selection and Principal Component Analysis reduce the number of features for a given dataset. How does the process differ between these two techniques?

A

Feature selection keeps a subset of the original features unchanged, whereas PCA creates new features (components) as linear combinations of the originals.

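The contrast can be sketched with a from-scratch PCA via the SVD (my illustration on random data, not the course's code): unlike feature selection, each new component mixes all original columns.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
Xc = X - X.mean(axis=0)            # centre the data first

# SVD of the centred data: rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:2].T              # project onto the top 2 components

# Each new feature is a weighted sum of ALL 4 original features.
print(X_pca.shape)  # (100, 2)
```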
6
Q

What is one approach for selecting the optimal number of components for PCA?

A

Plot the cumulative explained variance across components and look for the elbow point.

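The elbow approach can be sketched in NumPy (a hypothetical dataset of my own with two dominant directions): compute the variance explained by each component, take the cumulative sum, and look for where the curve flattens.

```python
import numpy as np

rng = np.random.default_rng(2)
# Data with ~2 dominant directions plus small noise in 6 dimensions.
latent = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6))
X = latent + 0.05 * rng.normal(size=(200, 6))
Xc = X - X.mean(axis=0)

_, S, _ = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)   # variance ratio per component
cumulative = np.cumsum(explained)
print(np.round(cumulative, 3))
# The curve rises steeply for the first 2 components, then flattens:
# that "elbow" suggests keeping 2 components.
```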
7
Q

When does PCA work optimally?

A

PCA works best when the data follows a normal distribution.

8
Q

What is a scenario where PCA may not perform adequately?

A

It is sensitive to outliers and may not work optimally if the data is sparse.

9
Q

t-SNE and UMAP are ___________ methods for ___________.

A

machine learning, dimensionality reduction

10
Q

t-SNE can only compute up to ___ components.

A

3

11
Q

What is perplexity?

A

A hyperparameter in t-SNE that controls how many neighbours each point considers.

12
Q

Which hyperparameters does UMAP use?

A

“nearest neighbours” (n_neighbors): controls how UMAP balances local structure versus global structure in the data.

“minimum distance” (min_dist): controls the distance between points in the low-dimensional representation.

13
Q

What algorithm does UMAP use for optimisation?

A

Uses deterministic, graph Laplacian-based optimisation.

14
Q

What algorithm does t-SNE use for optimisation?

A

Uses stochastic gradient descent.

15
Q

Can you combine multiple dimensionality reduction approaches?

A

Yes: use PCA first to extract a larger number of components, then apply UMAP or t-SNE to reduce those to 2 components.

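A minimal sketch of this two-stage pipeline, assuming scikit-learn is available (the data, component counts, and seeds here are my own illustration, not from the course):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 50))      # 100 samples, 50 features

# Stage 1: PCA compresses 50 features down to 10 components.
X_pca = PCA(n_components=10, random_state=0).fit_transform(X)

# Stage 2: t-SNE takes the 10 PCA components down to 2 for plotting.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_pca)

print(X_2d.shape)  # (100, 2)
```

Running PCA first denoises the data and makes the t-SNE step much cheaper, since t-SNE's pairwise computations scale poorly with the number of input features.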
16
Q

What is PCA?

A

A technique used to reduce the dimensionality of a dataset while preserving as much of its variance as possible.