Topic 1: Introductions – Organisation & ML Basics Flashcards
(15 cards)
What are AI, ML and deep learning?
AI - A computer that can mimic human behaviour
Machine learning - Learns to fit a model to data without explicit programming
Deep learning - Learns features in data using neural networks
Why is a probabilistic approach useful in machine learning?
From a probabilistic perspective, machine learning on real-world data involves uncertainty from:
- variance
- ambiguity
- transformations
- partial information
We can treat unknown quantities as random variables.
If we have to make decisions under uncertainty, then the probabilistic approach is ideal, e.g. modelling the different possible outcomes and their probabilities.
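A minimal sketch of this idea, with made-up outcomes, probabilities and losses purely for illustration: given a predictive distribution over possible outcomes, we can pick the prediction with the lowest expected loss.

```python
import numpy as np

# Hypothetical predictive distribution over three possible outcomes
outcomes = ["cat", "dog", "fox"]
probs = np.array([0.6, 0.3, 0.1])   # probabilities of each outcome, sum to 1

# Made-up loss for predicting the column class when the row class is true
loss = np.array([
    [0.0, 1.0, 1.0],   # true cat
    [1.0, 0.0, 1.0],   # true dog
    [1.0, 1.0, 0.0],   # true fox
])

# Expected loss of each possible prediction under the predictive distribution
expected_loss = probs @ loss
best = outcomes[np.argmin(expected_loss)]
print(dict(zip(outcomes, expected_loss)), "-> predict", best)
```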
What is supervised learning?
- Learning from labelled data, so we know what the ground truth is
- Classify or regress on data, so the goal is to learn a mapping from inputs to outputs, which can be done through classification or regression
- Optimise cost, here we train the model by optimising a cost function, which measures how wrong the predictions are compared to the true labels
What is semi-supervised learning?
- We’re looking for a partial target: we know what class an image belongs to, but we don’t know why
- State-action cycle: the system is in a state, takes an action, and that action changes the state and leads to feedback
- Independent reward function: the model learns indirectly what is good, as a result of the feedback
What is unsupervised learning?
- We’re dealing with unlabelled data, so we don’t have a ground truth
- We try to understand the relationships in the data, so which users behave similarly
- We want to correlate variables, so which variables relate to each other
What are common ML tasks?
Classification: there are separate classes, and we learn a decision boundary or rule that separates them. There is also a loss L, determined by a loss function l
Regression: predict a (typically continuous) output for a given input (fit a regression to the data)
Transcription: take unstructured input like audio and output discrete text (speech recognition)
Translation: the input is a sequence of symbols in one language, and the output is the corresponding sequence in another language
Structuring/Compression: re-organise data according to the relationships between its elements; learn how the data is structured and then compress it without losing information (e.g. PCA and autoencoders)
Anomaly detection: it can flag elements that are unusual/atypical, so finding outliers
Synthesis and sampling: it can generate new and similar data elements (GAN and VAEs)
Denoising: it can predict clean elements from corrupted ones, so it can remove noise
Density estimation: it can learn the probability distribution that generated the data (KDE, VAEs)
What is EDA and why is it important?
We use it to understand our data:
- we can find central tendencies
- we can find basic measures of shape and dispersion
- we can uncover structure and patterns in the input data
Capture all your data well:
- Can we complete the labelling?
- Check for missing values
- Clean the data if sensible
Data representation:
- Find a representation that is suitable for your task
- Should it be greyscale or RGB?
- Should it be characters, words, or vectors?
Consider automated data transformations:
- E.g. normalisation
- Encoding in a different space
- Reduce or augment the information
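A minimal EDA sketch with pandas; the file name and columns ("animals.csv", "species", "weight_kg") are hypothetical and only illustrate the steps above.

```python
import pandas as pd

# Hypothetical tabular dataset; column names are made up for illustration
df = pd.read_csv("animals.csv")

# Central tendencies plus basic measures of shape and dispersion
print(df.describe())                  # mean, std, quartiles per numeric column
print(df["species"].value_counts())   # class balance / label completeness

# Capture the data well: check for missing values, clean if sensible
print(df.isna().sum())
df = df.dropna(subset=["species"])                            # drop rows without a label
df["weight_kg"] = df["weight_kg"].fillna(df["weight_kg"].median())

# Simple automated transformation: normalisation of a numeric column
df["weight_norm"] = (df["weight_kg"] - df["weight_kg"].mean()) / df["weight_kg"].std()
```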
How is data represented for ML models?
- As tensors that can be batched: multiple samples stacked into one larger tensor.
- They are efficient for GPU computation.
- They are the standard format for deep learning models.
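A small sketch of batching with NumPy; the sample shapes are made up, and the PyTorch lines are only an assumed illustration of the same idea in a deep learning framework.

```python
import numpy as np

# Three hypothetical greyscale "images", each 4x4, as individual samples
samples = [np.random.rand(4, 4).astype(np.float32) for _ in range(3)]

# Batch them into one tensor of shape (batch_size, height, width)
batch = np.stack(samples)
print(batch.shape)  # (3, 4, 4)

# Deep learning frameworks use the same idea, e.g. (assuming PyTorch is installed):
# import torch
# batch_t = torch.from_numpy(batch)   # same data as a torch tensor
# batch_t = batch_t.to("cuda")        # move the whole batch to the GPU at once
```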
What are some steps in data preprocessing?
We use it to encode our data:
Input and output dimensionality:
- Large amounts of data can be processed as one tensor in parallel
- But this requires an identical representation (shape) for every sample
- We can cut or pad our data to achieve this
Data loading:
- Efficient storage: use Pandas & pickle, Hierarchical Data Format (HDF5)
- Efficient processing: Consider dataset packages of major ML frameworks
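A sketch of cutting/padding variable-length data to an identical representation; the sequences, the fixed length and the helper name `cut_or_pad` are made up for illustration, and the storage comment only points at existing pandas facilities.

```python
import numpy as np

def cut_or_pad(seq, length, pad_value=0):
    """Force a 1-D sequence to a fixed length by truncating or zero-padding."""
    seq = list(seq)[:length]                       # cut if too long
    seq = seq + [pad_value] * (length - len(seq))  # pad if too short
    return seq

# Hypothetical variable-length sequences (e.g. token ids)
sequences = [[5, 2, 9], [7, 1, 4, 4, 8, 3], [6]]
batch = np.array([cut_or_pad(s, length=4) for s in sequences])
print(batch)
# [[5 2 9 0]
#  [7 1 4 4]
#  [6 0 0 0]]

# For efficient storage, a pandas DataFrame can be saved with df.to_pickle(...)
# or df.to_hdf(...) (HDF5); major ML frameworks also ship dataset/dataloader packages.
```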
What are key metrics for supervised learning?
- Accuracy: how many predictions were correct out of all predictions.
  - If you guessed 8 animals correctly out of 10, your accuracy is 80%.
- False-positive rate: how often the model says “yes” when it should say “no”.
  - If the model thinks 2 dogs are cats (but they’re really dogs), that’s 2 false positives.
- Precision: of all the times the model said “yes”, how often was it right?
  - How many of my guesses were actually cats?
- Recall: of all the real cats, how many did the model find?
  - How many real cats did I find?
- F1-score: the balance between precision and recall. If both are high, F1 is high; if one is low, F1 drops.
  - Useful when you care equally about being correct and finding all the right cases.
- ROC: a graph that shows how well the model separates classes as you change the decision threshold.
  - X-axis = false positive rate, Y-axis = true positive rate; the more the curve bends toward the top-left, the better.
- AUC: the area under the ROC curve. Ranges from 0.5 (random guessing) to 1.0 (perfect).
  - Tip: higher AUC = better model at distinguishing classes.
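A minimal sketch of these metrics using scikit-learn; the labels, predictions and scores below are made up for a toy binary "cat vs not-cat" task.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Made-up ground truth and model outputs
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                   # 1 = cat, 0 = not cat
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]   # predicted probabilities

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))  # of predicted cats, how many were cats
print("recall   :", recall_score(y_true, y_pred))     # of real cats, how many were found
print("f1       :", f1_score(y_true, y_pred))
print("roc auc  :", roc_auc_score(y_true, y_score))   # uses scores, not hard labels
```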
What are key metrics for unsupervised learning?
For unsupervised learning we need different measurements: we have no labels, so we use various distance and similarity measures:
- Minkowski distances
- Intra/Inter-cluster distance
- log-likelihood: Measures how likely the data is under your model.
- Higher is better, meaning your model “explains” the data well.
- Perplexity
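A small NumPy sketch of two of these measures; the clusters and the helper `minkowski` are made up for illustration.

```python
import numpy as np

def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

# Two hypothetical clusters of 2-D points
cluster_a = np.array([[0.0, 0.0], [0.5, 0.2], [0.2, 0.4]])
cluster_b = np.array([[3.0, 3.0], [3.2, 2.8], [2.9, 3.3]])

# Intra-cluster distance: average distance of points to their own centroid
centroid_a = cluster_a.mean(axis=0)
intra_a = np.mean([minkowski(x, centroid_a) for x in cluster_a])

# Inter-cluster distance: distance between the two centroids
inter = minkowski(centroid_a, cluster_b.mean(axis=0))

print(f"intra A: {intra_a:.3f}, inter A-B: {inter:.3f}")  # want intra small, inter large
```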
How does a basic neural network work?
It takes inputs x_1, x_2, x_3, each with a weight w_1, w_2, w_3. The perceptron combines them by taking the weighted sum of the inputs, applies a (nonlinear) activation function, and produces the output:
y = f(x · w) = tanh( Σ_{k=1}^{n} x_k w_k )
It is a feature-learning algorithm, so it learns patterns or features from the raw data
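A minimal NumPy sketch of the forward pass described above; the input and weight values are made up.

```python
import numpy as np

# Made-up inputs x_1..x_3 and weights w_1..w_3
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8,  0.3, -0.5])

# Perceptron: weighted sum of the inputs, then a nonlinear activation
z = np.dot(x, w)   # sum_k x_k * w_k
y = np.tanh(z)     # activation function f
print(z, y)
```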
What are low-, mid-, and high-level features in deep learning?
Instead of hand-crafting some features or picking which are important, neural networks can learn what features actually matter
low-level features = simple patterns such as lines and edges
mid-level features = parts of an object, such as eyes, nose or ears
high-level features = whole objects, such as faces
What are the basic steps of a machine learning project?
- Define the task
  - Is it supervised (with labels)? Semi-supervised (some labels)? Unsupervised (no labels)?
  - This step often needs a reasonable understanding of the data, so EDA is useful here
- Represent your data
  - We need an efficient transformation of our data into a numerical space, e.g. converting raw text into tensors
  - We want as few dimensions as possible, so apply some dimensionality reduction
  - Here we also do some preprocessing, e.g. normalisation or other transformations
- Select your metrics
  - The metrics depend on the task at hand: is it classification or regression?
  - They also depend on the data: is it balanced or imbalanced?
- Develop your ML model
  - We often need to train (and validate), then evaluate on different metrics to answer different questions, but always test on a completely separate dataset
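A compact sketch tying these steps together, assuming scikit-learn and its built-in iris dataset; the model choice and hyperparameters are arbitrary examples, not a prescribed recipe.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

# Step 1: define the task -> supervised classification (labelled iris flowers)
X, y = load_iris(return_X_y=True)

# Step 2: represent/preprocess the data (hold out a test set, then normalise)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Steps 3 + 4: select metrics, develop the model, evaluate on the held-out data
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred), "macro F1:", f1_score(y_test, pred, average="macro"))
```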
Identify the key aspects of ML
How to:
Step 1: Define your task
Step 2: Represent your data
Step 3: Select your metrics
Step 4: Develop your ML model
How to develop the ML model:
Step 1: Define the architecture
Step 2: Define the activation and loss functions
Step 3: Select the optimiser
Step 4: Choose the training characteristics
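A minimal sketch of these four model-development steps, assuming PyTorch is installed; the layer sizes, data and hyperparameters are made up for illustration.

```python
import torch
import torch.nn as nn

# Step 1: define the architecture (a tiny fully connected net; sizes are made up)
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.Tanh(),                      # Step 2: activation function
    nn.Linear(32, 2),
)
loss_fn = nn.CrossEntropyLoss()     # Step 2: loss function for classification

# Step 3: select the optimiser (learning rate is one training characteristic)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)

# Step 4: training characteristics (number of epochs, batch of made-up data)
X = torch.randn(64, 10)
y = torch.randint(0, 2, (64,))
for epoch in range(5):
    optimiser.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimiser.step()
```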