Lecture 3 Flashcards

(37 cards)

1
Q

What do streaming services use users' viewing behavior for?

A

They compare your viewing behavior with that of other users to predict which shows you will like.

2
Q

Which mathematical method underlies the recommendation systems of streaming services?

A

Linear/logistic regression.

3
Q

How does linear/logistic regression work in the context of recommendations?

A

The model predicts the probability that you will like a particular show based on similarities in viewing behavior.
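The mechanism on this card can be sketched as a single logistic-regression unit: a weighted sum of features squashed into a probability. The features and weights below are invented for illustration; a real recommender learns them from many users' viewing histories.

```python
import math

def predict_like_probability(features, weights, bias):
    """Logistic regression: a weighted sum of the features,
    squashed through the sigmoid into a probability in (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + math.exp(-z))

# Hypothetical features: overlap of your history with viewers who
# liked the show, and how much of its genre you already watch.
p = predict_like_probability([0.8, 0.6], weights=[2.0, 1.5], bias=-1.0)
print(round(p, 2))  # predicted probability that you like the show
```

The sigmoid is what makes this "logistic" rather than plain linear regression: the output can be read directly as a probability.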

4
Q

What is the difference between skeptics and enthusiasts when it comes to AI?

A

Skeptics often focus on the basic mathematics and building blocks of AI, while enthusiasts mainly focus on AI's cool end results.

5
Q

What is the mechanism behind many of the cool things AI can do?

A

Analyzing data (such as user behavior) and making predictions based on statistical models such as linear/logistic regression.

6
Q

What is a limitation of the linear/logistic regression model in prediction?

A

It struggles with situations where the predictor needs to capture “either/or” logic, like deciding if a movie is good based on whether either Meryl Streep or The Rock stars in it.

7
Q

Give an example illustrating the limitation of logistic regression with the “either/or” problem.

A

If a movie has Meryl Streep or The Rock, you predict it’s good. But if it has both, logistic regression struggles because they excel in different genres, so combining them confuses the model.

8
Q

What is the “XOR problem” in AI?

A

It’s a classic problem where AI models like logistic regression cannot learn the logical function of “this or that” (exclusive OR).

9
Q

Why do we need neural networks to solve the XOR problem?

A

Because neural networks can perform multiple logistic regressions in parallel and combine their results to capture complex logical relationships.
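The idea on this card can be shown concretely with hand-picked weights: two logistic-regression units run in parallel (one acting roughly as OR, one as NAND), and a third combines them into XOR, which no single unit can compute. The weights are illustrative, not learned.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One logistic-regression unit."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))

def xor_net(x1, x2):
    """Two logistic regressions in parallel, combined by a third:
    h1 ~ OR(x1, x2), h2 ~ NAND(x1, x2), output ~ AND(h1, h2)."""
    h1 = neuron([x1, x2], [20, 20], -10)    # fires if either input is 1
    h2 = neuron([x1, x2], [-20, -20], 30)   # fires unless both are 1
    return neuron([h1, h2], [20, 20], -30)  # fires only if both fire

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(xor_net(a, b)))  # prints the XOR truth table
```

No choice of weights in a single `neuron` call can reproduce this table, which is exactly the XOR limitation from the previous cards.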

10
Q

How do neural networks improve on simple logistic regression?

A

Neural networks build intermediate predictors by combining logistic regressions in layers, which extract useful combinations of raw data and improve understanding.

11
Q

What is the role of intermediate predictors in neural networks?

A

They create more informative features from the original data, enabling the AI to make more accurate predictions by considering more complex patterns.

12
Q

What is the “Universal Approximation Theorem”?

A

It states that neural networks can approximate any relationship between inputs and outputs, meaning they can model almost any logical or functional pattern.

13
Q

Why can’t simple logistic regression handle XOR but neural networks can?

A

Because logistic regression only models linear relationships, while neural networks combine many logistic regressions in layers, allowing them to model non-linear, complex relationships like XOR.

14
Q

Summarize why neural networks are a much better way to predict data than simple logistic regression.

A

Neural networks create multiple intermediate steps that transform raw data into meaningful features, enabling them to capture complex patterns and logic that logistic regression cannot.

15
Q

Does ChatGPT work exactly like a neural network?

A

Not exactly. It uses neural networks as building blocks within a broader architecture called a transformer model.

16
Q

What are language models like ChatGPT trained to do?

A

Predict the next word in a sequence—not the correct answer, just the most likely one based on previous words.

17
Q

Why is predicting the next word difficult for AI?

A

While it’s a basic task, doing it well requires a lot of background knowledge, context awareness, and handling of word order.

18
Q

Why don’t regular neural networks work well for language tasks?

A

Because they don’t consider word order, which is crucial in understanding language.

19
Q

What model solves this problem of word order and meaning?

A

The transformer model.

20
Q

What does the encoder part of the transformer do?

A

Turns words into numbers that best capture the meaning of the input.

21
Q

What are embeddings?

A

Numerical representations of words based on their co-occurrence with other words.

22
Q

How are embeddings created in basic neural networks?

A

By trying to predict the next word and observing which words occur together frequently.

23
Q

What is positional encoding in transformers?

A

A numerical representation added to embeddings to capture a word’s position in the sentence.

24
Q

What happens when you combine embedding and positional encoding?

A

You get a richer numerical representation that includes both meaning and position (e.g., won = {0.4, 0.4, 0} + {1, 0.9, -0.8} = {1.4, 1.3, -0.8}).
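This combination is just element-wise vector addition, which can be sketched with this card's toy numbers (the vectors are made up for the example; real models use learned embeddings with hundreds of dimensions):

```python
# The embedding carries meaning, the positional encoding carries
# position; their element-wise sum feeds the rest of the transformer.
embedding = [0.4, 0.4, 0.0]              # "won": meaning
positional_encoding = [1.0, 0.9, -0.8]   # "won" at this sentence position
combined = [round(e + p, 2) for e, p in zip(embedding, positional_encoding)]
print(combined)  # [1.4, 1.3, -0.8]
```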

25

Q

What are skip connections in transformers?

A

Shortcuts that copy the original input numbers to later layers to retain information and stabilize the values (via "Add & Norm").

26

Q

What does the FNN (Feed-Forward Network) do in a transformer?

A

It transforms the numbers into more informative numbers, just like what neural networks do.

27

Q

What is attention in language models?

A

It allows the model to focus on relevant surrounding words to improve a word's representation.

28

Q

What is multi-head attention?

A

The model looks at the word from multiple angles, each head focusing on different contextual relationships.

29

Q

What are the Query, Key, and Value in attention?

A

- Query: what you're looking for (like a question)
- Key: what each word offers (the content)
- Value: the actual information retrieved

30

Q

How does the attention mechanism work for a word like "won"?

A

The model calculates the similarity between the Query of "won" and the Keys of other words (like "prize"), then uses the corresponding Values to adjust the representation.

31

Q

What does the equation "Won = 0.6 x 'won' + 0.3 x 'prize' + 0.02 x 'home'" represent?

A

A weighted combination of word meanings based on attention scores that helps define "won" more accurately in context.
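The Query/Key/Value mechanism behind such a weighted combination can be sketched in a few lines: dot-product similarity scores between "won"'s Query and each word's Key are normalized with a softmax, and the resulting weights mix the Value vectors. All vectors below are made up for illustration; real models learn them.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-dimensional Query/Key/Value vectors (invented for the example).
query_won = [1.0, 0.0]
keys   = {"won": [1.0, 0.2], "prize": [0.8, 0.1], "home": [-0.5, 0.9]}
values = {"won": [0.4, 0.4], "prize": [0.9, 0.1], "home": [0.2, 0.7]}

# Score each word by how well its Key matches "won"'s Query.
scores = [sum(q * k for q, k in zip(query_won, keys[w])) for w in keys]
weights = softmax(scores)

# New representation of "won": attention-weighted mix of the Values.
new_won = [sum(w * values[word][i] for w, word in zip(weights, values))
           for i in range(2)]
print([round(x, 2) for x in new_won])
```

The printed vector is "won" enriched by its context, which is exactly what the weighted equation on this card expresses.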
32

Q

What does the decoder do in a transformer model?

A

It takes the final numerical representations and predicts the next word.

33

Q

What part of the transformer does GPT use?

A

Only the decoder part, since its job is to generate text.

34

Q

What kinds of machine learning problems use only the encoder part?

A

Tasks like weather forecasting, where you don't need to output text.

35

Q

How is GPT trained beyond regular prediction tasks?

A

It is updated using human feedback through reinforcement learning (e.g., RLHF: Reinforcement Learning from Human Feedback).

36

Q

Why is GPT not "just" a language model anymore?

A

It can also function as a calculator, search engine, reasoning assistant, and more, by integrating multiple capabilities.
37