Lecture 3 Flashcards

(37 cards)

1
Q

What do streaming services use users' viewing behavior for?

A

They compare your viewing behavior with that of other users to predict which shows you will like.

2
Q

Which mathematical method underlies the recommendation systems of streaming services?

A

Linear/logistic regression.

3
Q

How does linear/logistic regression work in the context of recommendations?

A

The model predicts the probability that you will like a particular show based on similarities in viewing behavior.
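The mechanism on this card can be sketched as a single logistic-regression unit: a weighted sum of features squashed into a probability. The features and weights below are invented for illustration; a real recommender learns them from many users' viewing histories.

```python
import math

def predict_like_probability(features, weights, bias):
    """Logistic regression: a weighted sum of the features,
    squashed through the sigmoid into a probability in (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + math.exp(-z))

# Hypothetical features: overlap of your history with viewers who
# liked the show, and how much of its genre you already watch.
p = predict_like_probability([0.8, 0.6], weights=[2.0, 1.5], bias=-1.0)
print(round(p, 2))  # predicted probability that you like the show
```

The sigmoid is what makes this "logistic" rather than plain linear regression: the output can be read directly as a probability.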

4
Q

What is the difference between skeptics and enthusiasts when it comes to AI?

A

Skeptics often focus on the basic mathematics and building blocks of AI, while enthusiasts mainly focus on AI's cool end results.

5
Q

What is the mechanism behind many of the cool things AI can do?

A

Analyzing data (such as user behavior) and making predictions based on statistical models such as linear/logistic regression.

6
Q

What is a limitation of the linear/logistic regression model in prediction?

A

It struggles with situations where the predictor needs to capture “either/or” logic, like deciding if a movie is good based on whether either Meryl Streep or The Rock stars in it.

7
Q

Give an example illustrating the limitation of logistic regression with the “either/or” problem.

A

If a movie has Meryl Streep or The Rock, you predict it’s good. But if it has both, logistic regression struggles because they excel in different genres, so combining them confuses the model.

8
Q

What is the “XOR problem” in AI?

A

It’s a classic problem where AI models like logistic regression cannot learn the logical function of “this or that” (exclusive OR).

9
Q

Why do we need neural networks to solve the XOR problem?

A

Because neural networks can perform multiple logistic regressions in parallel and combine their results to capture complex logical relationships.
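The idea on this card can be shown concretely with hand-picked weights: two logistic-regression units run in parallel (one acting roughly as OR, one as NAND), and a third combines them into XOR, which no single unit can compute. The weights are illustrative, not learned.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One logistic-regression unit."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))

def xor_net(x1, x2):
    """Two logistic regressions in parallel, combined by a third:
    h1 ~ OR(x1, x2), h2 ~ NAND(x1, x2), output ~ AND(h1, h2)."""
    h1 = neuron([x1, x2], [20, 20], -10)    # fires if either input is 1
    h2 = neuron([x1, x2], [-20, -20], 30)   # fires unless both are 1
    return neuron([h1, h2], [20, 20], -30)  # fires only if both fire

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(xor_net(a, b)))  # prints the XOR truth table
```

No choice of weights in a single `neuron` call can reproduce this table, which is exactly the XOR limitation from the previous cards.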

10
Q

How do neural networks improve on simple logistic regression?

A

Neural networks build intermediate predictors by combining logistic regressions in layers, which extract useful combinations of raw data and improve understanding.

11
Q

What is the role of intermediate predictors in neural networks?

A

They create more informative features from the original data, enabling the AI to make more accurate predictions by considering more complex patterns.

12
Q

What is the “Universal Approximation Theorem”?

A

It states that neural networks can approximate any relationship between inputs and outputs, meaning they can model almost any logical or functional pattern.

13
Q

Why can’t simple logistic regression handle XOR but neural networks can?

A

Because logistic regression only models linear relationships, while neural networks combine many logistic regressions in layers, allowing them to model non-linear, complex relationships like XOR.

14
Q

Summarize why neural networks are a much better way to predict data than simple logistic regression.

A

Neural networks create multiple intermediate steps that transform raw data into meaningful features, enabling them to capture complex patterns and logic that logistic regression cannot.

15
Q

Does ChatGPT work exactly like a neural network?

A

Not exactly. It uses neural networks as building blocks within a broader architecture called a transformer model.

16
Q

What are language models like ChatGPT trained to do?

A

Predict the next word in a sequence—not the correct answer, just the most likely one based on previous words.

17
Q

Why is predicting the next word difficult for AI?

A

While it’s a basic task, doing it well requires a lot of background knowledge, context awareness, and handling of word order.

18
Q

Why don’t regular neural networks work well for language tasks?

A

Because they don’t consider word order, which is crucial in understanding language.

19
Q

What model solves this problem of word order and meaning?

A

The transformer model.

20
Q

What does the encoder part of the transformer do?

A

Turns words into numbers that best capture the meaning of the input.

21
Q

What are embeddings?

A

Numerical representations of words based on their co-occurrence with other words.

22
Q

How are embeddings created in basic neural networks?

A

By trying to predict the next word and observing which words occur together frequently.

23
Q

What is positional encoding in transformers?

A

A numerical representation added to embeddings to capture a word’s position in the sentence.

24
Q

What happens when you combine embedding and positional encoding?

A

You get a richer numerical representation that includes both meaning and position (e.g., won = {0.4, 0.4, 0} + {1, 0.9, -0.8} = {1.4, 1.3, -0.8}).
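This combination is just element-wise vector addition, which can be sketched with this card's toy numbers (the vectors are made up for the example; real models use learned embeddings with hundreds of dimensions):

```python
# The embedding carries meaning, the positional encoding carries
# position; their element-wise sum feeds the rest of the transformer.
embedding = [0.4, 0.4, 0.0]              # "won": meaning
positional_encoding = [1.0, 0.9, -0.8]   # "won" at this sentence position
combined = [round(e + p, 2) for e, p in zip(embedding, positional_encoding)]
print(combined)  # [1.4, 1.3, -0.8]
```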

25

Q

What are skip connections in transformers?

A

Shortcuts that copy the original input numbers to later layers to retain information and stabilize the values (via "Add & Norm").

26

Q

What does the FNN (Feed-Forward Network) do in a transformer?

A

It transforms the numbers into more informative numbers, just like what neural networks do.

27

Q

What is attention in language models?

A

It allows the model to focus on relevant surrounding words to improve a word's representation.

28

Q

What is multi-head attention?

A

The model looks at the word from multiple angles, each head focusing on different contextual relationships.

29

Q

What are the Query, Key, and Value in attention?

A

- Query: what you're looking for (like a question)
- Key: what each word offers (the content)
- Value: the actual information retrieved

30

Q

How does the attention mechanism work for a word like "won"?

A

The model calculates the similarity between the Query of "won" and the Keys of other words (like "prize"), then uses the corresponding Values to adjust the representation.

31

Q

What does the equation "Won = 0.6 x 'won' + 0.3 x 'prize' + 0.02 x 'home'" represent?

A

A weighted combination of word meanings based on attention scores that helps define "won" more accurately in context.
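The Query/Key/Value mechanism behind such a weighted combination can be sketched in a few lines: dot-product similarity scores between "won"'s Query and each word's Key are normalized with a softmax, and the resulting weights mix the Value vectors. All vectors below are made up for illustration; real models learn them.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-dimensional Query/Key/Value vectors (invented for the example).
query_won = [1.0, 0.0]
keys   = {"won": [1.0, 0.2], "prize": [0.8, 0.1], "home": [-0.5, 0.9]}
values = {"won": [0.4, 0.4], "prize": [0.9, 0.1], "home": [0.2, 0.7]}

# Score each word by how well its Key matches "won"'s Query.
scores = [sum(q * k for q, k in zip(query_won, keys[w])) for w in keys]
weights = softmax(scores)

# New representation of "won": attention-weighted mix of the Values.
new_won = [sum(w * values[word][i] for w, word in zip(weights, values))
           for i in range(2)]
print([round(x, 2) for x in new_won])
```

The printed vector is "won" enriched by its context, which is exactly what the weighted equation on this card expresses.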
32

Q

What does the decoder do in a transformer model?

A

It takes the final numerical representations and predicts the next word.

33

Q

What part of the transformer does GPT use?

A

Only the decoder part, since its job is to generate text.

34

Q

What kinds of machine learning problems use only the encoder part?

A

Tasks like weather forecasting, where you don't need to output text.

35

Q

How is GPT trained beyond regular prediction tasks?

A

It is updated using human feedback through reinforcement learning (e.g., RLHF: Reinforcement Learning from Human Feedback).

36

Q

Why is GPT not "just" a language model anymore?

A

It can also function as a calculator, search engine, reasoning assistant, and more, by integrating multiple capabilities.
37