LLM Flashcards

1
Q

What are 4 common methods of building LLM-based applications?

A
  1. Training models from scratch
  2. Fine-tuning open-source models
  3. Using Hosted APIs
  4. In-Context Learning

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

2
Q

What is the core idea of In-context learning?

A

To use LLMs “off the shelf” (i.e. without any fine-tuning), then control their behavior through clever prompting and conditioning on private “contextual” data.

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

3
Q

What is a context window?

A

The maximum amount of text (measured in tokens) that a model can process in a single prompt.

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

4
Q

What is the approximate context window of current GPT models?

A

About 50 pages of input text (for the largest GPT-4 model, as of the June 2023 source article).

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

5
Q

How does in-context learning solve the context limit problem?

A

Instead of sending the complete dataset with every prompt, we send only a small subset of it: the elements most relevant to the query. Which elements are most relevant is determined by using... LLMs!

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

6
Q

What are the 3 stages of the In-context learning workflow?

A
  1. Data preprocessing / embedding
  2. Prompt construction / retrieval
  3. Prompt execution / inference

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

7
Q

What is involved in the Data preprocessing / embedding stage of the in-context workflow?

A

Storing private data (e.g. legal docs) to be retrieved later. Typically these datasets (documents) are broken into chunks, passed through an embedding model and then stored in a specialized database called a vector database.
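
The chunk, embed, and store steps described above can be sketched roughly as follows. This is only an illustration under loud assumptions: `chunk_words`, `toy_embed`, and `build_store` are hypothetical names, the hashing-based `toy_embed` is a stand-in for a real embedding model, and a plain Python list stands in for a vector database.

```python
import hashlib


def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into overlapping chunks of roughly chunk_size words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text, dims=8):
    """Stand-in for an embedding model: hash each word into one of dims buckets."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec


def build_store(documents):
    """Chunk every document, embed each chunk, and keep the results in a list.

    The list plays the role of the vector database in this sketch.
    """
    store = []
    for doc_id, text in documents.items():
        for n, chunk in enumerate(chunk_words(text)):
            store.append({
                "id": f"{doc_id}-{n}",
                "text": chunk,
                "vector": toy_embed(chunk),
            })
    return store
```

Keeping the original chunk text next to its vector matters because the retrieval stage needs the text, not the vector, to build a prompt.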

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

8
Q

What is involved in the Prompt Construction / retrieval stage of the in-context workflow?

A

When a user submits a query (e.g. a legal question), the application constructs a series of prompts to submit to the language model.

Each of these compiled prompts typically consists of:
* Prompt template (often hard-coded)
* Examples of valid outputs (few-shot examples)
* Additional required information acquired from external APIs
* Relevant documents retrieved from a vector database (loaded during pre-processing/embedding)
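
As a rough sketch, the four ingredients listed above might be combined like this. The template wording, `PROMPT_TEMPLATE`, and `build_prompt` are hypothetical and invented for illustration; orchestration frameworks provide their own prompt-template abstractions.

```python
# Hypothetical hard-coded template; the wording is an assumption for illustration.
PROMPT_TEMPLATE = """You are a legal assistant. Answer using only the context provided.

Examples of valid answers:
{examples}

Context documents:
{context}

Additional information:
{extra}

Question: {question}
Answer:"""


def build_prompt(question, retrieved_docs, few_shot_examples, extra_info=""):
    """Fill the template with few-shot examples, retrieved documents, and API data."""
    examples = "\n".join(f"Q: {q}\nA: {a}" for q, a in few_shot_examples)
    context = "\n---\n".join(retrieved_docs)
    return PROMPT_TEMPLATE.format(
        examples=examples,
        context=context,
        extra=extra_info or "(none)",
        question=question,
    )
```

A call such as `build_prompt("Is clause 4 enforceable?", ["Clause 4 text ..."], [("Example question", "Example answer")])` returns a single string ready to send to the model.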

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

9
Q

What is involved in the Prompt Execution / Inference stage of the in-context workflow?

A

After prompts are compiled in the retrieval stage, they are sent to a pre-trained LLM for inference.
Types of LLMs used here can include:
* Proprietary model APIs
* Open Source
* Self Trained

In some cases, logging, caching and validation are implemented in this stage.

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

10
Q

What is the alternative to in-context learning and why would that not be a palatable option?

A

Alternative: Training or fine-tuning the LLM itself, which typically requires a specialized team of ML engineers; in-context learning does not.

Other benefits of in-context learning:
* No need to host your own infrastructure
* No need to buy an expensive dedicated instance from OpenAI

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

11
Q

How often does a specific piece of information need to occur in a training set before an LLM will remember it through fine-tuning?

A

At least ~10 times

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

12
Q

What is the problem with just increasing the context window of the underlying model?

A

Currently, the cost and time of inference scale quadratically with the length of the prompt. Even linear scaling would be cost-prohibitive at this point (a single GPT-4 query over 10,000 pages could cost hundreds of dollars).
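
To get a feel for the difference between quadratic and linear scaling, here is a small arithmetic sketch. It assumes cost follows a pure power law in prompt length, which is a simplification; `relative_cost` is a hypothetical helper, and no real pricing is modeled.

```python
def relative_cost(prompt_pages, baseline_pages, exponent=2):
    """How many times more expensive a prompt is than a baseline prompt,

    assuming cost grows as (length ** exponent). exponent=2 is the quadratic
    case, exponent=1 the linear case.
    """
    return (prompt_pages / baseline_pages) ** exponent


# Comparing a 10,000-page prompt with a 50-page prompt:
quadratic = relative_cost(10_000, 50)            # 200x longer, 40,000x the cost
linear = relative_cost(10_000, 50, exponent=1)   # 200x longer, 200x the cost
```

Under this assumption, a prompt 200 times longer is 40,000 times more expensive with quadratic scaling, and still 200 times more expensive with linear scaling.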

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

13
Q

What are 4 types of files/documents that would be used for contextual data?

A
  1. Text
  2. PDF
  3. CSV
  4. SQL Extracts

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

14
Q

What are 2 standard approaches used for data loading and transformation in the pre-processing/embedding stage?

What are two examples of each?

A

Traditional ETL Tools
* Databricks
* Airflow

Document Loaders built into LLM orchestration frameworks:
* LangChain (powered by Unstructured)
* LlamaIndex (powered by Llama Hub)

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

15
Q

What is the most commonly used proprietary API (i.e. an embedding model) for embedding, and what is an emerging choice among enterprises?

A
  • Most commonly used: OpenAI
  • Emerging choice: Cohere (focuses more narrowly on embeddings)

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

16
Q

What is a popular open-source library (i.e. embedding model) for embedding?

A

The Sentence Transformers library from Hugging Face

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

17
Q

What is a vector database responsible for?

A

Efficiently storing, comparing and retrieving up to billions of embeddings (i.e. vectors).

Vector databases offer optimized storage and query capabilities for embeddings.

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

18
Q

What is a vector embedding?

noun

A

A type of data representation that carries semantic information within it. That information is critical for an AI system to gain understanding and to maintain a long-term memory it can draw upon when executing complex tasks.

https://www.pinecone.io/learn/vector-database/

19
Q

What is contained within an embedding and why is it important?

A

Embeddings contain a large number of attributes (or features), each representing a different dimension of the data. These are essential for understanding patterns, relationships and underlying structures.

https://www.pinecone.io/learn/vector-database/

20
Q

What does an embedding model do?

A

It creates vector embeddings for the context we want to index.

The vector embeddings take the form of arrays of numbers, but still maintain relationships between vectors that make sense (in the real world).

https://www.pinecone.io/learn/vector-database/

21
Q

What are 3 common similarity measures used by Pinecone?

A
  1. Cosine Similarity
  2. Euclidean Distance
  3. Dot Product

https://www.pinecone.io/learn/vector-database/

22
Q

What is involved in the cosine similarity measure?

A

Measures the cosine of the angle between two vectors, with results ranging from -1 to 1.

1 = identical direction
0 = orthogonal
-1 = diametrically opposed
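
A minimal sketch of the calculation in plain Python, assuming vectors are given as equal-length lists of numbers (`cosine_similarity` is a name chosen here for illustration, not a Pinecone function):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because the result is divided by both magnitudes, only direction matters: `[1, 0]` and `[2, 0]` score 1 even though their lengths differ.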

https://www.pinecone.io/learn/vector-database/

23
Q

What is involved in the Euclidean distance similarity measure?

A

This is a calculation of the straight-line distance between the points represented by two vectors.

Range of output is 0 -> inf
0 = identical
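
A minimal sketch in plain Python, assuming equal-length lists of numbers (`euclidean_distance` is an illustrative name):

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between the points a and b; 0 means identical."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Unlike cosine similarity, a smaller value means more similar here, and the magnitude of the vectors affects the result.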

https://www.pinecone.io/learn/vector-database/

24
Q

What is involved in the Dot Product similarity measure?

A

It measures the product of the magnitudes of two vectors and the cosine of the angle between them.

Range of output: -inf to inf
> 0 = vectors pointing in broadly the same direction (angle under 90 degrees)
0 = orthogonal
< 0 = vectors pointing in broadly opposite directions (angle over 90 degrees)

https://www.pinecone.io/learn/vector-database/
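
For the dot product card above, a minimal sketch in plain Python (`dot_product` and `magnitude` are illustrative names). The component-wise sum computed here equals the product of the two magnitudes and the cosine of the angle between the vectors.

```python
import math


def dot_product(a, b):
    """Sum of component-wise products of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Length of v, i.e. the square root of its dot product with itself."""
    return math.sqrt(dot_product(v, v))
```

For example, `[3, 0]` and `[2, 2]` are 45 degrees apart, and `magnitude([3, 0]) * magnitude([2, 2]) * cos(45 degrees)` matches `dot_product([3, 0], [2, 2])` up to floating-point rounding.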

25
Q

What is a similarity measure?

A

A mathematical method for determining how similar two vectors are in a vector space.

https://www.pinecone.io/learn/vector-database/

26
Q

What are the pros & cons of post-filtering a vector query?

A

Pros: Helps to ensure all relevant results are considered.

Cons: Additional overhead / query slowdown, because irrelevant results are filtered out only after the vector search is complete.

https://www.pinecone.io/learn/vector-database/

27
Q

What are the pros & cons of pre-filtering a vector query?

A

Pros: Helps to reduce the search space

Cons: May cause the system to overlook relevant results that don’t match the filter criteria. Also, extensive filtering may slow down the query process due to additional overhead.
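
The difference in ordering between pre-filtering and post-filtering can be sketched with a brute-force search over an in-memory list. Everything here is an assumption for illustration: the item layout (`id`, `vector`, metadata fields), the function names, and the use of the dot product as the score. Real vector databases use indexes rather than scanning every item.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, items, k):
    """Brute-force search: the k items whose vectors score highest against query."""
    return sorted(items, key=lambda it: dot(query, it["vector"]), reverse=True)[:k]


def pre_filter_search(query, items, predicate, k):
    """Apply the metadata filter first, then search only the survivors."""
    candidates = [it for it in items if predicate(it)]
    return top_k(query, candidates, k)


def post_filter_search(query, items, predicate, k):
    """Search everything first, then drop results that fail the metadata filter."""
    return [it for it in top_k(query, items, k) if predicate(it)]


items = [
    {"id": "a", "vector": [1.0, 0.0], "year": 2021},
    {"id": "b", "vector": [0.9, 0.1], "year": 2023},
    {"id": "c", "vector": [0.2, 0.8], "year": 2023},
]
is_recent = lambda it: it["year"] == 2023
pre = pre_filter_search([1.0, 0.0], items, is_recent, k=2)
post = post_filter_search([1.0, 0.0], items, is_recent, k=2)
```

In this toy data, `pre` contains items b and c, while `post` contains only b: the search saw every item, but item a took one of the two slots before being filtered out. A post-filtered query can therefore return fewer than k results.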

https://www.pinecone.io/learn/vector-database/

28
Q

What is vector indexing?

A

It is the process of mapping vectors into data structures that will enable faster searching.

https://www.pinecone.io/learn/vector-database/

29
Q

At a high level what is happening when we query a vector database?

A
  1. A query vector is created from the query, using the same embedding model that was used for the indexed content.
  2. The query vector is compared to the indexed vectors in the database using a similarity measure.
  3. The most relevant (most similar) results are returned.
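
The steps above can be sketched end to end with a toy setup. The assumptions are loud ones: `VOCAB` and the word-count `embed` function stand in for a real embedding model, a list of (document, vector) pairs stands in for the database, and every vector is compared by brute force instead of through an index.

```python
import math

# Hypothetical vocabulary; a real embedding model learns its own dimensions.
VOCAB = ["contract", "lease", "tenant", "patent", "invention"]


def embed(text):
    """Toy embedding: count how often each vocabulary term appears."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(docs):
    """Embed each document once, ahead of query time."""
    return [(doc, embed(doc)) for doc in docs]


def query(index, question, k=2):
    """Embed the question, score it against every stored vector, return the top k."""
    q = embed(question)
    scored = [(cosine(q, vec), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]
```

Note that the question is embedded with the same `embed` function as the documents; comparing vectors from two different embedding models would not be meaningful.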

https://www.pinecone.io/learn/vector-database/

30
Q

What are 2 definitions of a vector?

A
  1. A mathematical structure with a size (magnitude) and a direction.
  2. A point in space, with the direction being an arrow from the origin (0,0,0) to that point in the vector space.

https://www.pinecone.io/learn/vector-embeddings-for-developers/

31
Q

How can a vector be represented by a traditional data structure?

A

As an array of numbers.

https://www.pinecone.io/learn/vector-embeddings-for-developers/

32
Q

What is the process of vector embedding?

verb

A

A technique that allows us to take an object of virtually any data type and represent it as a vector.

https://www.pinecone.io/learn/vector-embeddings-for-developers/

33
Q

What is the key purpose behind embedding models?

A

They provide an approach for generating vectors that preserve the meaning of the input data, such that the relationships between vectors make sense.

https://www.pinecone.io/learn/vector-embeddings-for-developers/

34
Q

What is a popular example of an embedding model?

A

Word2Vec

https://www.pinecone.io/learn/vector-embeddings-for-developers/

35
Q

What is a good tool for visualizing an embedding model?

A

TensorFlow’s Embedding Projector

https://www.pinecone.io/learn/vector-embeddings-for-developers/

36
Q

What is a common method for generating an embedding model?

A

Passing large amounts of labelled data to a neural network.

https://www.pinecone.io/learn/vector-embeddings-for-developers/

37
Q

What are 6 common uses of vector embeddings?

A
  1. Semantic Search: go beyond keyword matching to get results with similar meaning
  2. Q & A Applications: a model is trained on pairs of questions and answers and can eventually provide answers to previously unseen questions
  3. Image Search: e.g. CLIP, ResNet
  4. Audio Search: e.g. Shazam
  5. Recommender Systems: e.g. Amazon
  6. Anomaly Detection

https://www.pinecone.io/learn/vector-embeddings-for-developers/

38
Q

What is a synonym for vector?

A

embedding

39
Q

What is a synonym for embedding?

A

vector

40
Q

What is zero-shot prompting?

A

Simple prompts consisting of direct instructions only, with no example outputs.

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

41
Q

What is few-shot prompting?

A

“Few-shot” prompts are prompts that include a few examples of the desired output alongside the instructions.
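
The contrast with zero-shot prompting can be sketched as two prompt builders. The `Input:` / `Output:` layout and the function names are assumptions chosen for illustration, not a required format.

```python
def zero_shot_prompt(instruction, text):
    """Direct instruction only: no worked examples."""
    return f"{instruction}\n\nInput: {text}\nOutput:"


def few_shot_prompt(instruction, examples, text):
    """Same instruction, preceded by a few (input, output) examples."""
    shots = "\n\n".join(f"Input: {i}\nOutput: {o}" for i, o in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {text}\nOutput:"
```

For instance, `few_shot_prompt("Classify the sentiment.", [("I loved it", "positive"), ("Terrible service", "negative")], "Pretty good overall")` shows the model two solved cases before asking for the third.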

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

42
Q

What is LangChain?

A

An (orchestration) framework designed to simplify the creation of applications using LLMs

https://en.wikipedia.org/wiki/LangChain

43
Q

What are orchestration frameworks (e.g. LangChain and LlamaIndex) used for?

A

They abstract away many of the details of prompt chaining, such as:
* Interfacing with external APIs
* Retrieving contextual data from vector DBs
* Maintaining memory across multiple LLM calls

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

44
Q

NEXT: Go through the AI Canon from AH.AI Canon

A
45
Q

What are LLMs like ChatGPT fundamentally trying to do?

A

Produce a reasonable continuation of whatever text they have received so far...

...where “reasonable” means “what we might expect someone to write after seeing what people have written on billions of webpages, etc.”

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

46
Q

When producing each new token in a continuation of text, what is an LLM doing?

A

In effect, it looks for text that matches in meaning (not literal text) and works out which token comes next what fraction of the time. The result is a ranked list of candidate next tokens, each with a probability.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

47
Q

What is the concept / parameter used to introduce randomness into text generation? What value has been found in practice to work well?

A

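temperature

0.8 (reported to work best for essay generation; this is an empirical finding, not something derived from theory)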
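
One common way to apply a temperature is to raise each next-token probability to the power 1/T and renormalize, which is equivalent to dividing the logits by T before the softmax. The sketch below assumes that formulation; `apply_temperature` and `sample_token` are illustrative names, and the source article does not prescribe this exact code.

```python
import random


def apply_temperature(probs, temperature):
    """Reshape a probability distribution.

    T < 1 sharpens it (favoring the top token), T > 1 flattens it,
    and T = 1 leaves it unchanged.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_token(tokens, probs, temperature=0.8, rng=random):
    """Pick the next token at random according to the reshaped distribution."""
    weights = apply_temperature(probs, temperature)
    return rng.choices(tokens, weights=weights, k=1)[0]
```

Passing a seeded `random.Random(...)` instance as `rng` makes the sampling reproducible, which is handy when comparing temperatures.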

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

48
Q

NEXT: return to the “Surely a Network That’s Big Enough Can Do Anything!” section of this article

specifically where the “Training Progress” chart is

A

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

49
Q

What is a transformer in the context of ML, and what 3 characteristics make it unique?

A

Transformers are a type of neural network, with the following unique characteristics:

  1. Self-attention
  2. Positional embeddings
  3. Multi-head attention

https://serokell.io/blog/transformers-in-ml

50
Q

What does GPT stand for?

A

Generative Pre-Trained Transformer

https://serokell.io/blog/transformers-in-ml

51
Q

What is a word embedding?
(Possible duplicate)

A

Vector representations of words.

https://serokell.io/blog/transformers-in-ml

52
Q

How does self-attention come into play within transformer architecture?

A

It allows the model to weigh the importance of different parts of the input sequence against each other.
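
A heavily simplified sketch of that weighting, using scaled dot-product attention in plain Python. The big assumption, chosen to keep the example short, is that each token vector serves directly as its own query, key, and value; real transformers first apply learned projection matrices, which are omitted here.

```python
import math


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """X is a list of token vectors. Returns (outputs, weights).

    weights[i][j] is how much token i attends to token j; outputs[i] is the
    weighted average of all token vectors using those weights.
    """
    d = len(X[0])
    outputs, all_weights = [], []
    for q in X:
        # Score this token against every token in the sequence, itself included.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
        all_weights.append(weights)
    return outputs, all_weights
```

Each row of the returned weights sums to 1, so every output vector is a blend of the input vectors, with more similar tokens contributing more.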

https://serokell.io/blog/transformers-in-ml

53
Q

How does multi-head attention come into play within transformer architecture?

A

It allows the network to learn multiple ways of weighting the input sequence against itself.

The vectors representing tokens are broken up into multiple parts called “heads”, which each go through a similar attention computation. These computations can be parallelized, allowing for faster model training.

https://serokell.io/blog/transformers-in-ml

54
Q

NEXT: “Where are Transformers Used?” section of this article

A