Optimizing Foundation Models Flashcards
Embedding is the process by which
text, images, and audio are given a numerical representation in a vector space.
Embedding is usually performed by
a machine learning (ML) model.
Enterprise datasets, such as documents, images, and audio, are tokenized, passed to ML models, and vectorized. The resulting vectors in an n-dimensional space, along with metadata about them, are stored in purpose-built vector databases for fast retrieval.
Two words that relate to each other will have similar
embeddings.
Here is an example with two words: sea and ocean. Their embeddings are randomly initialized, so early in training they are far apart. As training progresses, their embeddings become more
similar because the words often appear close to each other and in similar contexts.
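To make this concrete, here is a minimal sketch that compares word embeddings with cosine similarity. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model, neither of which is prescribed by this material.

```python
# A minimal sketch of comparing learned embeddings, assuming the
# sentence-transformers library and the all-MiniLM-L6-v2 model
# (neither is specified in the source material).
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
vec_sea, vec_ocean, vec_car = model.encode(["sea", "ocean", "car"])

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means identical direction, near 0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vec_sea, vec_ocean))  # higher: related words
print(cosine_similarity(vec_sea, vec_car))    # lower: unrelated words
```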
The core function of vector databases is to
compactly store billions of high-dimensional vectors representing words and entities. Vector databases provide ultra-fast similarity searches across these billions of vectors in real time.
The most common algorithms used to perform the similarity search are
k-nearest neighbors (k-NN)
cosine similarity.
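For illustration, here is a toy brute-force k-NN search over unit-normalized vectors. Production vector databases rely on approximate indexes (for example, HNSW) rather than the exhaustive scan shown here.

```python
# A toy brute-force k-NN search over stored vectors using cosine
# similarity. Real vector databases use approximate indexes (for
# example, HNSW) to stay fast at billion-vector scale.
import numpy as np

rng = np.random.default_rng(0)
database = rng.normal(size=(10_000, 384))            # 10k stored embeddings
database /= np.linalg.norm(database, axis=1, keepdims=True)

def knn(query, vectors, k=5):
    query = query / np.linalg.norm(query)
    scores = vectors @ query                          # cosine similarity (unit vectors)
    top = np.argsort(scores)[::-1][:k]                # indices of the k best matches
    return top, scores[top]

indices, scores = knn(rng.normal(size=384), database)
print(indices, scores)
```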
Agents - Intermediary operations
Agents can act as intermediaries, facilitating communication between the generative AI model and various backend systems. The generative AI model handles language understanding and response generation. The various backend systems include items such as databases, CRM platforms, or service management tools.
Agents - Action launch
Agents can be used to run a wide variety of tasks. These tasks might include adjusting service settings, processing transactions, retrieving documents, and more. These actions are based on the users’ specific needs understood by the generative AI model.
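As a hypothetical sketch, an agent can map the model's structured output to backend handlers. Every name here (the actions, handlers, and output format) is illustrative rather than taken from any specific framework.

```python
# A hypothetical sketch of an agent routing a model-chosen action to a
# backend system. The action names, handlers, and the structured output
# format are all illustrative, not from any specific framework.
def retrieve_document(doc_id: str) -> str:
    return f"contents of {doc_id}"          # stand-in for a document store call

def process_transaction(amount: float) -> str:
    return f"processed ${amount:.2f}"       # stand-in for a payments backend

ACTIONS = {
    "retrieve_document": retrieve_document,
    "process_transaction": process_transaction,
}

def dispatch(model_output: dict) -> str:
    # The generative model decides *what* to do; the agent performs it.
    handler = ACTIONS[model_output["action"]]
    return handler(**model_output["arguments"])

print(dispatch({"action": "retrieve_document", "arguments": {"doc_id": "policy-42"}}))
```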
Agents - Feedback integration
Agents can also contribute to the AI system’s learning process by collecting data on the outcomes of their actions. This feedback helps refine the AI model, enhancing its accuracy and effectiveness in future interactions.
Human evaluation involves real users interacting with the AI model to provide feedback based on their experience. This method is particularly valuable for assessing qualitative aspects of the model, such as the following:
User experience: How intuitive and satisfying is the interaction with the model from the user’s perspective?
Contextual appropriateness: Does the model respond in a way that is contextually relevant and sensitive to the nuances of human communication?
Creativity and flexibility: How well does the model handle unexpected queries or complex scenarios that require a nuanced understanding?
Human evaluation is often used for iterative improvements and for tuning the model to better meet user expectations.
Benchmark datasets, on the other hand, provide a quantitative way to evaluate generative AI models. These consist of predefined datasets and associated metrics that offer a consistent, objective means to measure model performance, such as the following:
Accuracy
Speed and Efficiency
Scalability
Creating a benchmark dataset is a
manual process that is necessary to properly evaluate the performance of LLMs in RAG systems.
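A minimal sketch of such a hand-built benchmark, scored with exact-match accuracy, might look like this; rag_answer is a hypothetical stand-in for the RAG pipeline under test.

```python
# A minimal hand-built benchmark and an exact-match accuracy metric.
# `rag_answer` is a hypothetical stand-in for the RAG pipeline under test.
benchmark = [
    {"question": "What year was the company founded?", "reference": "1998"},
    {"question": "Who is the current CEO?",            "reference": "Jane Doe"},
]

def rag_answer(question: str) -> str:
    return "1998"  # placeholder; a real system retrieves context and generates

correct = sum(rag_answer(ex["question"]).strip().lower() == ex["reference"].lower()
              for ex in benchmark)
print(f"exact-match accuracy: {correct / len(benchmark):.2f}")
```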
In practice, a combination of
both human evaluation and benchmark datasets is often used to provide a comprehensive overview of a model’s performance.
LLM as a judge
Evaluation of LLM performance against a benchmark dataset can be automated using this approach.
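A minimal LLM-as-a-judge sketch, assuming the openai Python client; the model name, rubric, and 1-5 scale are illustrative choices, not part of this material.

```python
# A minimal LLM-as-a-judge sketch, assuming the openai Python client.
# The model name, rubric, and 1-5 scale are illustrative choices.
from openai import OpenAI

client = OpenAI()

def judge(question: str, reference: str, candidate: str) -> str:
    prompt = (
        "Rate the candidate answer from 1 (poor) to 5 (excellent) for "
        "correctness against the reference. Reply with the number only.\n"
        f"Question: {question}\nReference: {reference}\nCandidate: {candidate}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(judge("What year was the company founded?", "1998", "It was founded in 1998."))
```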
Fine-tuning is critical because it helps
Increase specificity
Improve accuracy
Reduce biases
Boost efficiency
Fine-tuning - Instruction tuning
This approach involves retraining the model on a new dataset that consists of prompts followed by the desired outputs. The data is structured so that the model learns to follow specific instructions better. This method is particularly useful for improving the model’s ability to understand and execute user commands accurately, making it highly effective for interactive applications such as virtual assistants and chatbots.
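A common (but not required) way to prepare such data is a JSONL file of prompt/response pairs; the field names below are illustrative.

```python
# A sketch of preparing an instruction-tuning dataset as JSONL.
# The field names ("instruction", "response") are a common convention,
# not a required format.
import json

examples = [
    {"instruction": "Summarize: The meeting covered Q3 revenue and hiring.",
     "response": "Q3 revenue and hiring were discussed."},
    {"instruction": "Translate to French: Good morning.",
     "response": "Bonjour."},
]

with open("instruction_tuning.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")  # one prompt/response pair per line
```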
Fine-tuning - Reinforcement learning from human feedback (RLHF)
This approach is a fine-tuning technique where the model is initially trained using supervised learning to predict human-like responses. Then, it is further refined through a reinforcement learning process, where a reward model built from human feedback guides the model toward generating more preferable outputs. This method is effective in aligning the model’s outputs with human values and preferences, thereby increasing its practical utility in sensitive applications.
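The reward model in RLHF is typically trained with a pairwise (Bradley-Terry style) loss that pushes the score of the human-preferred response above the rejected one. A minimal PyTorch sketch, with placeholder reward scores:

```python
# The pairwise loss typically used to train an RLHF reward model:
# push the score of the human-preferred response above the rejected one.
# The scores here are placeholders for a reward model's outputs.
import torch
import torch.nn.functional as F

reward_chosen   = torch.tensor([2.1, 0.3, 1.5])   # r(prompt, preferred response)
reward_rejected = torch.tensor([1.0, 0.8, -0.2])  # r(prompt, rejected response)

# Bradley-Terry style objective: -log sigmoid(r_chosen - r_rejected)
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss.item())
```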
Fine-tuning - Adapting models for specific domains
This approach involves fine-tuning the model on a corpus of text or data that is specific to a particular industry or sector. An example of this would be legal documents for a legal AI or medical records for a healthcare AI. This specificity enables the model to perform with a higher degree of relevance and accuracy in domain-specific tasks, providing more useful and context-aware responses.
Fine-tuning - Transfer learning
This approach is a method where a model developed for one task is reused as the starting point for a model on a second task. For foundational models, this often means taking a model that has been trained on a vast, general dataset, then fine-tuning it on a smaller, specific dataset. This method is highly efficient in using learned features and knowledge from the general training phase and applying them to a narrower scope with less additional training required.
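A short transfer-learning sketch using Hugging Face transformers: freeze the pretrained encoder and train only a new task head. The checkpoint and label count are illustrative choices.

```python
# A transfer-learning sketch using Hugging Face transformers: reuse a
# pretrained encoder, freeze it, and train only the new classification
# head. The checkpoint and label count are illustrative choices.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze the pretrained encoder; only the classifier head stays trainable.
for param in model.bert.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # just the small new head
```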
Fine-tuning - Continuous pretraining
This approach involves extending the training phase of a pre-trained model by continuously feeding it new and emerging data. This approach is used to keep the model updated with the latest information, vocabulary, trends, or research findings, ensuring its outputs remain relevant and accurate over time.
The data preparation for fine-tuning is distinct from that for initial training for the following reasons:
Specificity: The dataset for fine-tuning is much more focused, containing examples that are directly relevant to the specific tasks or problems the model needs to solve.
High relevance: Data must be highly relevant to the desired outputs. Examples include legal documents for a legal AI or customer service interactions for a customer support AI.
Quality over quantity: Although the initial training requires massive amounts of data, fine-tuning can often achieve significant improvements with much smaller, but well-curated datasets.
Key steps in fine-tuning data preparation - Data curation
Data curation: Although it continues the initial data preparation process, this step involves a more rigorous selection process to ensure every piece of data is highly relevant and contributes to the model’s learning in the specific context.
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate
automatic text summarization, as well as machine translation quality, in NLP.
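A minimal ROUGE computation, assuming Google’s rouge-score package (pip install rouge-score):

```python
# A minimal ROUGE computation, assuming the rouge-score package.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
reference = "The cat sat on the mat."
candidate = "A cat was sitting on the mat."

# Each entry holds precision, recall, and F-measure for that metric.
scores = scorer.score(reference, candidate)
print(scores["rouge1"].fmeasure, scores["rougeL"].fmeasure)
```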