Basics Flashcards
large language model (LLM)
LLMs are advanced computer models designed to understand and generate humanlike text. They’re trained on vast amounts of text data to learn patterns, language structures, and relationships between words and sentences.
Parameters
are the values that the model learns during its training process; together they encode the model's understanding of language. The more parameters, the more capacity the model has to learn and capture intricate patterns in the data, improving its ability to produce humanlike text.
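To make "parameters" concrete, here is a minimal sketch in PyTorch (the architecture and layer sizes are arbitrary, purely for illustration). Every weight and bias below is a parameter the model adjusts during training:

```python
# Minimal sketch: counting the learnable parameters of a tiny network.
# The layer sizes are illustrative only.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 256),  # weights: 128*256, biases: 256
    nn.ReLU(),
    nn.Linear(256, 10),   # weights: 256*10, biases: 10
)

total = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"learnable parameters: {total}")  # 32768 + 256 + 2560 + 10 = 35,594
```

Modern LLMs apply the same idea at a vastly larger scale, with billions of such values.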
Fine-tuning
Fine-tuning is the process of further training a pre-trained model on a new dataset that is smaller and more specific than the original training dataset.
Imagine you’ve taught a robot to cook dishes from all over the world using the world’s biggest cookbook. That’s the basic training. Now, let’s say you want the robot to specialize in making just Italian dishes. You’d then give it a smaller, detailed Italian cookbook and have it practice those recipes. This specialized practice is like fine-tuning.
Fine-tuning is like taking a robot (or model) that knows a little bit about a lot of things, and then training it further on a specific topic until it becomes an expert in that area.
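As a rough sketch of the idea (not any particular product's API; `pretrained_model`, its `head` layer, and the data loader are hypothetical placeholders), fine-tuning often means continuing training on the new, smaller dataset, sometimes with most of the original weights frozen:

```python
# Minimal fine-tuning sketch in PyTorch: keep the pre-trained "base"
# weights frozen and continue training only the final layer on a small,
# task-specific dataset. `pretrained_model`, its `head` attribute, and
# `data_loader` are hypothetical placeholders, not a real API.
import torch

def fine_tune(pretrained_model, data_loader, epochs=3):
    # Freeze everything: the robot keeps its general cooking knowledge.
    for param in pretrained_model.parameters():
        param.requires_grad = False
    # Unfreeze the final layer: it practices only the new "recipes".
    for param in pretrained_model.head.parameters():
        param.requires_grad = True

    optimizer = torch.optim.AdamW(
        (p for p in pretrained_model.parameters() if p.requires_grad), lr=1e-4
    )
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(epochs):
        for inputs, labels in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(pretrained_model(inputs), labels)
            loss.backward()
            optimizer.step()
```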
Einstein Trust Layer
Trust is the number one value at Salesforce. So it makes sense that Salesforce requires using Large Language Models (LLMs) in a secure and trusted way. And the key to maintaining this trust is the Einstein Trust Layer. The Einstein Trust Layer ensures generative AI is secure by using data and privacy controls that are seamlessly integrated into the Salesforce end-user experience. These controls let Einstein deliver AI that securely uses retrieval augmented generation (RAG) to ground responses in your customer and company data, without introducing potential security risks. In its simplest form, the Einstein Trust Layer is a sequence of gateways and retrieval mechanisms that together enable trusted and open generative AI.
RAG
Retrieval augmented generation. RAG grounds an LLM's responses in relevant information retrieved at request time, such as your customer and company data, so answers draw on trusted sources rather than on the model's training data alone.
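A minimal sketch of the RAG pattern, assuming hypothetical `search_company_data` and `call_llm` helpers (this shows the general technique, not the Einstein Trust Layer's actual implementation):

```python
# Minimal RAG sketch: retrieve relevant records first, then ground the
# prompt in them. `search_company_data` and `call_llm` are hypothetical
# stand-ins for a real retriever and LLM client.
def answer_with_rag(question: str) -> str:
    # 1. Retrieval: find the most relevant customer/company records.
    documents = search_company_data(question, top_k=3)

    # 2. Augmentation: ground the prompt in the retrieved data.
    context = "\n".join(doc.text for doc in documents)
    prompt = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generation: the LLM answers from trusted data, not memory alone.
    return call_llm(prompt)
```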
BYOM
Bring Your Own Model
If you are already investing in your own model, the bring your own model (BYOM) option can help.
You can benefit from Einstein even if you’ve trained your own domain-specific models outside of Salesforce while storing data on your own infrastructure. These models, whether running through Amazon SageMaker or Google Vertex AI, will connect directly to Einstein through the Einstein Trust Layer. In this scenario, customer data can remain within the customers’ trust boundaries.
The BYOM options are changing fast! Keep an eye on the resources for new updates.
Accuracy - responsible AI principles
Agents should prioritize accurate results. We must develop them with thoughtful constraints like topic classification, a process where user inputs are mapped to topics that contain a relevant set of instructions, business policies, and actions to fulfill that request. This provides clear instructions on what actions the agent can and can't take on behalf of a human. And, if there is uncertainty about the accuracy of a response, the agent should enable users to validate that response, whether through citations, explainability, or other means.
Agentforce ensures that generated content is backed by verifiable data sources, allowing users to cross-check and validate the information. The Atlas Reasoning Engine, the brain behind Agentforce, also enables topic classification to set clear guardrails and ensure reliable results.
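A minimal sketch of the topic-classification idea in Python (the topics, keywords, and actions are invented for illustration; production systems typically classify with an LLM or a trained model rather than keyword matching):

```python
# Minimal sketch of topic classification as a guardrail: map the user's
# request to a topic, and expose only that topic's approved actions.
# Topics, keywords, and actions here are illustrative, not Agentforce's
# actual configuration.
TOPICS = {
    "order_status": {
        "keywords": ("order", "shipping", "delivery", "tracking"),
        "allowed_actions": ["look_up_order", "send_tracking_link"],
    },
    "returns": {
        "keywords": ("return", "refund", "exchange"),
        "allowed_actions": ["start_return", "check_return_policy"],
    },
}

def classify_topic(user_input: str):
    text = user_input.lower()
    for topic, config in TOPICS.items():
        if any(word in text for word in config["keywords"]):
            return topic, config["allowed_actions"]
    return None, []  # no match: escalate to a human instead of guessing

topic, actions = classify_topic("Where is my order? I need tracking info.")
print(topic, actions)  # order_status ['look_up_order', 'send_tracking_link']
```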
Safety - responsible AI principles
We must mitigate bias, toxicity, and harmful outputs by conducting bias, explainability, and robustness assessments, and ethical red teaming. Agent responses and actions should also prioritize privacy protection for any personally identifying information (PII) present in the data used for training and create guardrails to prevent additional harm.
Agentforce includes built-in toxicity detection through the Einstein Trust Layer, a robust set of guardrails that protects the privacy and security of customer data and flags potentially harmful content before it reaches the end user. This is in addition to default model containment policies and prompt instructions that limit the scope of what an AI agent can and will respond to. For example, an LLM can be guided to avoid using gender identity, age, race, sexual orientation, socioeconomic status, and other such variables.
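One guardrail idea, sketched minimally: mask obvious PII before text ever reaches the LLM. The Einstein Trust Layer's real detection is far more sophisticated than these illustrative regexes:

```python
# Minimal sketch of a PII-masking guardrail: replace obvious personally
# identifying information before a prompt is sent to an LLM. The
# patterns are illustrative, not production-grade detection.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(mask_pii("Reach Jane at jane@example.com or 555-123-4567."))
# Reach Jane at <EMAIL> or <PHONE>.
```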
PII
personally identifying information
Honesty - responsible AI principles
When collecting data to train and evaluate our models, we need to respect data provenance and ensure that we have consent to use data (e.g., open-source, user-provided). We must also be transparent that an AI has created content when it is autonomously delivered (e.g., a disclaimer in a chatbot response to a consumer, or use of watermarks on an AI-generated image).
Agentforce is designed with standard disclosure patterns baked into AI agents that send outbound content. Agentforce Sales Development Representative and Agentforce Service Agent, for example, clearly disclose that content is AI-generated, whether in outbound messages or in conversations with customers and prospects, ensuring transparency with users and recipients.
Empowerment - responsible AI principles
We build agentic AI to supercharge human capabilities, enabling everyone to achieve more in less time and focus on what matters most. Accessibility is a foundational element of this effort, ensuring our AI solutions empower all individuals, including people with disabilities, by enhancing independence, productivity, and opportunities. In some cases, it is best to fully automate processes, but in others, AI should play a supporting role to humans — especially where human judgment is required.
Agentforce empowers people to take control of high-risk decisions while automating some routine tasks, ensuring humans and AI work together to leverage their respective strengths.
Sustainability - responsible AI principles
Model developers should focus on creating right-sized models where possible to reduce their carbon footprint. When it comes to AI models, larger doesn’t always mean better: In some instances, smaller, better-trained models outperform larger, general-purpose models. Additionally, efficient hardware and low-carbon data centers can further reduce environmental impact.
Agentforce leverages a variety of optimized models, including xLAM and xGen-Sales developed by Salesforce Research, which are specifically tailored to each use case. This approach enables high performance with a fraction of the environmental impact.
EDA
Exploratory data analysis (EDA) is usually the first step in any data project. The goal of EDA is to learn about general patterns in data and understand the insights and key characteristics about it.
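A typical first pass at EDA with pandas might look like the sketch below (the cases.csv file and the status column are hypothetical):

```python
# Minimal EDA sketch with pandas: the usual first-look commands for
# spotting types, patterns, missing values, and outliers.
import pandas as pd

df = pd.read_csv("cases.csv")   # hypothetical dataset

print(df.shape)         # how many rows and columns?
print(df.dtypes)        # what type is each column?
print(df.head())        # what do a few records look like?
print(df.describe())    # summary stats: mean, spread, outliers
print(df.isna().sum())  # missing values per column
print(df["status"].value_counts())  # distribution of a key category (assumed column)
```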
Training and performance - Data Quality in AI
The quality of the data used for training AI models directly impacts their performance. High-quality data ensures that the model learns accurate and representative patterns, leading to more reliable predictions and better decision-making.
Accuracy and bias - Data Quality in AI
Data quality is vital in mitigating bias within AI systems. Biased or inaccurate data can lead to biased outcomes, reinforcing existing inequalities or perpetuating unfair practices. By ensuring data quality, organizations can strive for fairness and minimize discriminatory outcomes.
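One simple, concrete bias check is to compare outcome rates across groups (demographic parity). A minimal sketch with invented data:

```python
# Minimal bias-check sketch: compare a model's positive-outcome rate
# across groups. The DataFrame and its columns are invented for
# illustration; real audits use many more metrics.
import pandas as pd

results = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B"],
    "approved": [1,   1,   0,   1,   0,   0],
})

rates = results.groupby("group")["approved"].mean()
print(rates)                      # group A ~0.67, group B ~0.33
print(rates.max() - rates.min())  # a large gap flags potential bias
```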
Generalization and robustness - Data Quality in AI
AI models should be able to handle new and unfamiliar data effectively and perform consistently well in different situations. High-quality data ensures that the model learns relevant and diverse patterns, enabling it to make accurate predictions in situations it has not seen before.
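Generalization is usually measured by evaluating on held-out data the model never saw during training. A minimal sketch using scikit-learn's built-in iris dataset:

```python
# Minimal sketch of checking generalization: train on one split of the
# data, then score the model on data it never saw.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:",  model.score(X_test, y_test))  # the number that matters
```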
Trust and transparency - Data Quality in AI
Data quality is closely tied to the trustworthiness and transparency of AI systems. Stakeholders must have confidence in the data used and the processes involved. Transparent data practices, along with data quality assurance, help build trust and foster accountability.
Data governance and compliance - Data Quality in AI
Proper data quality measures are essential for maintaining data governance and compliance with regulatory requirements. Organizations must ensure that the data used in AI systems adheres to privacy, security, and legal standards.
data lifecycle
collection, storage, processing, analysis, sharing, retention, and disposal
Machine learning
uses various mathematical algorithms to get insights from data and make predictions
Deep learning
uses a specific type of algorithm called a neural network to find associations between a set of inputs and outputs. Deep learning becomes more efficient as the amount of data increases.
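A minimal sketch of a neural network in PyTorch (the sizes are illustrative): stacked layers of learned weights, with non-linearities in between, map inputs to outputs.

```python
# Minimal neural-network sketch: layers of learned weights that
# associate a set of inputs with a set of outputs.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 16),  # 4 input features -> hidden layer
    nn.ReLU(),         # non-linearity lets the net learn complex patterns
    nn.Linear(16, 3),  # hidden layer -> 3 output classes
)

x = torch.randn(1, 4)  # one example with 4 features
print(net(x))          # raw scores ("logits") for the 3 classes
```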
Natural language processing
is a technology that enables machines to take human language as an input and perform actions accordingly.
Large language models
are advanced computer models designed to understand and generate humanlike text.
Computer vision
is technology that enables machines to interpret visual information.