AI Practice Test #2 Flashcards
Neural network
Neural networks consist of layers of nodes (neurons) that process input data, adjusting the weights of connections between nodes through training to recognize patterns and make predictions
Neural networks are composed of multiple layers of interconnected nodes (neurons). These nodes process input data and adjust the weights of the connections between them during the training phase. This process allows the network to learn to recognize patterns and make predictions based on the data.
via - https://aws.amazon.com/what-is/neural-network/
Cloud computing
Cloud computing refers to the on-demand delivery of IT resources and applications via the internet with pay-as-you-go pricing.
Cloud computing, as defined by AWS, is the on-demand delivery of IT resources and applications over the internet with pay-as-you-go pricing. This allows businesses to access computing power, storage, and applications as needed without investing in physical infrastructure.
via - https://aws.amazon.com/what-is-cloud-computing/
Reinforcement learning
Reinforcement learning focuses on an agent learning optimal actions through interactions with the environment and feedback, while supervised learning involves training models on labeled data to make predictions
Reinforcement learning is characterized by an agent that learns to make optimal decisions through interactions with the environment, receiving feedback in the form of rewards or penalties. This feedback helps the agent learn a policy to maximize cumulative rewards. In contrast, supervised learning involves training models using labeled datasets to make predictions or classifications based on the input data.
via - https://aws.amazon.com/what-is/reinforcement-learning/
Feature engineering
Feature engineering for structured data often involves tasks such as normalization and handling missing values, while for unstructured data, it involves tasks such as tokenization and vectorization
Feature engineering for structured data typically includes tasks like normalization, handling missing values, and encoding categorical variables. For unstructured data, such as text or images, feature engineering involves different tasks like tokenization (breaking down text into tokens), vectorization (converting text or images into numerical vectors), and extracting features that can represent the content meaningfully.
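A minimal sketch of both workflows using scikit-learn; the column names, sample records, and documents are hypothetical:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.feature_extraction.text import CountVectorizer

# Structured data: handle missing values, then normalize.
df = pd.DataFrame({"age": [25, None, 40], "income": [50000, 62000, None]})
df = df.fillna(df.mean())                    # impute missing values with column means
scaled = StandardScaler().fit_transform(df)  # normalization (zero mean, unit variance)

# Unstructured data (text): tokenize and vectorize.
docs = ["The cat sat on the mat", "The dog ran in the park"]
vectorizer = CountVectorizer()               # tokenizes text into words
vectors = vectorizer.fit_transform(docs)     # bag-of-words numerical vectors
print(vectorizer.get_feature_names_out())
print(vectors.toarray())
```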
Structured data
Structured data can include numerical and categorical data
Structured data may require less preprocessing than unstructured data.
Unstructured data
Unstructured data includes text, images, audio, et cetera
Unstructured data typically requires more extensive preprocessing.
Self-supervised learning
It works by providing models with vast amounts of raw data that is almost entirely or completely unlabeled; the models then generate the labels themselves.
Foundation models use self-supervised learning to create labels from input data. In self-supervised learning, models are provided with vast amounts of raw, completely unlabeled data, and the models then generate the labels themselves. This means no one has instructed or trained the model with labeled training data sets.
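A minimal sketch of how labels emerge from the raw data itself, using next-word prediction on a hypothetical corpus; each word's training label is simply the word that follows it, so no human annotation is involved:

```python
corpus = "the quick brown fox jumps over the lazy dog".split()

# Build (input, label) training pairs directly from the unlabeled text.
pairs = [(corpus[i], corpus[i + 1]) for i in range(len(corpus) - 1)]
for context, label in pairs[:3]:
    print(f"input={context!r} -> self-generated label={label!r}")
```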
Reinforcement learning
Reinforcement learning is a method in which reward values are attached to the different steps that the algorithm must go through, so the model's goal is to accumulate as many reward points as possible and eventually reach an end goal.
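A minimal Q-learning sketch of this idea; the 5-state corridor environment, reward values, and hyperparameters are all hypothetical:

```python
import random

n_states, goal = 5, 4                        # agent starts at state 0, goal at state 4
q = [[0.0, 0.0] for _ in range(n_states)]    # Q-values for actions 0=left, 1=right
alpha, gamma, epsilon = 0.5, 0.9, 0.1        # learning rate, discount, exploration

for _ in range(500):                         # training episodes
    s = 0
    while s != goal:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        a = random.randrange(2) if random.random() < epsilon else q[s].index(max(q[s]))
        s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s2 == goal else -0.01     # reward points pull the agent toward the end goal
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        s = s2

print([round(max(row), 2) for row in q])     # learned state values rise toward the goal
```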
Supervised learning
In supervised learning, models are supplied with labeled and defined training data to assess for correlations. The sample data specifies both the input and the output for the model. For example, images of handwritten figures are annotated to indicate which number they correspond to. A supervised learning system could recognize the clusters of pixels and shapes associated with each number, given sufficient examples.
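A minimal supervised-learning sketch echoing the fruit example; the two numeric features and the sample measurements are hypothetical:

```python
from sklearn.linear_model import LogisticRegression

# Labeled training data: [weight_g, skin_smoothness] -> fruit name (the label).
X = [[150, 0.9], [170, 0.85], [120, 0.4], [130, 0.45]]
y = ["apple", "apple", "banana", "banana"]

model = LogisticRegression().fit(X, y)
print(model.predict([[160, 0.88]]))   # expected: ['apple']
```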
Data labeling
Data labeling is the process of categorizing input data with its corresponding defined output values. Labeled training data is required for supervised learning. For example, millions of apple and banana images would need to be tagged with the words “apple” or “banana.” Then machine learning applications could use this training data to guess the name of the fruit when given a fruit image.
Unsupervised learning
Unsupervised learning algorithms train on unlabeled data. They scan through new data, trying to establish meaningful connections within the inputs without any predetermined outputs. They can spot patterns and categorize data. For example, unsupervised algorithms could group news articles from different news sites into common categories like sports, crime, etc. They can use natural language processing to comprehend meaning and emotion in the articles.
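A minimal unsupervised-learning sketch: k-means groups unlabeled articles by word usage alone, with no output labels supplied; the sample headlines are hypothetical:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

articles = [
    "Local team wins championship game in overtime",
    "Star striker scores twice in the final match",
    "Police investigate downtown robbery suspects",
    "Court hears evidence in the fraud trial",
]
X = TfidfVectorizer().fit_transform(articles)   # text -> numerical vectors
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)   # e.g. [0 0 1 1]: sports vs. crime groupings the model found on its own
```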
Amazon Q Business
Amazon Q Business is a fully managed, generative-AI powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. It allows end users to receive immediate, permissions-aware responses from enterprise data sources with citations, for use cases such as IT, HR, and benefits help desks.
Amazon Q Business also helps streamline tasks and accelerate problem-solving. You can use Amazon Q Business to create and share task automation applications or perform routine actions like submitting time-off requests and sending meeting invites.
Amazon Q Developer
Amazon Q Developer assists developers and IT professionals with all their tasks—from coding, testing, and upgrading applications, to diagnosing errors, performing security scanning and fixes, and optimizing AWS resources.
Amazon Q in QuickSight
With Amazon Q in QuickSight, customers get a generative BI assistant that allows business analysts to use natural language to build BI dashboards in minutes and easily create visualizations and complex calculations.
Amazon Q in Connect
Amazon Connect is the contact center service from AWS. Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect uses real-time conversation with the customer along with relevant company content to automatically recommend what to say or what actions an agent should take to better assist customers.
SageMaker model cards
SageMaker model cards include information about a model such as its intended use and risk rating, training details and metrics, evaluation results, and observations. AI Service Cards provide transparency about the intended use, limitations, and potential impacts of AWS AI services.
You can use Amazon SageMaker Model Cards to document critical details about your machine learning (ML) models in a single place for streamlined governance and reporting. You can catalog details such as the intended use and risk rating of a model, training details and metrics, evaluation results and observations, and additional call-outs such as considerations, recommendations, and custom information.
AI Service Cards are a form of responsible AI documentation that provides customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for AI services from AWS.
Token
A token is a sequence of characters that a model can interpret or predict as a single unit of meaning
A sequence of characters that a model can interpret or predict as a single unit of meaning. For example, with text models, a token could correspond not just to a word, but also to a part of a word with grammatical meaning (such as “-ed”), a punctuation mark (such as “?”), or a common phrase (such as “a lot”).
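A minimal sketch of splitting text into tokens; the regex rule here is deliberately simplified and hypothetical, whereas real model tokenizers learn subword vocabularies (so pieces like "-ed" can be tokens of their own):

```python
import re

text = "She jumped? A lot happened."
# Treat each word piece and each punctuation mark as a candidate token.
tokens = re.findall(r"\w+|[^\w\s]", text)
print(tokens)   # ['She', 'jumped', '?', 'A', 'lot', 'happened', '.']
```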
Embedding
Embedding is a vector of numerical values that represents condensed information obtained by transforming input into that vector
The process of condensing information by transforming input into a vector of numerical values, known as the embeddings, in order to compare the similarity between different objects by using a shared numerical representation. For example, sentences can be compared to determine the similarity in meaning, images can be compared to determine visual similarity, or text and image can be compared to see if they’re relevant to each other.
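A minimal sketch of comparing embeddings with cosine similarity; the 4-dimensional vectors are hypothetical stand-ins for real embeddings, which a trained model would produce with hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sent_a = [0.9, 0.1, 0.3, 0.7]   # e.g. embedding of "a cat on a mat"
sent_b = [0.8, 0.2, 0.4, 0.6]   # e.g. embedding of "a kitten on a rug"
sent_c = [0.1, 0.9, 0.8, 0.0]   # e.g. embedding of "quarterly sales report"

print(cosine_similarity(sent_a, sent_b))   # high (~0.99): similar meaning
print(cosine_similarity(sent_a, sent_c))   # much lower (~0.29): unrelated
```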
Knowledge Bases for Amazon Bedrock
Use Knowledge Bases for Amazon Bedrock to supply the FM with contextual information from the company's private data using Retrieval Augmented Generation (RAG)
With the comprehensive capabilities of Amazon Bedrock, you can experiment with a variety of top FMs, customize them privately with your data using techniques such as fine-tuning and retrieval-augmented generation (RAG), and create managed agents that execute complex business tasks—from booking travel and processing insurance claims to creating ad campaigns and managing inventory—all without writing any code.
Using Knowledge Bases for Amazon Bedrock, you can provide foundation models with contextual information from your company’s private data for Retrieval Augmented Generation (RAG), enhancing response relevance and accuracy. This fully managed feature handles the entire RAG workflow, eliminating the need for custom data integrations and management.
via - https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html
Retrieval Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization’s internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.
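A minimal RAG sketch: retrieve the most relevant document for a query, then prepend it to the prompt sent to the model. The documents and the keyword-overlap retriever are hypothetical toy stand-ins for a managed pipeline such as Knowledge Bases for Amazon Bedrock:

```python
knowledge_base = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am-5pm EST, Monday through Friday.",
]

def retrieve(query, docs):
    # Toy retriever: score each document by keyword overlap with the query.
    q_words = set(query.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(query):
    context = retrieve(query, knowledge_base)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# The augmented prompt would then be sent to the LLM for generation.
print(build_prompt("What is the refund policy?"))
```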
Reinforcement learning from human feedback (RLHF)
Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).
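A minimal sketch of the reward-modeling step at the heart of RLHF: pairs of (human-preferred, rejected) responses train a scorer whose reward then guides the RL fine-tuning. The single-feature responses and the data are hypothetical; real reward models are neural networks over full response text:

```python
import math, random

# Each response is reduced to one hypothetical feature score;
# humans preferred the first response in every pair.
preferences = [((0.9,), (0.2,)), ((0.7,), (0.3,)), ((0.8,), (0.1,))]

w = 0.0   # single weight of the toy reward model
for _ in range(200):
    chosen, rejected = random.choice(preferences)
    # Bradley-Terry objective: push reward(chosen) above reward(rejected).
    p = 1 / (1 + math.exp(-(w * chosen[0] - w * rejected[0])))
    w += 0.1 * (1 - p) * (chosen[0] - rejected[0])   # gradient step on the preference loss

print(f"reward(preferred)={w * 0.9:.2f}  reward(rejected)={w * 0.2:.2f}")
```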
Small language model (SLM)
A small language model (SLM) is an AI model designed to process and generate human language, with a compact architecture, fewer parameters, and lower computational requirements compared to large language models (LLMs).
A small language model (SLM) optimized for deployment on edge devices is specifically designed to be lightweight, efficient, and capable of running on devices with limited computational resources. Deploying the model directly on the edge device eliminates the need for network communication with a central server, thereby achieving the required low-latency inference needed for real-time IoT applications.
via - https://aws.amazon.com/about-aws/whats-new/2024/05/amazon-bedrock-mistral-small-foundation-model/
Edge device
In computer networking, an edge device is a device that provides an entry point into enterprise or service provider core networks. Examples include routers, routing switches, integrated access devices (IADs), multiplexers, and a variety of metropolitan area network (MAN) and wide area network (WAN) access devices. Edge devices also provide connections into carrier and service provider networks. An edge device that connects a local area network to a high-speed switch or backbone (such as an ATM switch) may be called an edge concentrator.
Central API
A central API with an asynchronous inference endpoint introduces network latency.
Using a central API with asynchronous inference endpoints still involves network communication, which can result in latency.