Domain 4: ML Implementation and Operations Flashcards
_____ is a service that provides a record of actions taken by a user, role, or AWS service. It simplifies compliance audits, security analysis, and operational troubleshooting through event history, which lets you view, search, and download recent AWS account activity.
AWS CloudTrail
Event history: View the most recent account activity across your AWS infrastructure and troubleshoot operational issues.
CloudTrail Insights: Automatic detection of unusual activity in your account.
Data events: Record operations performed on or within a resource, such as Amazon S3 object-level API activity or AWS Lambda function invocations.
Management events: Record management operations performed on the resources in your AWS account, such as creating, configuring, or deleting them.
Key features of CloudTrail
_____ records the API calls made in your AWS account and delivers log files to an Amazon S3 bucket that you specify. These logs include details such as the identity of the API caller, the time of the API call, the source IP address, and the request parameters.
CloudTrail
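To make the log contents concrete, here is a minimal sketch that pulls the caller, time, source IP address, and request parameters out of a CloudTrail log file. It assumes the file has already been downloaded from S3 and decompressed; the sample record and the `summarize_events` helper are illustrative, and every value in the sample is made up.

```python
import json

# Illustrative log content: CloudTrail log files hold a top-level "Records"
# array. Every value below is invented for the example.
SAMPLE_LOG = json.dumps({
    "Records": [
        {
            "eventTime": "2024-01-15T12:34:56Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "CreateBucket",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "203.0.113.10",
            "userIdentity": {
                "type": "IAMUser",
                "arn": "arn:aws:iam::123456789012:user/example",
            },
            "requestParameters": {"bucketName": "example-bucket"},
        }
    ]
})


def summarize_events(log_text):
    """Return who/when/where/what for each record in a CloudTrail log file."""
    records = json.loads(log_text).get("Records", [])
    return [
        {
            "who": record.get("userIdentity", {}).get("arn"),
            "when": record.get("eventTime"),
            "source_ip": record.get("sourceIPAddress"),
            "action": record.get("eventName"),
            "params": record.get("requestParameters"),
        }
        for record in records
    ]


for row in summarize_events(SAMPLE_LOG):
    print(row)
```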
_____ is a monitoring and observability service for AWS cloud resources and the applications you run on AWS. It can monitor AWS resources, such as EC2 instances, Amazon DynamoDB tables, and Lambda functions, and you can collect and access all your performance and operational data in the form of logs and metrics from a single platform.
Amazon CloudWatch
Metrics: Collect and store key metrics, which are variables you can measure for your resources and applications.
Logs: Collect, monitor, and analyze log files from different AWS services.
Alarms: Watch for specific metrics and automatically react to changes.
Events: Respond to state changes in your AWS resources with EventBridge.
Key features of CloudWatch
This service allows you to set alarms and automatically react to changes in your AWS resources, and it also integrates with Amazon SNS to notify you when certain thresholds are breached.
CloudWatch
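As a rough illustration of an alarm that notifies through Amazon SNS, the sketch below builds the arguments for the CloudWatch `put_metric_alarm` call in boto3. The instance ID, topic ARN, threshold, and helper names are assumptions chosen for the example; the alarm watches the standard `CPUUtilization` metric of an EC2 instance.

```python
def build_cpu_alarm(instance_id, sns_topic_arn, threshold_percent=80.0):
    """Build put_metric_alarm arguments for a high-CPU alarm on one instance."""
    return {
        "AlarmName": f"{instance_id}-cpu-high",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,             # seconds per evaluation period
        "EvaluationPeriods": 2,    # breach must persist for two periods
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic notified on ALARM state
    }


def create_alarm(params):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without AWS access
    boto3.client("cloudwatch").put_metric_alarm(**params)


params = build_cpu_alarm(
    "i-0123456789abcdef0",                           # hypothetical instance ID
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # hypothetical topic ARN
)
print(params["AlarmName"])
```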
Enable
Choose events
Specify S3 bucket
Turn on insights
How to get started with CloudTrail monitoring
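The four steps above map roughly onto four boto3 CloudTrail calls. The sketch below only assembles the calls and their arguments; the trail name, bucket names, and helper names are hypothetical. Note that in the API the S3 bucket is supplied when the trail is created, and the bucket needs a policy that lets CloudTrail write to it.

```python
def cloudtrail_setup_calls(trail_name, log_bucket, data_bucket_arn):
    """Return (operation, kwargs) pairs for a basic CloudTrail setup."""
    return [
        # Enable + specify S3 bucket: the trail is created with its bucket.
        ("create_trail", {
            "Name": trail_name,
            "S3BucketName": log_bucket,
            "IsMultiRegionTrail": True,
        }),
        ("start_logging", {"Name": trail_name}),
        # Choose events: management events plus S3 object-level data events.
        ("put_event_selectors", {
            "TrailName": trail_name,
            "EventSelectors": [{
                "ReadWriteType": "All",
                "IncludeManagementEvents": True,
                "DataResources": [{
                    "Type": "AWS::S3::Object",
                    "Values": [f"{data_bucket_arn}/"],
                }],
            }],
        }),
        # Turn on Insights: detect unusual API call rates.
        ("put_insight_selectors", {
            "TrailName": trail_name,
            "InsightSelectors": [{"InsightType": "ApiCallRateInsight"}],
        }),
    ]


def apply_calls(calls):
    """Execute the calls; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    client = boto3.client("cloudtrail")
    for operation, kwargs in calls:
        getattr(client, operation)(**kwargs)


for operation, _ in cloudtrail_setup_calls(
    "ml-audit-trail", "example-trail-logs", "arn:aws:s3:::example-training-data"
):
    print(operation)
```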
Set up metrics
Create alarms
Configure logging
Design dashboard
How to implement monitoring solutions with CloudWatch
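As one possible illustration of the metrics and dashboard steps, the sketch below builds a custom metric datum for `put_metric_data` and a one-widget dashboard body for `put_dashboard`. The namespace `MLApp`, the metric name, and the helper names are assumptions made for the example.

```python
import json


def build_metric_datum(name, value, unit="Count"):
    """Build one entry for the MetricData list of put_metric_data."""
    return {"MetricName": name, "Value": float(value), "Unit": unit}


def build_dashboard_body(namespace, metric_name, region, title):
    """Build the JSON body for a dashboard with a single metric widget."""
    widget = {
        "type": "metric",
        "x": 0, "y": 0, "width": 12, "height": 6,
        "properties": {
            "metrics": [[namespace, metric_name]],
            "period": 300,
            "stat": "Sum",
            "region": region,
            "title": title,
        },
    }
    return json.dumps({"widgets": [widget]})


def publish(namespace, datum, dashboard_name, dashboard_body):
    """Send both requests; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])
    cloudwatch.put_dashboard(DashboardName=dashboard_name, DashboardBody=dashboard_body)


datum = build_metric_datum("PredictionErrors", 3)
body = build_dashboard_body("MLApp", "PredictionErrors", "us-east-1", "Prediction errors")
print(datum)
print(body)
```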
To effectively monitor for errors and anomalies within your machine learning environment, you could set up a combination of _____ and _____.
CloudTrail and CloudWatch
By deploying applications across multiple Availability Zones, you can protect your applications from the failure of a single location.
High Availability
Multi-Region deployments can provide a backup in case of a regional service disruption.
Fault Tolerance
Different regions can serve users from geographically closer endpoints, reducing latency and improving the user experience.
Scalability
For machine learning applications, having data processing and storage close to the data sources can reduce transfer times and comply with data sovereignty laws.
Data Locality
One or more discrete data centers within a Region, with redundant power, networking, and connectivity. Each is physically separated by a meaningful distance, many kilometers, from any other one in the same Region.
Availability Zone
You can deploy machine learning models using Amazon EC2 instances configured with _____, which can launch instances across multiple Availability Zones to ensure your application can withstand the loss of an AZ.
Auto Scaling
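A minimal sketch of the request behind such a group, assuming a launch template already exists: the group spans Availability Zones because `VPCZoneIdentifier` lists subnets that live in different zones. The group name, template name, and subnet IDs are hypothetical.

```python
def build_asg_request(name, launch_template_name, subnet_ids, min_size=2, max_size=4):
    """Build create_auto_scaling_group arguments for a multi-subnet group."""
    # A subnet ID does not reveal its zone, so this only checks the count;
    # the caller must pick subnets from different Availability Zones.
    if len(subnet_ids) < 2:
        raise ValueError("provide subnets from at least two Availability Zones")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": min_size,
        "VPCZoneIdentifier": ",".join(subnet_ids),  # comma-separated subnet IDs
    }


def create_group(params):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    boto3.client("autoscaling").create_auto_scaling_group(**params)


request = build_asg_request(
    "ml-inference-asg", "ml-inference-template",
    ["subnet-0aaa1111", "subnet-0bbb2222"],
)
print(request["VPCZoneIdentifier"])
```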
For databases backing machine learning applications, a Multi-AZ deployment with _____ can provide high availability and automatic failover support.
Amazon RDS
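In boto3, a Multi-AZ deployment comes down to one flag on `create_db_instance`. The sketch below builds such a request; the identifier, engine, instance class, storage size, and username are illustrative assumptions, not recommendations.

```python
def build_multi_az_db_request(identifier, instance_class="db.m5.large"):
    """Build create_db_instance arguments for a Multi-AZ PostgreSQL instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "DBInstanceClass": instance_class,
        "AllocatedStorage": 100,           # GiB
        "MasterUsername": "mladmin",       # hypothetical admin user
        "ManageMasterUserPassword": True,  # let RDS keep the password in Secrets Manager
        "MultiAZ": True,                   # standby in another AZ with automatic failover
    }


def create_database(params):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    boto3.client("rds").create_db_instance(**params)


print(build_multi_az_db_request("ml-feature-store"))
```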
Deploying applications _____ can protect against regional outages and provide geographic redundancy.
across multiple AWS Regions
_____ allows you to replicate data between distant AWS Regions.
S3 cross-region replication (CRR)
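A sketch of a replication rule, under the assumption that versioning is already enabled on both buckets and that an IAM role allowing S3 to replicate objects exists. The ARNs, rule ID, and helper names are hypothetical.

```python
def build_replication_config(role_arn, destination_bucket_arn, prefix=""):
    """Build the ReplicationConfiguration argument for put_bucket_replication."""
    return {
        "Role": role_arn,
        "Rules": [{
            "ID": "replicate-to-secondary-region",
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},  # empty prefix matches every object
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": destination_bucket_arn},
        }],
    }


def enable_replication(source_bucket, config):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    boto3.client("s3").put_bucket_replication(
        Bucket=source_bucket, ReplicationConfiguration=config
    )


config = build_replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "arn:aws:s3:::example-models-replica",
)
print(config["Rules"][0]["Destination"])
```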
_____ can route traffic to different Regions based on users' geographic location, which can reduce latency for end users, and its geoproximity routing lets you shift traffic between Regions to balance load.
Amazon Route 53
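One way to send users to a nearby Region is a set of latency-based records, a different routing policy from the geoproximity routing named in the card. The sketch below builds the change batch for `change_resource_record_sets`; the domain, IP addresses, and helper names are made up for the example.

```python
def build_latency_records(domain, endpoints):
    """Build a change batch with one latency-based A record per Region.

    `endpoints` maps an AWS Region name to the IP address serving it.
    """
    changes = []
    for region, ip_address in sorted(endpoints.items()):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "SetIdentifier": f"{region}-endpoint",  # must be unique per record
                "Region": region,                       # marks the record as latency-based
                "TTL": 60,
                "ResourceRecords": [{"Value": ip_address}],
            },
        })
    return {"Changes": changes}


def apply_records(hosted_zone_id, change_batch):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )


batch = build_latency_records(
    "api.example.com.",
    {"us-east-1": "192.0.2.10", "eu-west-1": "198.51.100.20"},
)
print([c["ResourceRecordSet"]["Region"] for c in batch["Changes"]])
```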
Test Failover Mechanisms: Regularly test your failover to ensure that the systems switch to new regions or zones without issues.
Data Synchronization: Keep data synchronized across regions, considering the cost and traffic implications.
Latency: Use services such as Amazon CloudFront to cache data at edge locations and reduce latency.
Compliance and Data Residency: Be aware of compliance requirements and data residency regulations that may impact data storage and transfer.
Cost Management: Consider the additional costs associated with cross-region data transfer and storage.
Best Practices for Multi-Region and Multi-AZ Deployments
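The failover-testing practice can be rehearsed on paper before touching real infrastructure. The sketch below is a toy simulation, not an AWS API: it takes a priority-ordered list of Regions, "fails" each one in turn, and records where traffic would land. The function names and the Region list are assumptions for illustration.

```python
def pick_region(priority, healthy):
    """Return the first healthy Region in priority order."""
    for region in priority:
        if healthy.get(region, False):
            return region
    raise RuntimeError("no healthy Region available")


def failover_drill(priority):
    """Simulate losing each Region in turn and record the failover target."""
    results = {}
    for failed in priority:
        healthy = {region: region != failed for region in priority}
        results[failed] = pick_region(priority, healthy)
    return results


print(failover_drill(["us-east-1", "us-west-2", "eu-west-1"]))
# {'us-east-1': 'us-west-2', 'us-west-2': 'us-east-1', 'eu-west-1': 'us-east-1'}
```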
_____ can be used to package and deploy machine learning applications consistently across different environments. By containerizing a machine learning application, you ensure that it runs the same way regardless of where it is deployed.
Docker
_____ provide a lightweight, standalone, and executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings.
Docker containers
Docker containers can be created and managed using services like _____.
Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), or even directly on EC2 instances
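For the ECS option, a container is described to the service through a task definition. The sketch below builds arguments for `register_task_definition` targeting Fargate; the family name, image URI, role ARN, port, and CPU and memory sizes are assumptions for the example.

```python
def build_task_definition(family, image_uri, execution_role_arn, port=8080):
    """Build register_task_definition arguments for a single-container Fargate task."""
    return {
        "family": family,
        "networkMode": "awsvpc",               # required for Fargate tasks
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "1024",                         # 1 vCPU, passed as a string
        "memory": "2048",                      # MiB, passed as a string
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [{
            "name": "inference",
            "image": image_uri,
            "essential": True,
            "portMappings": [{"containerPort": port, "protocol": "tcp"}],
        }],
    }


def register(params):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3
    boto3.client("ecs").register_task_definition(**params)


task = build_task_definition(
    "ml-inference",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-inference:latest",
    "arn:aws:iam::123456789012:role/example-task-execution-role",
)
print(task["containerDefinitions"][0]["image"])
```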
A text document that contains all the commands a user could call on the command line to assemble an image.
Dockerfile
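To show what those commands typically look like, the sketch below assembles the text of a small Dockerfile for a Python model server. The base image, file names (`requirements.txt`, `serve.py`), and port are assumptions; a real project would adjust each of them.

```python
def render_dockerfile(base_image="python:3.11-slim", entrypoint="serve.py", port=8080):
    """Return Dockerfile text for a simple Python inference service."""
    lines = [
        f"FROM {base_image}",                                  # starting image
        "WORKDIR /app",                                        # working directory in the image
        "COPY requirements.txt .",                             # copy the dependency list first
        "RUN pip install --no-cache-dir -r requirements.txt",  # install dependencies
        "COPY . .",                                            # copy the application code
        f"EXPOSE {port}",                                      # document the listening port
        f'CMD ["python", "{entrypoint}"]',                     # default start command
    ]
    return "\n".join(lines) + "\n"


print(render_dockerfile())
```

Saving the returned text as a file named `Dockerfile` and running `docker build` in that directory would assemble the image.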