AWS Solutions Architect Associate Exam Flashcards

1
Q

What is a NAT gateway used for?

A

You can use a Network Address Translation gateway (NAT gateway) to enable instances in a private subnet to connect to the internet or other AWS services, but prevent the internet from initiating a connection with those instances. To create a NAT gateway, you must specify the public subnet in which the NAT gateway should reside.

2
Q

How to configure a NAT gateway?

A

You must also specify an Elastic IP address to associate with the NAT gateway when you create it. The Elastic IP address cannot be changed after you associate it with the NAT Gateway. After you’ve created a NAT gateway, you must update the route table associated with one or more of your private subnets to point internet-bound traffic to the NAT gateway. This enables instances in your private subnets to communicate with the internet. If you no longer need a NAT gateway, you can delete it. Deleting a NAT gateway disassociates its Elastic IP address, but does not release the address from your account.
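A minimal boto3 sketch of that flow, assuming default credentials and region; the subnet ID, route table ID, and the allocated Elastic IP are placeholders:

    import boto3

    ec2 = boto3.client("ec2")

    # Allocate an Elastic IP and create the NAT gateway in a PUBLIC subnet.
    # "subnet-0public" and "rtb-0private" are placeholder IDs for illustration.
    eip = ec2.allocate_address(Domain="vpc")
    nat = ec2.create_nat_gateway(
        SubnetId="subnet-0public",
        AllocationId=eip["AllocationId"],
    )
    nat_gw_id = nat["NatGateway"]["NatGatewayId"]

    # Wait until the NAT gateway is available before routing traffic to it.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_gw_id])

    # Point internet-bound traffic from the private subnet's route table at it.
    ec2.create_route(
        RouteTableId="rtb-0private",
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_gw_id,
    )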

3
Q

How do NAT gateways work in AZs?

A

Each NAT gateway is created in a specific Availability Zone and implemented with redundancy in that zone.

If you have resources in multiple Availability Zones and they share one NAT gateway, and if the NAT gateway’s Availability Zone is down, resources in the other Availability Zones lose internet access. To create an Availability Zone-independent architecture, create a NAT gateway in each Availability Zone and configure your routing to ensure that resources use the NAT gateway in the same Availability Zone.

4
Q

How does instance placement tenancy work when you have a launch configuration in a VPC?

A

So, when you’re setting up your launch configuration, if you:

Do nothing about tenancy: Your instance will follow the VPC’s rules. If the VPC is like a neighborhood that’s all private houses (dedicated), then your instance will also be a private house.

If the VPC is like a big apartment complex (default), then your instance will be like an apartment.

Choose “dedicated” tenancy: You’re specifically asking for a private house, no matter what the neighborhood (VPC) usually does.

5
Q

What is instance placement tenancy?

A

When you’re using Amazon’s cloud to create virtual computers (instances), you can also decide how these computers are physically hosted in Amazon’s data centers. This decision is about “instance placement tenancy,” which can sound a bit complicated, but it’s essentially about choosing between two main options for your virtual computer’s physical “neighborhood”:

Shared Housing (default): By default, your virtual computer shares physical hardware with other virtual computers owned by different people. It’s like renting an apartment in a big building where you have your own space, but the building itself is shared with others. This is the most common setup and works well for most needs.

Private House (dedicated): If you want, you can choose to have your virtual computer on its own dedicated physical hardware. This is like having a private house instead of an apartment. No other virtual computers, except yours, will be hosted on this physical machine. This option is used for special situations that require isolation from other users’ computers, often for added security or to meet specific regulatory requirements.

The choice between these two options is controlled in two places:

Launch Configuration: When you’re preparing your recipe (launch configuration) for your virtual computer, you can specify your preference for this physical hosting. If you don’t say anything, Amazon uses the default setting from your virtual private cloud (VPC).

Virtual Private Cloud (VPC): Your VPC settings can also influence this choice. If your VPC is set to dedicated tenancy, any virtual computer created in this VPC will automatically be set up in its own private house, unless specifically overridden.

So, when you’re setting up your launch configuration, if you:

Do nothing about tenancy: Your instance will follow the VPC’s rules. If the VPC is like a neighborhood that’s all private houses (dedicated), then your instance will also be a private house. If the VPC is like a big apartment complex (default), then your instance will be like an apartment.
Choose “dedicated” tenancy: You’re specifically asking for a private house, no matter what the neighborhood (VPC) usually does (a sketch of this follows below).
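As an illustration, a hedged boto3 sketch of a launch configuration that explicitly requests dedicated tenancy; the name, AMI ID, and instance type are placeholders. Omitting PlacementTenancy would make instances inherit the VPC's tenancy setting:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # PlacementTenancy="dedicated" forces single-tenant hardware regardless of
    # the VPC's default tenancy. Leave it out to inherit the VPC setting.
    autoscaling.create_launch_configuration(
        LaunchConfigurationName="my-dedicated-lc",
        ImageId="ami-0123456789abcdef0",
        InstanceType="m5.large",
        PlacementTenancy="dedicated",
    )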

6
Q

What is Amazon Aurora Serverless?

A

Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora (MySQL-compatible and PostgreSQL-compatible editions), where the database will automatically start up, shut down, and scale capacity up or down based on your application’s needs. It enables you to run your database in the cloud without managing any database instances. It’s a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads. You pay on a per-second basis for the database capacity you use when the database is active, and you can migrate between standard and serverless configurations with a few clicks in the Amazon RDS Management Console.

7
Q

What is Amazon DynamoDB?

A

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It’s a fully managed, multi-region, multi-master, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second.

8
Q

What is Amazon ElastiCache?

A

Amazon ElastiCache allows you to set up popular open-source-compatible in-memory data stores in the cloud. You can build data-intensive apps or boost the performance of your existing databases by retrieving data from high-throughput, low-latency in-memory data stores such as Redis and Memcached. ElastiCache is used as a caching layer; it is not a fully managed MySQL database.

9
Q

How can you remove corrupted data from an Amazon DynamoDB table as quickly as possible?

A

Amazon DynamoDB enables you to back up your table data continuously by using point-in-time recovery (PITR). When you enable PITR, DynamoDB backs up your table data automatically with per-second granularity so that you can restore to any given second in the preceding 35 days.

PITR helps protect you against accidental writes and deletes. For example, if a test script writes accidentally to a production DynamoDB table or someone mistakenly issues a “DeleteItem” call, PITR has you covered.
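A small boto3 sketch of enabling PITR and then restoring to a second just before the corruption happened; the table names and timestamp are placeholders:

    import boto3
    from datetime import datetime, timezone

    dynamodb = boto3.client("dynamodb")

    # Turn on point-in-time recovery for a table ("orders" is a placeholder name).
    dynamodb.update_continuous_backups(
        TableName="orders",
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )

    # After an accidental write or delete, restore to a NEW table at a second
    # just before the corruption happened (the timestamp is illustrative).
    dynamodb.restore_table_to_point_in_time(
        SourceTableName="orders",
        TargetTableName="orders-restored",
        RestoreDateTime=datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc),
    )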

10
Q

What does a recovered EC2 instance have in common with the original impaired instance?

A

A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. If the impaired instance is in a placement group, the recovered instance runs in the placement group. If your instance has a public IPv4 address, it retains the public IPv4 address after recovery. During instance recovery, the instance is migrated during an instance reboot, and any data that is in-memory is lost.

11
Q

How do security groups work in AWS?

A

Security groups are stateful, so allowing inbound traffic on the necessary ports is enough to enable the connection; the corresponding return traffic is automatically allowed back out, regardless of outbound rules.

12
Q

How do network ACLs (network access control lists) work for the subnet of an EC2 instance?

A

Network ACLs are stateless, so you must allow both inbound and outbound traffic.

To enable the connection to a service running on an instance, the associated network ACL must allow both inbound traffic on the port that the service is listening on as well as allow outbound traffic from ephemeral ports. When a client connects to a server, a random port from the ephemeral port range (1024-65535) becomes the client’s source port.

The designated ephemeral port then becomes the destination port for return traffic from the service, so outbound traffic from the ephemeral port must be allowed in the network ACL.

By default, network ACLs allow all inbound and outbound traffic. If your network ACL is more restrictive, then you need to explicitly allow traffic from the ephemeral port range.

If you accept traffic from the internet, then you also must establish a route through an internet gateway. If you accept traffic over VPN or AWS Direct Connect, then you must establish a route through a virtual private gateway (VGW).
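For illustration, a boto3 sketch that adds the matching inbound service rule and outbound ephemeral-port rule to a network ACL; the ACL ID, rule numbers, and CIDR are placeholders:

    import boto3

    ec2 = boto3.client("ec2")
    ACL_ID = "acl-0123456789abcdef0"  # placeholder network ACL ID

    # Inbound rule: allow HTTPS (443) to the instances in the subnet.
    ec2.create_network_acl_entry(
        NetworkAclId=ACL_ID,
        RuleNumber=100,
        Protocol="6",              # TCP
        RuleAction="allow",
        Egress=False,              # inbound
        CidrBlock="0.0.0.0/0",
        PortRange={"From": 443, "To": 443},
    )

    # Outbound rule: allow return traffic on the ephemeral port range, because
    # network ACLs are stateless and will not allow it automatically.
    ec2.create_network_acl_entry(
        NetworkAclId=ACL_ID,
        RuleNumber=100,
        Protocol="6",
        RuleAction="allow",
        Egress=True,               # outbound
        CidrBlock="0.0.0.0/0",
        PortRange={"From": 1024, "To": 65535},
    )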

13
Q

What is scale up in vertical scalability?

A

Vertical scalability means increasing the size of the instance. For example, your application runs on a t2.micro. Scaling up that application vertically means running it on a larger instance such as a t2.large. Scaling down that application vertically means running it on a smaller instance such as a t2.nano. Vertical scalability is very common for non-distributed systems, such as a database, and there is usually a limit to how far you can vertically scale (a hardware limit). Upgrading an instance type from a t2.nano to a u-12tb1.metal, for example, is scale up, i.e. vertical scalability.

14
Q

What is scale up in horizontal scalability?

A

Horizontal scalability means increasing the number of instances/systems for your application. When you increase the number of instances, it’s called scale out, whereas if you decrease the number of instances, it’s called scale in. The term scale up belongs to vertical scaling, not horizontal scaling, so “scale up in horizontal scalability” is a misnomer.

15
Q

What does high availability mean?

A

High availability means running your application/system in at least 2 data centers (== Availability Zones). The goal of high availability is to survive a data center loss. An example of high availability is running instances for the same application across multiple Availability Zones.

16
Q

What is the best relational database in AWS in terms of scalability with high fault tolerance?

A

Aurora features a distributed, fault-tolerant, and self-healing storage system that is decoupled from compute resources and auto-scales up to 128 TiB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon Simple Storage Service (Amazon S3), and replication across three Availability Zones (AZs).

Since Amazon Aurora Replicas share the same data volume as the primary instance in the same AWS Region, there is virtually no replication lag. The replica lag times are in the tens of milliseconds (compared to the replication lag of seconds in the case of MySQL read replicas). Aurora is therefore the right option when read replicas must lag no more than about 1 second behind the primary instance.

17
Q

Which AWS offering can be used to decouple a monolithic architecture and handle messaging between microservices?

A

Amazon Simple Queue Service (Amazon SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. Amazon SQS eliminates the complexity and overhead associated with managing and operating message-oriented middleware and empowers developers to focus on differentiating work. Using SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.

Use Amazon SQS to transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be available. Amazon SQS lets you decouple application components so that they run and fail independently, increasing the overall fault tolerance of the system. Multiple copies of every message are stored redundantly across multiple availability zones so that they are available whenever needed. Being able to store the messages and replay them is a very important feature in decoupling the system architecture, as is needed in the current use case.

18
Q

Why can't we use Amazon EventBridge for async messaging?

A

This event-based service is extremely useful for connecting non-AWS SaaS (Software as a Service) services to AWS services. With Amazon EventBridge, the downstream application needs to process events immediately as they arrive, which makes it a tightly coupled scenario. Hence, it is not a good fit for asynchronous, decoupled messaging.

19
Q

What is a good AWS relational database solution to minimize data loss and store every transaction on at least two nodes?

A

Set up an Amazon RDS MySQL DB instance with Multi-AZ functionality enabled to synchronously replicate the data

20
Q

Can you route traffic to 3rd party websites with alias record?

A

Alias records let you route traffic to selected AWS resources, such as Amazon CloudFront distributions and Amazon S3 buckets. They also let you route traffic from one record in a hosted zone to another record. 3rd party websites do not qualify for these as we have no control over those. ‘Alias record’ cannot be used to map one domain name to another.

21
Q

How does CNAME record work?

A

A CNAME record maps DNS queries for the name of the current record, such as acme.example.com, to another domain (example.com or example.net) or subdomain (acme.example.com or zenith.example.org).

CNAME records can be used to map one domain name to another. Although you should keep in mind that the DNS protocol does not allow you to create a CNAME record for the top node of a DNS namespace, also known as the zone apex. For example, if you register the DNS name example.com, the zone apex is example.com. You cannot create a CNAME record for example.com, but you can create CNAME records for www.example.com, newproduct.example.com, and so on.

Imagine you own a domain myshop.online, and you have created various online platforms for different purposes, such as a blog (blog.myshop.online), a support section (support.myshop.online), and a store (store.myshop.online). Now, you want all these services to be hosted on another domain you have, say, services.digital.

A CNAME (Canonical Name) record comes into play here. It acts like a redirect or alias from one domain to a different domain or subdomain. So, you can set up CNAME records for each of your subdomains to point to services.digital or its respective subdomains, like this:

A CNAME record for blog.myshop.online points to blog.services.digital.
A CNAME record for support.myshop.online points to support.services.digital.
A CNAME record for store.myshop.online points to store.services.digital.
This setup means whenever someone visits blog.myshop.online, the DNS system will see the CNAME record and take them to blog.services.digital instead, without the visitor seeing the change in the domain name.

However, there’s a limitation. If you wanted to point your main domain myshop.online directly to another domain using a CNAME record, you wouldn’t be allowed. The DNS standards prevent you from creating a CNAME record for the domain’s root level, also known as the zone apex. So, myshop.online cannot have a CNAME record, but any subdomains like www.myshop.online can.
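A hedged boto3 sketch of creating one of those CNAME records in Amazon Route 53; the hosted zone ID is a placeholder, and the domain names are the illustrative ones from above:

    import boto3

    route53 = boto3.client("route53")

    # Create (or update) a CNAME for a subdomain. Note that this call would
    # fail for the zone apex (myshop.online itself), per the rule above.
    route53.change_resource_record_sets(
        HostedZoneId="Z0123456789ABCDEFGHIJ",
        ChangeBatch={
            "Changes": [
                {
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "blog.myshop.online",
                        "Type": "CNAME",
                        "TTL": 300,
                        "ResourceRecords": [{"Value": "blog.services.digital"}],
                    },
                }
            ]
        },
    )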

22
Q

What is good about Amazon Aurora Global Database?

A

An Aurora global database provides more comprehensive failover capabilities than the failover provided by a default Aurora DB cluster. By using an Aurora global database, you can plan for and recover from disaster fairly quickly. Recovery from disaster is typically measured using values for RTO and RPO.

Recovery time objective (RTO) – The time it takes a system to return to a working state after a disaster. In other words, RTO measures downtime. For an Aurora global database, RTO can be in the order of minutes.

Recovery point objective (RPO) – The amount of data that can be lost (measured in time). For an Aurora global database, RPO is typically measured in seconds.

With an Aurora global database, you can choose from two different approaches to failover:

Managed planned failover – This feature is intended for controlled environments, such as disaster recovery (DR) testing scenarios, operational maintenance, and other planned operational procedures. Managed planned failover allows you to relocate the primary DB cluster of your Aurora global database to one of the secondary Regions. Because this feature synchronizes secondary DB clusters with the primary before making any other changes, RPO is 0 (no data loss).

Unplanned failover (“detach and promote”) – To recover from an unplanned outage, you can perform a cross-Region failover to one of the secondaries in your Aurora global database. The RTO for this manual process depends on how quickly you can perform the tasks listed in Recovering an Amazon Aurora global database from an unplanned outage. The RPO is typically measured in seconds, but this depends on the Aurora storage replication lag across the network at the time of the failure.

23
Q

How can you reduce the cost of Amazon SQS?

A

Amazon SQS provides short polling and long polling to receive messages from a queue. By default, queues use short polling. With short polling, Amazon SQS sends the response right away, even if the query found no messages. With long polling, Amazon SQS sends a response after it collects at least one available message, up to the maximum number of messages specified in the request. Amazon SQS sends an empty response only if the polling wait time expires.

Long polling makes it inexpensive to retrieve messages from your Amazon SQS queue as soon as the messages are available. Using long polling can reduce the cost of using SQS because you can reduce the number of empty receives.
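A short boto3 sketch of a long-polling consumer; the queue URL is a placeholder, and WaitTimeSeconds=20 is what turns the call into a long poll:

    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

    # WaitTimeSeconds > 0 makes this a long poll: SQS holds the request open
    # for up to 20 seconds until a message arrives, reducing empty receives.
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
    )
    for message in response.get("Messages", []):
        print(message["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])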

24
Q

What does Amazon GuardDuty do?

A

Amazon GuardDuty offers threat detection that enables you to continuously monitor and protect your AWS accounts, workloads, and data stored in Amazon S3. GuardDuty analyzes continuous streams of meta-data generated from your account and network activity found in AWS CloudTrail Events, Amazon VPC Flow Logs, and DNS Logs. It also uses integrated threat intelligence such as known malicious IP addresses, anomaly detection, and machine learning to identify threats more accurately.

25
Q

What is Amazon Macie?

A

Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data on Amazon S3. Macie automatically detects a large and growing list of sensitive data types, including personally identifiable information (PII) such as names, addresses, and credit card numbers. It also gives you constant visibility of the data security and data privacy of your data stored in Amazon S3.

26
Q

What is Cloud Formation stack?

A

An AWS CloudFormation stack is a set of AWS resources that are created and managed as a single unit when AWS CloudFormation instantiates a template. A single stack cannot be used to deploy the same template across multiple AWS accounts and regions; that is what stack sets are for.

27
Q

What is Cloud Formation template?

A

An AWS CloudFormation template is a JSON- or YAML-format, text-based file that describes all the AWS resources you need to deploy to run your application. A template acts as a blueprint for a stack.
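To make the stack/template relationship concrete, here is a hedged sketch that embeds a tiny YAML template in Python and asks CloudFormation to instantiate it as a stack; the stack and bucket names are placeholders:

    import boto3

    # A tiny illustrative YAML template (the bucket name is a placeholder and
    # must be globally unique).
    TEMPLATE = """
    AWSTemplateFormatVersion: '2010-09-09'
    Description: Minimal example template
    Resources:
      FlashcardsBucket:
        Type: AWS::S3::Bucket
        Properties:
          BucketName: my-example-flashcards-bucket-123456
    """

    cloudformation = boto3.client("cloudformation")

    # Instantiating the template creates a stack: every resource in it is
    # created, updated, and deleted together as one unit.
    cloudformation.create_stack(StackName="flashcards-demo", TemplateBody=TEMPLATE)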

28
Q

What is AWS Resource Access Manager (RAM)?

A

AWS Resource Access Manager (AWS RAM) is a service that enables you to easily and securely share AWS resources with any AWS account or within your AWS Organization.

29
Q

What is AWS Cloud Formation StackSet?

A

AWS CloudFormation StackSet extends the functionality of stacks by enabling you to create, update, or delete stacks across multiple accounts and regions with a single operation. A stack set lets you create stacks in AWS accounts across regions by using a single AWS CloudFormation template. Using an administrator account of an “AWS Organization”, you define and manage an AWS CloudFormation template, and use the template as the basis for provisioning stacks into selected target accounts of an “AWS Organization” across specified regions.

30
Q

What is AWS Database Migration Service?

A

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

31
Q

What is AWS Glue?

A

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. AWS Glue job is meant to be used for batch ETL data processing.

32
Q

What is AWS EMR?

A

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run Petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. Amazon EMR uses Hadoop, an open-source framework, to distribute your data and processing across a resizable cluster of Amazon EC2 instances.

33
Q

What is Amazon Kinesis Data Streams?

A

Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources such as website clickstreams, database event streams, financial transactions, social media feeds, IT logs, and location-tracking events.
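A minimal boto3 sketch of writing one event to a stream; the stream name, partition key, and payload are placeholders:

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    # Send one clickstream event to a stream ("clickstream" is a placeholder).
    # Records with the same partition key land on the same shard, preserving
    # their ordering.
    kinesis.put_record(
        StreamName="clickstream",
        Data=json.dumps({"user_id": "u-42", "action": "page_view"}).encode("utf-8"),
        PartitionKey="u-42",
    )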

34
Q

What is the fully managed NoSQL persistent data store with in-memory caching in AWS?

A

Amazon DynamoDB

35
Q

What is Amazon’s fully managed document database service that supports MongoDB workloads?

A

DocumentDB. Amazon DocumentDB is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data.

36
Q

What is Amazon’s standard relational database offering?

A

Amazon RDS. Amazon RDS makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups.

37
Q

What is AWS Schema Conversion Tool?

A

The AWS Schema Conversion Tool is used for heterogeneous migrations, where the source and target database engines are different, as in Oracle to Amazon Aurora, Oracle to PostgreSQL, or Microsoft SQL Server to MySQL migrations. In these cases, the schema structure, data types, and database code of the source and target databases can be quite different, requiring a schema and code transformation before the data migration starts.

That makes heterogeneous migrations a two-step process. First use the AWS Schema Conversion Tool to convert the source schema and code to match that of the target database, and then use the AWS Database Migration Service to migrate data from the source database to the target database. All the required data type conversions will automatically be done by the AWS Database Migration Service during the migration. The source database can be located on your on-premises environment outside of AWS, running on an Amazon EC2 instance, or it can be an Amazon RDS database. The target can be a database in Amazon EC2 or Amazon RDS.

38
Q

What is AWS Snowball Edge?

A

AWS Snowball Edge Storage Optimized is the optimal choice if you need to securely and quickly transfer dozens of terabytes to petabytes of data to AWS. It provides up to 80 TB of usable HDD storage, 40 vCPUs, 1 TB of SATA SSD storage, and up to 40 Gb network connectivity to address large-scale data transfer and pre-processing use cases. Because each Snowball Edge Storage Optimized device can handle 80 TB of data, you can order multiple devices in parallel for larger transfers (for example, 10 devices for roughly 800 TB). The original Snowball devices were transitioned out of service, and AWS Snowball Edge Storage Optimized devices are now the primary devices used for data transfer. You may see the Snowball device on the exam; just remember that the original Snowball device had 80 TB of storage space. AWS Snowball Edge cannot be used for database migrations.

39
Q

What does Amazon S3 Transfer Acceleration do? (Amazon S3TA)

A

Amazon S3 Transfer Acceleration (S3TA) can speed up content transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects. Customers who have either web or mobile applications with widespread users or applications hosted far away from their S3 bucket can experience long and variable upload and download speeds over the Internet. S3 Transfer Acceleration (S3TA) reduces the variability in Internet routing, congestion, and speeds that can affect transfers, and logically shortens the distance to S3 for remote applications.

S3TA improves transfer performance by routing traffic through Amazon CloudFront’s globally distributed Edge Locations and over AWS backbone networks, and by using network protocol optimizations.

40
Q

How would you postpone the delivery of new messages to Amazon SQS?

A

Use a delay queue. Delay queues let you postpone the delivery of new messages to consumers for a number of seconds (0 to 900 seconds, i.e. up to 15 minutes) after they are added to the queue.
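A small boto3 sketch, with a placeholder queue name, that creates a delay queue and notes the per-message alternative:

    import boto3

    sqs = boto3.client("sqs")

    # Create a delay queue: every new message stays invisible to consumers
    # for 60 seconds after it is sent (DelaySeconds can be 0-900).
    sqs.create_queue(
        QueueName="orders-delay-queue",
        Attributes={"DelaySeconds": "60"},
    )

    # To postpone a single message instead of the whole queue, set DelaySeconds
    # on the send:
    # sqs.send_message(QueueUrl=queue_url, MessageBody="...", DelaySeconds=120)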

41
Q

What is Connection Draining feature of Elastic Load Balancing?

A

To ensure that Elastic Load Balancing stops sending requests to instances that are de-registering or unhealthy while keeping the existing connections open, use connection draining. This enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy. The maximum timeout value can be set between 1 and 3,600 seconds (the default is 300 seconds). When the maximum time limit is reached, the load balancer forcibly closes connections to the de-registering instance.

42
Q

What is Sticky Sessions feature of Elastic Load Balancing?

A

You can use the sticky session feature (also known as session affinity) to enable the load balancer to bind a user’s session to a specific instance. This ensures that all requests from the user during the session are sent to the same instance. Sticky sessions cannot be used to complete in-flight requests made to instances that are de-registering or unhealthy.

43
Q

What is Idle Timeout feature of Elastic Load Balancing?

A

For each request that a client makes through Elastic Load Balancing, the load balancer maintains two connections. The front-end connection is between the client and the load balancer. The back-end connection is between the load balancer and a registered Amazon EC2 instance. The load balancer has a configured “idle timeout” period that applies to its connections. If no data has been sent or received by the time that the “idle timeout” period elapses, the load balancer closes the connection. “Idle timeout” can not be used to complete in-flight requests made to instances that are de-registering or unhealthy.

44
Q

Tell me about AWS Snowball Edge Storage Optimized device?

A

AWS Snowball Edge Storage Optimized is the optimal choice if you need to securely and quickly transfer dozens of terabytes to petabytes of data to AWS. It provides up to 80 TB of usable HDD storage, 40 vCPUs, 1 TB of SATA SSD storage, and up to 40 Gb network connectivity to address large-scale data transfer and pre-processing use cases. Because each device can handle 80 TB of data, you can order multiple devices in parallel for larger transfers. The original Snowball devices were transitioned out of service, and Snowball Edge Storage Optimized devices are now the primary devices used for data transfer. You may see the Snowball device on the exam; just remember that the original Snowball device had 80 TB of storage space.

AWS Snowball Edge is suitable for offline data transfers, for customers who are bandwidth constrained or transferring data from remote, disconnected, or austere environments. Therefore, it cannot support automated and accelerated online data transfers.

45
Q

WTF is AWS Transfer Family?

A

The AWS Transfer Family provides fully managed support for file transfers directly into and out of Amazon S3 and Amazon EFS. Therefore, it cannot be used to migrate data into other AWS storage services, such as Amazon FSx for Windows File Server.

46
Q

What does File Gateway do?

A

AWS Storage Gateway’s file interface, or file gateway, offers you a seamless way to connect to the cloud to store application data files and backup images as durable objects on Amazon S3 cloud storage. File gateway offers SMB or NFS-based access to data in Amazon S3 with local caching. It can be used for on-premises applications, and for Amazon EC2-based applications that need file protocol access to S3 object storage. Therefore, it cannot be used to migrate data into other AWS storage services, such as Amazon EFS or Amazon FSx for Windows File Server.

47
Q

What is AWS DataSync about?

A

AWS DataSync is an online data transfer service that simplifies, automates, and accelerates copying large amounts of data to and from AWS storage services over the internet or AWS Direct Connect.

AWS DataSync fully automates and accelerates moving large active datasets to AWS, up to 10 times faster than command-line tools. It is natively integrated with Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, Amazon CloudWatch, and AWS CloudTrail, which provides seamless and secure access to your storage services, as well as detailed monitoring of the transfer.

AWS DataSync uses a purpose-built network protocol and scale out architecture to transfer data. A single DataSync agent is capable of saturating a 10 Gbps network link.

AWS DataSync fully automates the data transfer. It comes with retry and network resiliency mechanisms, network optimizations, built-in task scheduling, monitoring via the DataSync API and Console, and Amazon CloudWatch metrics, events, and logs that provide granular visibility into the transfer process. AWS DataSync performs data integrity verification both during the transfer and at the end of the transfer.

48
Q

How does VPC sharing work?

A

VPC sharing (part of Resource Access Manager) allows multiple AWS accounts to create their application resources such as Amazon EC2 instances, Amazon RDS databases, Amazon Redshift clusters, and AWS Lambda functions, into shared and centrally-managed Amazon Virtual Private Clouds (VPCs). To set this up, the account that owns the VPC (owner) shares one or more subnets with other accounts (participants) that belong to the same organization from AWS Organizations. After a subnet is shared, the participants can view, create, modify, and delete their application resources in the subnets shared with them. Participants cannot view, modify, or delete resources that belong to other participants or the VPC owner.

You can share Amazon VPCs to leverage the implicit routing within a VPC for applications that require a high degree of interconnectivity and are within the same trust boundaries. This reduces the number of VPCs that you create and manage while using separate accounts for billing and access control.

49
Q

How to store S3 objects cost-effectively?

A

To manage your S3 objects so they are stored cost-effectively throughout their lifecycle, configure their Amazon S3 Lifecycle. An S3 Lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects (a sketch follows after the two action types below). There are two types of actions:

Transition actions — Define when objects transition to another storage class. For example, you might choose to transition objects to the S3 Standard-IA storage class 30 days after you created them, or archive objects to the S3 Glacier storage class one year after creating them.

Expiration actions — Define when objects expire. Amazon S3 deletes expired objects on your behalf.
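The sketch referenced above: a boto3 lifecycle configuration with one transition action and one expiration action; the bucket name and prefix are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # Transition objects under logs/ to Standard-IA after 30 days and expire
    # them after 365 days.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-example-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "archive-then-expire-logs",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "logs/"},
                    "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                    "Expiration": {"Days": 365},
                }
            ]
        },
    )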

50
Q

How does a security group work?

A

A security group acts as a virtual firewall that controls the traffic for one or more instances. When you launch an instance, you can specify one or more security groups; otherwise, we use the default security group. You can add rules to each security group that allow traffic to or from its associated instances. You can modify the rules for a security group at any time; the new rules are automatically applied to all instances that are associated with the security group. When deciding to allow traffic to reach an instance, all the rules from all the security groups that are associated with the instance are evaluated.

The following are the characteristics of security group rules:
1. By default, security groups allow all outbound traffic.
2. Security group rules are always permissive; you can’t create rules that deny access.
3. Security groups are stateful.
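A short boto3 sketch of adding an inbound rule; the security group ID is a placeholder, and no outbound rule is needed for the return traffic because the group is stateful:

    import boto3

    ec2 = boto3.client("ec2")

    # Allow inbound HTTPS from anywhere. Because security groups are stateful,
    # the return traffic is allowed automatically.
    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[
            {
                "IpProtocol": "tcp",
                "FromPort": 443,
                "ToPort": 443,
                "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS from anywhere"}],
            }
        ],
    )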

51
Q

What is Amazon SageMaker?

A

Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML. Powerful as it is, SageMaker is not a Docker orchestration service.

52
Q

What is Amazon MQ?

A

Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud. Message brokers allow different software systems–often using different programming languages, and on different platforms–to communicate and exchange information. If an organization is using messaging with existing applications and wants to move the messaging service to the cloud quickly and easily, AWS recommends Amazon MQ for such a use case. Connecting your current applications to Amazon MQ is easy because it uses industry-standard APIs and protocols for messaging, including JMS, NMS, AMQP, STOMP, MQTT, and WebSocket.

Amazon SNS, Amazon SQS, and Amazon Kinesis are AWS’s proprietary technologies and do not come with MQTT compatibility.

53
Q

Can you do caching with AWS Lambda?

A

AWS Lambda has no native in-memory caching capability. AWS Lambda is a serverless compute service.

54
Q

What is Amazon API Gateway Caching?

A

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the “front door” for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.

You can enable API caching in Amazon API Gateway to cache your endpoint’s responses. With caching, you can reduce the number of calls made to your endpoint and also improve the latency of requests to your API. When you enable caching for a stage, API Gateway caches responses from your endpoint for a specified time-to-live (TTL) period, in seconds. API Gateway then responds to requests by looking up the endpoint response from the cache instead of calling your endpoint. The default TTL value for API caching is 300 seconds, the maximum TTL value is 3,600 seconds, and TTL=0 means caching is disabled. API Gateway caching is a good fit when the application can tolerate serving somewhat stale responses.
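A hedged sketch of enabling stage-level caching with boto3; the REST API ID, stage name, cache size, and TTL are placeholders to adapt to your API:

    import boto3

    apigateway = boto3.client("apigateway")

    # Enable a 0.5 GB cache cluster on a deployed stage and set a 1-hour TTL
    # for all methods (the "/*/*" path applies the setting to every method).
    apigateway.update_stage(
        restApiId="a1b2c3d4e5",
        stageName="prod",
        patchOperations=[
            {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
            {"op": "replace", "path": "/cacheClusterSize", "value": "0.5"},
            {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": "3600"},
        ],
    )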

55
Q

What is Warm Standby for disaster recovery?

A

The term warm standby is used to describe a DR scenario in which a scaled-down version of a fully functional environment is always running in the cloud. A warm standby solution extends the pilot light elements and preparation. It further decreases the recovery time because some services are always running. By identifying your business-critical systems, you can fully duplicate these systems on AWS and have them always on. This option is costlier compared to Pilot Light.

56
Q

What is pilot light for disaster recovery?

A

The term pilot light is often used to describe a DR scenario in which a minimal version of an environment is always running in the cloud. The idea of the pilot light is an analogy that comes from the gas heater: a small flame that’s always on can quickly ignite the entire furnace to heat up a house. This scenario is similar to a backup-and-restore scenario. For example, with AWS you can maintain a pilot light by configuring and running the most critical core elements of your system in AWS. A small part of the backup infrastructure is always running and simultaneously syncing mutable data (such as databases or documents) so that there is no loss of critical data. When the time comes for recovery, you can rapidly provision a full-scale production environment around the critical core. For pilot light, RPO/RTO is typically in the tens of minutes.

57
Q

What is backup and restore for disaster recovery?

A

In most traditional environments, data is backed up to tape and sent off-site regularly. If you use this method, it can take a long time to restore your system in the event of a disruption or disaster. Amazon S3 is an ideal destination for backup data that might be needed quickly to perform a restore. Transferring data to and from Amazon S3 is typically done through the network and is therefore accessible from any location. There are many commercial and open-source backup solutions that integrate with Amazon S3. You can use AWS Import/Export to transfer very large data sets by shipping storage devices directly to AWS. For longer-term data storage where retrieval times of several hours are adequate, there is Amazon Glacier, which has the same durability model as Amazon S3. Amazon Glacier is a low-cost alternative starting from $0.01/GB per month. Amazon Glacier and Amazon S3 can be used in conjunction to produce a tiered backup solution. Even though the backup-and-restore method is the cheapest, its RPO/RTO is typically measured in hours.

58
Q

What is multi-site for disaster recovery?

A

A multi-site solution runs on AWS as well as on your existing on-site infrastructure in an active-active configuration. The data replication method that you employ will be determined by the recovery objectives you choose: the Recovery Time Objective (the maximum allowable downtime before degraded operations are restored) and the Recovery Point Objective (the maximum allowable time window for which you will accept the loss of transactions during the DR process). This option is more costly compared to pilot light.

59
Q

Is this correct?

By default, AWS Lambda functions always operate from an AWS-owned VPC and hence have access to any public internet address or public AWS APIs. Once an AWS Lambda function is VPC-enabled, it will need a route through a Network Address Translation gateway (NAT gateway) in a public subnet to access public resources

A

Yes. AWS Lambda functions always operate from an AWS-owned VPC. By default, your function has the full ability to make network requests to any public internet address — this includes access to any of the public AWS APIs. For example, your function can interact with AWS DynamoDB APIs to PutItem or Query for records. You should only enable your functions for VPC access when you need to interact with a private resource located in a private subnet. An Amazon RDS instance is a good example.

Once your function is VPC-enabled, all network traffic from your function is subject to the routing rules of your VPC/Subnet. If your function needs to interact with a public resource, you will need a route through a NAT gateway in a public subnet.

60
Q

How to manage concurrency in AWS Lambda?

A

Since AWS Lambda functions can scale extremely quickly, you should have controls in place to notify you when you have a spike in concurrency. A good idea is to deploy an Amazon CloudWatch alarm that notifies your team when function metrics such as ConcurrentExecutions or Invocations exceed your threshold. You should also create an AWS Budget so you can monitor costs on a daily basis.
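A minimal boto3 sketch of such an alarm on the account-level ConcurrentExecutions metric; the threshold and SNS topic ARN are placeholders:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Notify an SNS topic when concurrent executions spike above the threshold.
    cloudwatch.put_metric_alarm(
        AlarmName="lambda-concurrency-spike",
        Namespace="AWS/Lambda",
        MetricName="ConcurrentExecutions",
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=1,
        Threshold=500,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:oncall-alerts"],
    )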

61
Q

How can you facilitate secure end-to-end access to Amazon RDS?

A

Configure Amazon RDS to use SSL for data in transit
You can use Secure Sockets Layer / Transport Layer Security (SSL/TLS) connections to encrypt data in transit. Amazon RDS creates an SSL certificate and installs the certificate on the DB instance when the instance is provisioned. For MySQL, you launch the mysql client using the --ssl-ca parameter to reference the public key and encrypt connections. Using SSL, you can encrypt a PostgreSQL connection between your applications and your PostgreSQL DB instances. You can also force all connections to your PostgreSQL DB instance to use SSL.
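As a hedged illustration, assuming the PyMySQL client and a downloaded RDS CA bundle (the endpoint, credentials, database name, and bundle path are placeholders), an encrypted connection might look like this:

    import pymysql

    # Point the client at the RDS CA bundle so the connection uses TLS.
    connection = pymysql.connect(
        host="mydb.abc123xyz.us-east-1.rds.amazonaws.com",
        user="admin",
        password="example-password",
        database="appdb",
        ssl={"ca": "/opt/certs/global-bundle.pem"},
    )
    with connection.cursor() as cursor:
        # A non-empty Ssl_cipher value indicates the session is encrypted.
        cursor.execute("SHOW STATUS LIKE 'Ssl_cipher'")
        print(cursor.fetchone())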

62
Q

What would you use if data retention, minimal downtime, and application performance are a priority?

A

Multi-AZ is the best option when data retention, minimal downtime, and application performance are a priority.

Data-loss potential - Low. Multi-AZ provides fault tolerance for every scenario, including hardware-related issues.

Performance impact - Low. Of the available options, Multi-AZ provides the fastest time to recovery, because there is no manual procedure to follow after the process is implemented.

Cost - Low to high. Use Multi-AZ when you can’t risk losing data because of hardware failure or you can’t afford the downtime required by other options in your response to an outage.

63
Q

What is AWS Global Accelerator?

A

AWS Global Accelerator is a networking service that helps you improve the availability and performance of the applications that you offer to your global users. AWS Global Accelerator is easy to set up, configure, and manage. It provides static IP addresses that provide a fixed entry point to your applications and eliminate the complexity of managing specific IP addresses for different AWS Regions and Availability Zones.

Associate the static IP addresses provided by AWS Global Accelerator to regional AWS resources or endpoints, such as Network Load Balancers, Application Load Balancers, Amazon EC2 Instances, and Elastic IP addresses. The IP addresses are anycast from AWS edge locations so they provide onboarding to the AWS global network close to your users.

64
Q

Can you use a single Network Load Balancer in across AWS regions?

A

Using a single Network Load Balancer across AWS regions is not possible, since a Network Load Balancer is bound to a single region.

65
Q

How does the default termination policy work in Amazon EC2 Auto Scaling group?

A

The default termination policy is designed to help ensure that your instances span Availability Zones evenly for high availability. The default policy is kept generic and flexible to cover a range of scenarios.

The default termination policy behavior is as follows:
1. Determine which Availability Zones (AZs) have the most instances and at least one instance that is not protected from scale-in.
2. Determine which instances to terminate to align the remaining instances to the allocation strategy for the On-Demand or Spot Instance that is terminating.
3. Determine whether any of the instances use the oldest launch template or configuration:
3a. Determine whether any of the instances use the oldest launch template, unless there are instances that use a launch configuration.
3b. Determine whether any of the instances use the oldest launch configuration.
4. After applying all of the above criteria, if there are multiple unprotected instances to terminate, determine which instances are closest to the next billing hour.

In practice, AZs are balanced first; then the instance with the oldest launch template or launch configuration within the applicable AZ is terminated.

66
Q

What is AWS Storage Gateway?

A

AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage. The service provides three different types of gateways – Tape Gateway, File Gateway, and Volume Gateway – that seamlessly connect on-premises applications to cloud storage, caching data locally for low-latency access. AWS Storage Gateway cannot be used for distributing files to end users.

67
Q

What is Amazon S3 Glacier?

A

Amazon S3 Glacier and S3 Glacier Deep Archive are secure, durable, and extremely low-cost Amazon S3 cloud storage classes for data archiving and long-term backup. They are designed to deliver 99.999999999% durability and provide comprehensive security and compliance capabilities that can help meet even the most stringent regulatory requirements. Amazon S3 Glacier is not suitable for frequently requested files, because retrieval times range from a few minutes to hours.

68
Q

how does Auto Scaling group’s target tracking policy work?

A

With target tracking scaling policies, you select a scaling metric and set a target value. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that trigger the scaling policy and calculates the scaling adjustment based on the metric and the target value. The scaling policy adds or removes capacity as required to keep the metric at, or close to, the specified target value.

For example, you can configure a target tracking scaling policy to keep the average aggregate CPU utilization of your Auto Scaling group at 50 percent (a sketch of this follows below).
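The sketch referenced above, using boto3 with a placeholder Auto Scaling group name:

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Keep the group's average CPU at roughly 50%; Auto Scaling creates and
    # manages the CloudWatch alarms for this policy itself.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",
        PolicyName="cpu-target-50",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )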

69
Q

Can your Auto Scaling group use an Amazon CloudWatch alarm as a trigger?

A

Yes. Simple scaling and step scaling policies are triggered by Amazon CloudWatch alarms that you create; with target tracking policies, Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms for you.

70
Q

What is throttling?

A

Throttling is the process of limiting the number of requests an authorized program can submit to a given operation in a given amount of time.

71
Q

Can Elastic Load Balancer be used to throttle requests (preventing your API from being overwhelmed by requests)?

A

No

72
Q

Can AWS Lambda be used to throttle requests (preventing your API from being overwhelmed by requests)?

A

No. Additional requests simply fail when requests come in faster than your Lambda function can scale.

73
Q

What can you use to ingest real-time data?

A

Amazon Kinesis Data Streams

74
Q

How does cluster placement group work?

A

Cluster placement groups pack instances close together inside a single Availability Zone, which gives workloads such as High Performance Computing (HPC) the low-latency network performance they need.

75
Q

What is partition placement group?

A

A partition placement group spreads your instances across logical partitions such that groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions. This strategy is typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka. A partition placement group can have a maximum of seven partitions per Availability Zone. Since a partition placement group can have partitions in multiple Availability Zones in the same Region, instances will not necessarily have low-latency network performance between them, so a partition placement group is not the right fit for HPC applications.

76
Q

What is spread placement group?

A

A spread placement group is a group of instances that are each placed on distinct racks, with each rack having its own network and power source. The instances are placed across distinct underlying hardware to reduce correlated failures. You can have a maximum of seven running instances per Availability Zone per group. Since a spread placement group can span multiple Availability Zones in the same Region, instances will not necessarily have low-latency network performance between them, so a spread placement group is not the right fit for HPC applications.

77
Q

What does Amazon GuardDuty check?

A

Amazon GuardDuty analyzes tens of billions of events across multiple AWS data sources, such as AWS CloudTrail events, Amazon VPC Flow Logs, and DNS logs.

78
Q

What is Amazon DynamoDB Accelerator (DAX)?

A

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.

79
Q

What is Amazon Redshift?

A

Amazon Redshift is a fully-managed petabyte-scale cloud-based data warehouse product designed for large scale data set storage and analysis.

80
Q

What is the shitty thing about spot instances?

A

Spot instances can be taken back by AWS with two minutes of notice

81
Q

Which Amazon S3 bucket feature can only be suspended, not disabled, once it has been enabled?

A

Versioning

82
Q

What is the minimum storage duration to transition objects from Amazon S3 Standard to Amazon S3 One Zone-IA or Amazon S3 Standard-IA?

A

30 days

83
Q

What is multipart upload?

A

Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation. Multipart upload provides improved throughput, therefore it facilitates faster file uploads.
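A hedged boto3 sketch using the managed transfer utility, which switches to multipart upload above a size threshold; the bucket, key, file path, and tuning values are placeholders:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")

    # The managed transfer automatically switches to multipart upload above
    # the threshold and uploads parts in parallel.
    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,  # use multipart for objects >= 100 MB
        multipart_chunksize=16 * 1024 * 1024,   # 16 MB parts
        max_concurrency=8,
    )
    s3.upload_file(
        Filename="/data/big-video.mp4",
        Bucket="my-example-bucket",
        Key="uploads/big-video.mp4",
        Config=config,
    )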

84
Q

How much time does AWS Direct Connect take to establish?

A

Several months… Provisioning a new AWS Direct Connect connection involves a physical cross-connect at a Direct Connect location, so lead times are often a month or more; it is not a good fit for urgent, one-off data transfers.
