S3 Flashcards

1
Q

What is Amazon S3?

A
  • Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
  • Files stored in S3 are referred to as objects.
  • S3 can be seen as a key-value store: the object key (its full name/path) is the key, and the object's content is the value.
  • Buckets are defined at the region level.
2
Q

How is Amazon S3 security handled?

A
  • User-based: IAM policies define which API calls are allowed for a specific IAM user.
  • Resource-based: Bucket Policies, Object Access Control Lists (ACLs), and Bucket Access Control Lists (ACLs).
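
A minimal sketch of a resource-based bucket policy applied with boto3 (the bucket name, account ID, and user ARN are hypothetical placeholders):

    import json
    import boto3

    s3 = boto3.client("s3")

    # Hypothetical resource-based policy: let one IAM user read objects from the bucket.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGetForOneUser",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/example-user"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
        }],
    }

    s3.put_bucket_policy(Bucket="my-example-bucket", Policy=json.dumps(policy))
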
3
Q

What are the key points of hosting a website on Amazon S3?

A
  • The website must be a static website and should be accessible on the Internet.
  • The bucket must allow public reads in order for external users to access its content.
4
Q

Can you version Amazon S3 files?

A

Yes. Versioning is enabled at the bucket level. The same key always returns the latest version of the object.

Suspending versioning does not delete the previous versions.
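
A minimal boto3 sketch of enabling versioning (bucket name and key are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    # Enable versioning at the bucket level; use "Suspended" to suspend it later.
    s3.put_bucket_versioning(
        Bucket="my-example-bucket",
        VersioningConfiguration={"Status": "Enabled"},
    )

    # The plain key returns the latest version; pass VersionId to read older ones.
    latest = s3.get_object(Bucket="my-example-bucket", Key="report.csv")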

5
Q

How to enable S3 replication?

A

You must enable versioning on both the source and destination buckets before enabling replication. There are two types of replication:

  1. Cross-Region Replication (CRR)
  2. Same-Region Replication (SRR)
  • The buckets can be in different AWS accounts.
  • Copying is asynchronous.
  • You must give S3 the proper IAM permissions.
  • After you enable replication, only new objects are replicated. To replicate existing objects, use S3 Batch Replication.
  • For delete operations, you can optionally replicate delete markers from source to target. Deletions with a version ID are not replicated.

Use cases: compliance, lower-latency access, replication across accounts.
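
A minimal boto3 sketch of a replication configuration (bucket names and the IAM role ARN are hypothetical, and versioning is assumed to already be enabled on both buckets):

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_replication(
        Bucket="source-bucket",
        ReplicationConfiguration={
            "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
            "Rules": [{
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter = replicate all new objects
                "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
                "DeleteMarkerReplication": {"Status": "Enabled"},  # optional setting
            }],
        },
    )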

6
Q

How many S3 storage classes are there?

A
  1. S3 Standard - General Purpose
  2. S3 Standard-Infrequent Access (IA)
  3. S3 One Zone-Infrequent Access
  4. S3 Glacier Instant Retrieval
  5. S3 Glacier Flexible Retrieval
  6. S3 Glacier Deep Archive
  7. S3 Intelligent-Tiering

Objects can move between classes manually or using S3 Lifecycle configurations.

7
Q

What is the S3 Standard storage class?

A

It is used for frequently accessed data. It has low latency and high throughput. It can sustain two concurrent facility failures.

8
Q

What is S3 Infrequent Access?

A

Used for data that is accessed less frequently but requires rapid access when needed. It is less expensive than S3 Standard.

There are two Infrequent Access options:

  1. Amazon S3 Standard-Infrequent Access: used for disaster recovery and backups.
  2. S3 One Zone-Infrequent Access: data is stored in a single AZ and is lost if that AZ is destroyed. Used for secondary backup copies of on-premises data or data you can recreate.
9
Q

What are the Amazon S3 Glacier storage classes?

A

Glacier is low-cost object storage meant for archiving and backups. You pay for storage and for object retrieval. There are three Glacier storage classes:

  1. Amazon S3 Glacier Instant Retrieval: millisecond retrieval; great for data accessed about once a quarter. The minimum storage duration is 90 days.
  2. Amazon S3 Glacier Flexible Retrieval: Expedited retrieval takes 1 to 5 minutes, Standard retrieval 3 to 5 hours, and Bulk retrieval (which is free) 5 to 12 hours. The minimum storage duration is 90 days.
  3. Amazon S3 Glacier Deep Archive: for long-term storage. Standard retrieval takes 12 hours and Bulk retrieval 48 hours. The minimum storage duration is 180 days.
10
Q

What is S3 Intelligent-Tiering?

A

Amazon S3 Intelligent-Tiering is the only cloud storage class that delivers automatic storage cost savings when data access patterns change without performance impact or operational overhead.

11
Q

In which order can objects move from one storage class to another?

A

You can transition objects from more frequently accessed storage classes to colder ones (for example, Standard → Standard-IA → Intelligent-Tiering → One Zone-IA → Glacier → Glacier Deep Archive). The movement of objects can be automated using lifecycle rules.

12
Q

What are Amazon S3 lifecycle rules?

A

An S3 Lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects. There are two types of actions:

Transition actions – These actions define when objects transition to another storage class. For example, you might choose to transition objects to the S3 Standard-IA storage class 30 days after creating them, or archive objects to the S3 Glacier Flexible Retrieval storage class one year after creating them. For more information, see Using Amazon S3 storage classes.

There are costs associated with lifecycle transition requests. For pricing information, see Amazon S3 pricing.

Expiration actions – These actions define when objects expire. Amazon S3 deletes expired objects on your behalf.

Lifecycle expiration costs depend on when you choose to expire objects. For more information, see Expiring objects.
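
A minimal boto3 sketch of a lifecycle configuration combining a transition action and an expiration action (bucket name, prefix, and day counts are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="my-example-bucket",
        LifecycleConfiguration={
            "Rules": [{
                "ID": "tier-then-expire-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 2555},  # delete after roughly 7 years
            }],
        },
    )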

13
Q

Is there a zero-day life cycle policy?

A

If you set a lifecycle transition of 0 days, objects are sent to S3 Glacier immediately after upload. This is useful when data is rarely accessed day to day but must be kept for a limited time.

Although it might seem more expensive to upload data to S3 first and transition it to Glacier afterward, AWS prices this scenario so that it costs no more than uploading directly to Glacier.

14
Q

What is requester pays?

A

In general, bucket owners pay for all Amazon S3 storage and data transfer costs that are associated with their bucket. However, you can configure a bucket to be a Requester Pays bucket. With Requester Pays buckets, the requester instead of the bucket owner pays the cost of the request and the data download from the bucket. The bucket owner always pays the cost of storing data.

The requester cannot be anonymous and must be authenticated with AWS.

15
Q

What is S3 event Notification?

A

You can use the Amazon S3 Event Notifications feature to receive notifications when certain events happen in your S3 bucket. To enable notifications, add a notification configuration that identifies the events that you want Amazon S3 to publish.

The events can be objects created, removed, replicated, and so on. Events can be filtered based on object names.

Events can be sent to SNS, SQS, Lambda functions, or Amazon EventBridge.

EventBridge can then route the event to over 18 AWS services.
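
A minimal boto3 sketch of a notification configuration (bucket name and queue ARN are hypothetical) that sends "object created" events for .jpg keys to an SQS queue:

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_notification_configuration(
        Bucket="my-example-bucket",
        NotificationConfiguration={
            "QueueConfigurations": [{
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:my-queue",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}},
            }],
        },
    )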

16
Q

What is S3 multipart upload?

A

Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. It is recommended for files larger than 100 MB and required for files larger than 5 GB.
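
A minimal boto3 sketch; the SDK's transfer manager switches to multipart upload automatically above a configurable threshold (the file, bucket, and threshold here are hypothetical):

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")

    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,  # use multipart above 100 MB
        multipart_chunksize=100 * 1024 * 1024,  # 100 MB parts
        max_concurrency=8,                      # upload parts in parallel
    )

    s3.upload_file("backup.tar", "my-example-bucket", "backups/backup.tar", Config=config)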

17
Q

What is S3 Transfer Acceleration?

A

Amazon S3 Transfer Acceleration can speed up content transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects. Customers who have either web or mobile applications with widespread users or applications hosted far away from their S3 bucket can experience long and variable upload and download speeds over the Internet. S3 Transfer Acceleration (S3TA) reduces the variability in Internet routing, congestion and speeds that can affect transfers, and logically shortens the distance to S3 for remote applications. S3TA improves transfer performance by routing traffic through Amazon CloudFront’s globally distributed Edge Locations and over AWS backbone networks, and by using network protocol optimizations.
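
A minimal boto3 sketch of enabling Transfer Acceleration and then uploading through the accelerate endpoint (bucket and file names are hypothetical):

    import boto3
    from botocore.config import Config

    s3 = boto3.client("s3")

    # Enable Transfer Acceleration on the bucket.
    s3.put_bucket_accelerate_configuration(
        Bucket="my-example-bucket",
        AccelerateConfiguration={"Status": "Enabled"},
    )

    # A client that routes transfers through the accelerate endpoint.
    s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
    s3_accel.upload_file("video.mp4", "my-example-bucket", "uploads/video.mp4")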

18
Q

What are S3 byte-range fetches?

A

Using the Range HTTP header in a GET Object request, you can fetch a byte-range from an object, transferring only the specified portion. You can use concurrent connections to Amazon S3 to fetch different byte ranges from within the same object. This helps you achieve higher aggregate throughput versus a single whole-object request. Fetching smaller ranges of a large object also allows your application to improve retry times when requests are interrupted. For more information, see Getting Objects.

Typical sizes for byte-range requests are 8 MB or 16 MB.

Byte-range fetches can also be used to retrieve just the head of a file (for example, its header) to gather high-level information about it.
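
A minimal boto3 sketch fetching only the first 16 MB of a (hypothetical) large object with the Range header:

    import boto3

    s3 = boto3.client("s3")

    resp = s3.get_object(
        Bucket="my-example-bucket",
        Key="big-dataset.bin",
        Range="bytes=0-16777215",  # first 16 MB of the object
    )
    head = resp["Body"].read()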

19
Q

What is S3 Select?

A

S3 Select is a feature of S3 that lets you specify targeted portions of an S3 object to retrieve and return to you rather than returning the entire contents of the object. You can use basic SQL expressions to select certain columns and filter for particular records in your structured file.

AWS states it can make object retrieval up to 400% faster and up to 80% cheaper.
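
A minimal boto3 sketch of S3 Select against a hypothetical CSV object, returning only two columns for matching rows:

    import boto3

    s3 = boto3.client("s3")

    resp = s3.select_object_content(
        Bucket="my-example-bucket",
        Key="sales.csv",
        ExpressionType="SQL",
        Expression="SELECT s.city, s.amount FROM S3Object s WHERE s.city = 'Paris'",
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )

    # The response is an event stream; matching rows arrive in "Records" events.
    for event in resp["Payload"]:
        if "Records" in event:
            print(event["Records"]["Payload"].decode())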

20
Q

What are S3 Batch Operations?

A

S3 Batch Operations is a managed solution for performing storage actions like copying and tagging objects at scale, whether for one-time tasks or for recurring, batch workloads. S3 Batch Operations can perform actions across billions of objects and petabytes of data with a single request. The use cases are:

  1. Copy objects between S3 buckets as a batch operation.
  2. Encrypt all the unencrypted objects in your S3 buckets (a scenario that can come up in the exam).
  3. Modify ACLs, or tags.
  4. Restore many objects at a time from S3 Glacier.
  5. Invoke a Lambda function

In short, you can perform whatever custom action you want on every object in the list handled by your S3 Batch Operations job.

21
Q

What is S3 SSE-S3 Encryption?

A

When you use server-side encryption with Amazon S3 managed keys (SSE-S3), each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a root key that it regularly rotates. Amazon S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256) GCM, to encrypt your data.

22
Q

What is Amazon S3 SSE-KMS encryption?

A

The encryption uses a key managed by AWS KMS. With KMS you get user control over the key and can audit key usage with CloudTrail. The object is encrypted on the server side.

If you use SSE-KMS, you may be impacted by KMS quotas. Each object upload calls the GenerateDataKey KMS API and each download calls the Decrypt KMS API, and every call counts toward the per-second KMS quota (5,500, 10,000, or 30,000 requests/s depending on the region).

23
Q

What is Amazon S3 client-side encryption?

A

Client-side encryption is the act of encrypting your data locally to ensure its security as it passes to the Amazon S3 service. The Amazon S3 service receives your encrypted data; it does not play a role in encrypting or decrypting it.

24
Q

How to enforce S3 Encryption?

A

One way to enforce encryption is to use a bucket policy. The bucket policy refuses any PUT API call that does not include the specified encryption header.

Another way is to use the default encryption option on the S3 bucket.
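
A minimal sketch of the bucket-policy approach (the bucket name is hypothetical; a production policy may also need a statement handling requests that omit the header entirely):

    import json
    import boto3

    s3 = boto3.client("s3")

    # Deny any PutObject call that does not request SSE-S3 (AES256) encryption.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyUnencryptedPuts",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "AES256"}
            },
        }],
    }

    s3.put_bucket_policy(Bucket="my-example-bucket", Policy=json.dumps(policy))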

25
Q

What is cross origin resource sharing (CORS)?

A

Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain. With CORS support, you can build rich client-side web applications with Amazon S3 and selectively allow cross-origin access to your Amazon S3 resources.

26
Q

How is CORS configured in S3?

A
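
You attach a CORS configuration to the bucket: a set of rules listing the allowed origins, HTTP methods, and headers. S3 evaluates cross-origin requests against these rules and returns the matching CORS response headers.

A minimal boto3 sketch (bucket name and origin are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_cors(
        Bucket="my-example-bucket",
        CORSConfiguration={
            "CORSRules": [{
                "AllowedOrigins": ["https://www.example.com"],
                "AllowedMethods": ["GET", "PUT"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }],
        },
    )
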
27
Q

What is S3 MFA Delete?

A

MFA delete can help prevent accidental bucket deletions by requiring the user who initiates the delete action to prove physical possession of an MFA device with an MFA code and adding an extra layer of friction and security to the delete action.

MFA will be required to:

  1. permanently delete an object version.
  2. suspend versioning on the bucket

To use MFA Delete, versioning must be enabled on the bucket.

Only bucket owners can enable or disable MFA delete.

28
Q

What are S3 access logs?

A

Server access logging provides detailed records for the requests that are made to an Amazon S3 bucket. Server access logs are useful for many applications. For example, access log information can be useful in security and access audits.

Never save the access logs in the bucket that is being monitored; doing so creates a logging loop and the bucket will grow endlessly.

29
Q

What is an Amazon S3 pre-signed URL?

A

Pre-signed URLs are used to provide short-term access to a private object in your S3 bucket. They work by appending an AWS access key, an expiration time, and a SigV4 signature as query parameters to the S3 object URL. There are two common use cases when you may want to use them:

  1. Simple, occasional sharing of private files.
  2. Frequent, programmatic access to view or upload a file in an application

Examples:

  1. Only logged-in users can download a premium video from your S3 bucket.
  2. Allow an ever-changing list of users to download files by generating URLs dynamically.
  3. Allow a user to temporarily upload a file to a precise location in your S3 bucket.
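
A minimal boto3 sketch generating a one-hour download link for a private object (bucket and key are hypothetical):

    import boto3

    s3 = boto3.client("s3")

    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-example-bucket", "Key": "premium/video.mp4"},
        ExpiresIn=3600,  # link expires after one hour
    )
    print(url)
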
30
Q

What is S3 Glacier Vault Lock?

A

S3 Glacier Vault Lock helps you to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a Vault Lock policy. You can specify controls such as “write once read many” (WORM) in a Vault Lock policy and lock the policy from future edits.

32
Q

What is S3 object lock?

A

With S3 Object Lock, you can store objects using a write-once-read-many (WORM) model. Object Lock can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.

There are two retention modes:

  1. Compliance:
    1. Object versions can’t be overwritten or deleted by any user, including the root user
    2. Objects retention modes can’t be changed, and retention periods can’t be shortened
  2. Governance:
    1. Most users can’t overwrite or delete an object version or alter its lock settings
    2. Some users have special permissions to change the retention or delete the object

Retention settings:

  1. Retention Period: protect the object for a fixed period. It can be extended
  2. Legal Hold:
    • protect the object indefinitely, independent from the retention period
    • can be freely placed and removed using the s3:PutObjectLegalHold IAM permission
33
Q

What are S3 Access Points?

A

Customers increasingly use Amazon S3 to store shared data sets, where data is aggregated and accessed by different applications, teams and individuals, whether for analytics, machine learning, real-time monitoring, or other data lake use cases. Managing access to this shared bucket requires a single bucket policy that controls access for dozens to hundreds of applications with different permission levels. As an application set grows, the bucket policy becomes more complex, time consuming to manage, and needs to be audited to make sure that changes don’t have an unexpected impact on another application.

Amazon S3 Access Points, a feature of S3, simplify data access for any AWS service or customer application that stores data in S3. With S3 Access Points, customers can create unique access control policies for each access point to easily control access to shared datasets.

34
Q

What is AWS Snowcone?

A

AWS Snowcone

  • AWS Snowcone is a portable, rugged, and secure device that provides edge computing and data transfer.
  • Snowcone can be used to collect, process, and move data to AWS, either offline by shipping the device, or online with AWS DataSync.
  • AWS Snowcone stores data securely in edge locations, and can run edge computing workloads that use AWS IoT Greengrass or EC2 instances.
  • Snowcone devices are small and weigh 4.5 lbs. (2.1 kg), so you can carry one in a backpack or fit it in tight spaces for IoT, vehicular, or even drone use cases.
35
Q

What is Snowball Edge?

A

AWS Snowball Edge

  • AWS Snowball Edge is a data migration and edge computing device that comes in two device options:
    • Compute Optimized
      • Snowball Edge Compute Optimized devices provide 52 vCPUs, 42 terabytes of usable block or object storage, and an optional GPU for use cases such as advanced machine learning and full-motion video analysis in disconnected environments.
    • Storage Optimized.
      • Snowball Edge Storage Optimized devices provide 40 vCPUs of compute capacity coupled with 80 terabytes of usable block or S3-compatible object storage.
      • It is well-suited for local storage and large-scale data transfer.
  • Customers can use these two options for data collection, machine learning and processing, and storage in environments with intermittent connectivity (such as manufacturing, industrial, and transportation) or in extremely remote locations (such as military or maritime operations) before shipping it back to AWS.
  • Snowball devices may also be rack-mounted and clustered together to build larger, temporary installations.
36
Q

What is AWS Snowmobile?

A
  • AWS Snowmobile moves up to 100 PB of data in a 45-foot long ruggedized shipping container and is ideal for multi-petabyte or Exabyte-scale digital media migrations and data center shutdowns.
  • A Snowmobile arrives at the customer site and appears as a network-attached data store for more secure, high-speed data transfer.
  • After data is transferred to Snowmobile, it is driven back to an AWS Region where the data is loaded into S3.
  • Snowmobile is tamper-resistant, waterproof, and temperature controlled with multiple layers of logical and physical security – including encryption, fire suppression, dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and an escort security vehicle during transit.
37
Q

What is AWS OpsHub?

A

AWS OpsHub for Snow Family helps you to manage devices and local AWS services. You use AWS OpsHub on a client computer to perform tasks such as unlocking and configuring single or clustered devices, transferring files, and launching and managing instances running on Snow Family devices.

38
Q

Can Snowball data be imported into Glacier?

A

No. Snowball data cannot be imported into Glacier directly. You must import it into S3 first and then use a lifecycle policy to move the data to Glacier.

39
Q

What is Amazon FSx?

A

Amazon FSx makes it easy and cost effective to launch, run, and scale feature-rich, high-performance file systems in the cloud. It supports a wide range of workloads with its reliability, security, scalability, and broad set of capabilities. Amazon FSx is built on the latest AWS compute, networking, and disk technologies to provide high performance and lower TCO. And as a fully managed service, it handles hardware provisioning, patching, and backups – freeing you up to focus on your applications, your end users, and your business.

40
Q

What is Amazon FSx for Windows File Server?

A
  1. FSx for Windows is a fully managed Windows file system share drive
  2. Supports SMB protocol & Windows NTFS
  3. Microsoft Active Directory integration, ACLs, user quotas
  4. Can be mounted on Linux EC2 instances
  5. Supports Microsoft’s Distributed File System (DFS) Namespaces (group files across multiple FS)
  6. Scale up to 10s of GB/s, millions of IOPS, 100s PB of data
  7. Storage Options:
    * SSD – latency-sensitive workloads (databases, media processing, data analytics, …)
    * HDD – broad spectrum of workloads (home directories, CMS, …)
  8. Can be accessed from your on-premises infrastructure (VPN or Direct Connect)
  9. Can be configured to be Multi-AZ (high availability)
  10. Data is backed up daily to S3
41
Q

What is Amazon FSx for Lustre?

A

Lustre is a type of parallel distributed file system, for large-scale computing.
  1. The name Lustre is derived from "Linux" and "cluster".
  2. Use cases: Machine Learning, High Performance Computing (HPC) – exam tip
  3. Video processing, financial modeling, Electronic Design Automation
  4. Scales up to 100s of GB/s, millions of IOPS, sub-ms latencies
  5. Storage Options:
    * SSD – low-latency, IOPS-intensive workloads, small & random file operations
    * HDD – throughput-intensive workloads, large & sequential file operations
  6. Seamless integration with S3
  7. Can "read S3" as a file system (through FSx)
  8. Can write the output of computations back to S3 (through FSx)
  9. Can be used from on-premises servers (VPN or Direct Connect)

42
Q

What is AWS Storage Gateway?

A

AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage. Storage Gateway provides a standard set of storage protocols such as iSCSI, SMB, and NFS, which allow you to use AWS storage without rewriting your existing applications. Typical use cases:
  1. Disaster recovery
  2. Backup & restore
  3. Tiered storage
  4. On-premises cache & low-latency file access