Module 4: Adding a storage layer with S3 Flashcards

1
Q

As a cloud engineer working with S3:

A

Consider access patterns and use cases to choose the correct configuration options, while:

=> Optimising cost
=> Supporting performance
=> Meeting compliance requirements

And, as always, apply security best practices to protect the resources.

2
Q

Types of storage

A

Block storage
Hierarchical storage (file storage)
Object storage

3
Q

Block storage

A

Data is stored in fixed-size blocks. The application splits the data into blocks, and they are stored wherever is most efficient. Blocks can be stored across servers and used from different operating systems.

4
Q

File storage

A

File storage creates a shared file system. The data is stored in a hierarchical structure, similar to OneDrive, for example.

5
Q

Object storage

A

Object storage stores files as objects, based on attributes and metadata.
An object is data, metadata, and a key.
The object key is the unique identifier of the object. When you update an object, the entire object is updated.

6
Q

Difference between object storage and block storage

A

In object storage, the entire object must be updated when any of its data changes, while in block storage only the changed part of the data needs to be updated.

7
Q

Amazon Simple Storage Service (S3)

A

Object storage. Stores massive amounts of unstructured data

8
Q

Type of storage in S3, and what are objects stored in?

A

Object storage, stored in buckets

9
Q

Max size of a single object?

A

5 TB

10
Q

Identifier in S3

A

Unique URL for each object (universal namespace)
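
One common form of such a URL is the virtual-hosted style, which combines the bucket name, the Region, and the object key. The sketch below builds one. The bucket name, Region, and key are made-up examples, and `object_url` is a hypothetical helper, not part of any AWS SDK.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an object (illustrative helper)."""
    # Percent-encode the key but keep "/" so the prefix structure stays readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Hypothetical bucket, Region, and key, used only for illustration.
example_url = object_url("example-bucket", "eu-west-1", "photos/2022/my cat.jpg")
print(example_url)
```

Whether the URL is actually reachable still depends on the permissions set on the bucket and the object.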

11
Q

Components of an object

A

Key, version ID, value, metadata, and subresources

12
Q

What does immutable mean?

A

It's a characteristic of an object: you can't change part of it in place. To modify it, you have to change the whole object outside of S3 and upload it again.

13
Q

What are buckets for?

A

Containers for objects. They organize the Amazon S3 namespace and identify the account responsible for the objects stored in them.

14
Q

Bucket Geography

A

Regional. Objects stored in a bucket never leave the Region unless they are explicitly transferred to another Region.

15
Q

What is a prefix in a bucket?

A

Similar to a path name. Querying for a prefix returns the objects whose keys begin with it, /photos/2022 for example.
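
A prefix query can be pictured as a simple "starts with" filter over object keys. This is a minimal local sketch, not a call to S3; the key names and the `list_keys` helper are made up for illustration.

```python
def list_keys(keys, prefix):
    """Return the keys that begin with the given prefix, in sorted order."""
    return sorted(key for key in keys if key.startswith(prefix))


# Hypothetical object keys in a single bucket.
bucket_keys = [
    "photos/2022/jan/a.jpg",
    "photos/2022/feb/b.jpg",
    "photos/2021/c.jpg",
    "docs/readme.txt",
]

matches = list_keys(bucket_keys, "photos/2022/")
print(matches)
```

The bucket itself is flat: the "folders" are only a convention carried by the key names.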

16
Q

S3 Benefits

A

Durability
Availability
High performance

17
Q

How durable is S3?

A

S3 Standard storage is designed for 11 nines of durability (99.999999999%), meaning that in any given year there is a 0.000000001 percent chance of losing a given object.
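
To get a feel for what 11 nines means, the arithmetic can be worked through with exact fractions. This is only a back-of-the-envelope sketch: the object count is an arbitrary example, and the figure is a design target, not a guarantee.

```python
from fractions import Fraction

# 11 nines of durability: 99.999999999% expressed as an exact fraction.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
ANNUAL_LOSS_RATE = 1 - DURABILITY  # chance of losing one given object in a year


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects expected to be lost per year."""
    return object_count * ANNUAL_LOSS_RATE


# Arbitrary example: ten million stored objects.
lost = expected_objects_lost_per_year(10_000_000)
years_per_lost_object = 1 / lost
print(float(lost), int(years_per_lost_object))
```

With ten million objects, the expected loss works out to one object every 10,000 years on average.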

18
Q

Why is S3 so durable?

A

S3 redundantly stores objects on multiple devices across multiple facilities in the designated Region. It detects and repairs failures by comparing copies stored in different places, and it verifies the integrity of the data by using checksums.

19
Q

How available is S3?

A

S3 Standard is designed for 4 nines of availability (99.99%). Availability means the ability to access the data quickly when you want it: out of 10,000 requests, one would not succeed. It is also scalable (unlimited storage) and gives the ability to encrypt data.
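
The same 99.99% figure can be read either as failed requests or as time. The sketch below does both calculations with exact decimals; the function names are made up, and treating availability as downtime per year is an illustrative interpretation, not a statement about any service-level agreement.

```python
from decimal import Decimal


def failure_fraction(availability_percent: str) -> Decimal:
    """Fraction of requests (or time) that is allowed to fail."""
    return (Decimal(100) - Decimal(availability_percent)) / Decimal(100)


def expected_failed_requests(total_requests: int, availability_percent: str = "99.99") -> Decimal:
    """Requests expected to fail out of total_requests."""
    return total_requests * failure_fraction(availability_percent)


def max_downtime_minutes_per_year(availability_percent: str = "99.99", days: int = 365) -> Decimal:
    """Unavailability budget per year, in minutes, if read as downtime."""
    return failure_fraction(availability_percent) * days * 24 * 60


failed = expected_failed_requests(10_000)
downtime = max_downtime_minutes_per_year()
print(failed, downtime)
```

Four nines therefore corresponds to one failed request in 10,000, or a little under an hour of unavailability per year.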

20
Q

Why is S3 high performing?

A

Thousands of transactions per second. Scales to high request rates.

21
Q

S3 use cases:

A

1) Host web content: use high availability and high performance to address fluctuating and potentially high traffic to the data
2) Static site: simple storage of HTML files, videos, images…
3) Financial (or other) analysis: store data that other services can use for analysis
4) Disaster recovery

22
Q

S3 example of media hosting

A

S3 serves video content through CloudFront, which caches it to make the data available more quickly to a user streaming it, while another user downloads it directly from S3.

23
Q

Do you need to provision storage for S3?

A

No, it scales as needed.

24
Q

Static vs dynamic website

A

On a static website, the content is static and might include client-side scripts. A dynamic website relies on server-side scripts such as PHP, JSP, or ASP. S3 does not support server-side scripting; other AWS services do.

25
Static website with S3
You can host everything for a static website on S3, with no need for a server or a virtual machine.
26
S3 for analysis
Load the raw data into a bucket. Use an ETL tool to transform it (provision an EC2 server for that, and use a Spot Fleet or an EMR cluster). Return the transformed data to a new bucket. Terminate the instance used for ETL. Perform your analysis on the objects stored in the second bucket (Athena and QuickSight are given as examples for analysis).
27
S3 for disaster recovery
Store everything in one S3 bucket. Replicate it to another bucket in another Region. Additionally, you can move long-term data to S3 Glacier.
28
What is cross-Region replication?
Duplicating data to another bucket, in another Region
29
What permission do I need to store something in S3?
Write permissions
30
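Objects are encrypted by default in S3 (true/false)
True. Since January 2023, S3 automatically applies server-side encryption with S3 managed keys (SSE-S3) to all new objects. They are encrypted at upload and decrypted at download, and nothing has to be enabled for this.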
31
4 ways to upload on S3
AWS Management Console; AWS Command Line Interface (AWS CLI); AWS Software Development Kits (SDKs); Amazon S3 REST API
32
Uploading an object through the console
Use a wizard-based (UX) approach to move data in or out of S3, including a drag-and-drop option. The limit for the management console is 160 GB.
33
File size limit for uploads through the management console?
160 GB
34
For sizes larger than 160 GB?
Use the CLI, an SDK, or the REST API
35
CLI and S3?
Use the command line interface to run an upload or download at a prompt or through a script
36
SDK and S3
Programmatically code the access to S3 in your applications
37
API and S3
Use PUT requests to upload and GET requests to download. API access can be embedded into application code.
38
You want to upload a big file to S3 or a file for which you know there is a chance of failure in the upload. What can you use?
Multipart upload
39
What is multipart upload?
The object is split into multiple parts, which are uploaded, reassembled, and then stored in a bucket
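
The split, upload, and reassemble steps can be simulated locally. The sketch below does not talk to S3: `upload_part` is a stand-in for a network call, and the part size, worker count, and function names are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_into_parts(data: bytes, part_size: int):
    """Cut the payload into numbered parts (part numbers start at 1)."""
    return [
        (offset // part_size + 1, data[offset:offset + part_size])
        for offset in range(0, len(data), part_size)
    ]


def upload_part(part):
    """Stand-in for sending one part over the network."""
    number, chunk = part
    return number, chunk


def multipart_upload(data: bytes, part_size: int) -> bytes:
    """Upload parts in parallel, then reassemble them by part number."""
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(upload_part, part) for part in parts]
        # Parts may complete in any order, so collect them first.
        received = [future.result() for future in as_completed(futures)]
    received.sort(key=lambda item: item[0])
    return b"".join(chunk for _, chunk in received)


payload = bytes(range(256)) * 10
stored = multipart_upload(payload, part_size=100)
print(len(split_into_parts(payload, 100)), stored == payload)
```

Because each part is independent, a failed part could be retried on its own without resending the whole object.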
40
Advantages of multipart upload?
Improved throughput: parts are uploaded in parallel, which means quicker storage
Quick recovery from network issues
Pause and resume uploads
Begin an upload while the object is still being built
41
S3 transfer acceleration
Bucket-level feature that optimizes transfer speeds. Uses CloudFront edge locations to optimize the network path.
42
Why use Transfer Acceleration?
Your customers upload to a centralized bucket. You regularly transfer gigabytes or terabytes across continents. You can't use all your available bandwidth when uploading.
43
About acceleration: The further from the S3 bucket
The better the acceleration
44
AWS Transfer Family
Fully managed transfer service
45
For what services is AWS Transfer Family available?
Amazon S3 storage; Amazon Elastic File System (Amazon EFS) Network File System (NFS) file systems
46
Protocols supported by the Transfer Family
Secure Shell (SSH) File Transfer Protocol (SFTP); File Transfer Protocol Secure (FTPS); File Transfer Protocol (FTP); Applicability Statement 2 (AS2)
47
Transfer Family Benefits
Managed service that scales. You don't need to modify your applications or run file transfer protocol infrastructure. Everything is managed and integrated into the AWS family. You only pay for what you use.
48
Use cases of Transfer Family for S3
Data lakes for uploads from third parties; subscription-based data distribution to customers; internal transfers within an organization
49
Use cases of Transfer Family for Elastic File System (EFS)
Data distribution; supply chain; content management; web-serving applications
50
Types of S3 storage
General purpose; Intelligent-Tiering; Infrequent Access; Archive
51
S3 General purpose
Suitable for frequent access thanks to high availability and low latency. Durability across at least 3 AZs.
52
S3 Intelligent-Tiering
Automatically adjusts the storage tier of objects depending on access frequency, moving them to the most cost-effective tier.
53
S3 Infrequent Access
Standard-Infrequent Access (Standard-IA): similar to Standard but with a different cost model. There is a minimum 30-day storage charge, and retrieving the data costs more.
One Zone-Infrequent Access (One Zone-IA): low-cost option for data that does not need high availability and resiliency. Good choice for secondary backups that you can recreate, or for backups from another Region.
54
S3 Archive
S3 Glacier Instant Retrieval: rarely accessed data that still needs to be retrieved rapidly.
S3 Glacier Flexible Retrieval: large datasets that need to be accessed 1-2 times a year, with some latency in accessing the data.
S3 Glacier Deep Archive: long-term retention for rarely accessed data. Good for customers who need to keep older data for compliance.
S3 on Outposts: S3 infrastructure for data that needs to be stored close to the customer. The AWS hardware is installed on premises, so it is not quite cloud storage.
55
Minimum storage duration charge for Infrequent Access?
30 days
56
Minimum storage duration charge for Glacier?
90 days, or 180 days for Deep Archive
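
A minimum storage duration means that an object deleted or moved early is still billed as if it had stayed for the minimum period. The sketch below captures that rule; the dictionary keys follow the storage class names used by the S3 API as far as I know, the table and the `billed_days` helper are illustrative, and current values should be checked against the pricing page.

```python
# Minimum billable storage duration per storage class, in days (per the cards above).
MINIMUM_STORAGE_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
    "GLACIER_IR": 90,
    "GLACIER": 90,
    "DEEP_ARCHIVE": 180,
}


def billed_days(storage_class: str, stored_days: int) -> int:
    """Days of storage billed for an object kept for stored_days."""
    return max(stored_days, MINIMUM_STORAGE_DAYS[storage_class])


print(billed_days("STANDARD_IA", 10))    # deleted early: billed for the minimum
print(billed_days("DEEP_ARCHIVE", 200))  # kept longer: billed for actual duration
```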
57
Number of AZs with S3
At least 3, except for S3 One Zone-IA, where it's one.
58
S3 and retrieval charges
Retrieval charges per GB retrieved apply except for standard and intelligent tiering
59
What is an S3 lifecycle configuration?
It's a policy that determines the transition of objects from one storage class to another, or their expiration, based on their age. E.g.: 30 days after creation => Infrequent Access; x months after creation => deletion. Lifecycle transitions and expirations can have associated costs.
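
A lifecycle configuration is a set of rules. The dictionary below sketches one rule in the shape that, to my understanding, the boto3 `put_bucket_lifecycle_configuration` call accepts; the rule ID, the `logs/` prefix, and the day counts are made-up examples, and nothing here is sent to AWS.

```python
import json

# One illustrative rule: move objects under "logs/" to Standard-IA after 30 days,
# then delete them after 365 days. All values are hypothetical.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            "Expiration": {"Days": 365},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

The rule combines both kinds of lifecycle action: a transition to a cheaper class, then an expiration.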
60
Types of lifecycle operations?
Transition or expiration, at the object or bucket level
61
Advantage of lifecycle on S3
Lifecycle rules reduce cost, because you pay less for data as it loses relevance for you.
62
Lifecycle use case
1. Automatically delete logs after 30 days 2. Documents are stored in Standard for 60 days, in Infrequent Access for 1 year, in Glacier for 7 years, then deleted
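
The second use case can be traced as a timeline. The sketch below assumes the durations are consecutive (60 days, then 365 days, then 7 x 365 days), which is one reading of the card; the schedule and the `storage_class_for_age` helper are illustrative only.

```python
# (duration in days, storage class), applied one after the other.
SCHEDULE = [
    (60, "STANDARD"),
    (365, "STANDARD_IA"),
    (7 * 365, "GLACIER"),
]


def storage_class_for_age(age_days: int):
    """Storage class of a document of the given age, or None once it is deleted."""
    boundary = 0
    for duration, storage_class in SCHEDULE:
        boundary += duration
        if age_days < boundary:
            return storage_class
    return None  # past the last stage: the object has expired


for age in (0, 60, 425, 2980):
    print(age, storage_class_for_age(age))
```

Under this reading, a document leaves Standard on day 60, moves to Glacier on day 425, and is deleted on day 2980.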
63
S3 Versioning use case
Protects against accidental overwrites and deletes. Enables recovery.
64
At what level is versioning enabled?
At the bucket level
65
How does versioning work?
Each version of an object has a unique version ID. Uploading the object again creates a new version with a new ID, and the previous version is not overwritten. When you delete the object, Amazon S3 simply adds a delete marker, and the previous versions remain.
66
Is versioning enabled by default?
No
67
What mechanism allows for object retrieval in Versioning?
The version ID
68
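Can I recover the object if versioning is suspended?
Not if it was written or overwritten while versioning was suspended. Versions created before the suspension are kept.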
69
Can I recover a deleted object with versioning?
Yes
70
What is the cost of versioning?
None, except for the storage cost of the additional versions
71
What issue may you face when trying to get an object if its most recent version is a delete marker?
The request will not succeed and returns a 404 Not Found error. If you use a GET request that specifies the version ID, you can still access the object.
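
The delete-marker behaviour is easier to follow with a small in-memory model. This is a toy simulation, not the S3 API: the `VersionedBucket` class, its methods, and the use of `LookupError` to stand in for a 404 response are all assumptions made for illustration.

```python
import uuid

DELETE_MARKER = object()  # sentinel standing in for a delete marker


class VersionedBucket:
    """Toy model of a bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first

    def put(self, key, body):
        version_id = uuid.uuid4().hex  # opaque, unique ID per version
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A simple delete removes nothing: it stacks a delete marker on top.
        return self.put(key, DELETE_MARKER)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is DELETE_MARKER:
                raise LookupError("404 Not Found")
            return versions[-1][1]
        for candidate_id, body in versions:
            if candidate_id == version_id and body is not DELETE_MARKER:
                return body
        raise LookupError("no such object version")


bucket = VersionedBucket()
v1 = bucket.put("report.txt", b"first draft")
v2 = bucket.put("report.txt", b"second draft")
bucket.delete("report.txt")

try:
    bucket.get("report.txt")
    latest_missing = False
except LookupError:
    latest_missing = True

print(latest_missing, bucket.get("report.txt", v1))
```

A plain GET fails after the delete, while a GET with the version ID still returns the earlier content.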
72
How do you permanently delete an object when versioning is active?
You must be the owner of the bucket and specify the version of the object you want to delete.
73
What is the meaning of CORS?
Cross-origin resource sharing
74
What is cross-origin resource sharing?
It is configured through an XML document that specifies: the origins allowed to access your bucket; the operations (HTTP methods) supported for each origin; additional operation-specific information
75
What is cross-origin resource sharing used for?
It's a way for a client web application loaded in one domain to access resources in another domain, such as the storage of another application
76
Example of CORS
You host a web font for one website. You want a second website to use the same resource, so you create a CORS configuration that allows the second website to access the resource of the first website.
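
A CORS configuration boils down to a list of rules, each pairing allowed origins with allowed HTTP methods. The sketch below writes one rule using the field names of the JSON form of the configuration, as far as I know them, and checks requests against it locally. The origin is a placeholder, and `is_allowed` is a simplified hypothetical check that ignores wildcard patterns inside origins.

```python
# One illustrative rule: the second website may read (GET) the shared font.
cors_rules = [
    {
        "AllowedOrigins": ["https://www.second-site.example"],
        "AllowedMethods": ["GET"],
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": 3000,
    }
]


def is_allowed(rules, origin: str, method: str) -> bool:
    """Simplified check: exact origin match (or "*") plus an allowed method."""
    return any(
        (origin in rule["AllowedOrigins"] or "*" in rule["AllowedOrigins"])
        and method in rule["AllowedMethods"]
        for rule in rules
    )


print(is_allowed(cors_rules, "https://www.second-site.example", "GET"))
print(is_allowed(cors_rules, "https://www.second-site.example", "PUT"))
print(is_allowed(cors_rules, "https://unknown.example", "GET"))
```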
77
What is strong consistency?
Strong read-after-write consistency: after a successful write, overwrite, or delete, any subsequent read returns the latest version of the object. You don't have to make the checks yourself or provision infrastructure to do it, which simplifies the migration of on-premises workloads. It is enabled by default.
78
Before S3 provided strong consistency by default, a separate tool was used to add consistency control. What is its name?
S3Guard (a feature of the Apache Hadoop S3A connector, no longer needed now that S3 is strongly consistent)
79
S3 default security configuration
Objects are private and protected by default; encryption is configured by default; default encryption uses S3 managed keys (SSE-S3)
80
When sharing S3 access
Manage and control the access. Use the principle of least privilege.
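
Least privilege usually translates into a policy that grants one action on one narrow resource. The dictionary below sketches a read-only IAM policy document; the bucket name, the `reports/` prefix, and the statement ID are made-up placeholders, and the policy is only built and printed, not attached to anything.

```python
import json

# Illustrative policy: allow reading objects under reports/ and nothing else.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReportsOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
        }
    ],
}

print(json.dumps(read_only_policy, indent=2))
```

Anything not explicitly allowed, such as uploading or deleting, stays denied for the principal that receives this policy.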
81
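Are new S3 objects encrypted in transit?
Not by the default encryption, which protects objects at rest. Data in transit is protected with HTTPS (TLS) or client-side encryption.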
82
Default S3 encryption
S3 Managed Keys (SSE-S3)
83
Can I use an encryption option other than SSE-S3?
Yes. Use server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS), dual-layer server-side encryption with AWS KMS keys (DSSE-KMS), or server-side encryption with customer-provided keys (SSE-C).
84
Can I protect data in transit?
Yes. Use HTTPS (TLS) connections, or client-side encryption, where the data is encrypted before being transferred to S3.
85
Tools for protecting buckets and objects
Block Public Access
IAM policies
ACLs (access control lists)
S3 Access Points
Presigned URLs (time-limited URLs)
AWS Trusted Advisor (provides a bucket permission check)
86
ACL vs IAM
ACLs predate IAM. Prefer IAM, or be extra mindful of your ACL setup.
87
Region choice for storage
Data privacy laws and compliance; proximity of users; service availability in the Region; cost-effectiveness
88
What is S3 inventory for?
Helps manage storage. Use it to audit and report. You can set up daily or weekly reports and export them in different file formats (CSV, ORC, ...)
89
Can I query S3 Inventory through a DBS?
Yes, with Athena or Redshift, for example, but also with other tools.
90
Default pricing of S3
Pay for what you use:
Storage: per GB of objects stored per month, with different pricing by Region and storage class
Operations: PUT, COPY, POST, and LIST requests, and lifecycle transitions
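
The two components, storage and operations, can be combined into a rough monthly estimate. The sketch below uses placeholder prices that are NOT real AWS prices; actual rates vary by Region and storage class and should be taken from the pricing page. The function name and parameters are hypothetical.

```python
def estimate_monthly_cost(
    stored_gb: float,
    request_count: int,
    price_per_gb_month: float,
    price_per_1000_requests: float,
) -> float:
    """Rough monthly estimate: storage charge plus request charge."""
    storage_charge = stored_gb * price_per_gb_month
    request_charge = (request_count / 1000) * price_per_1000_requests
    return storage_charge + request_charge


# Placeholder prices, chosen only to make the arithmetic easy to follow.
estimate = estimate_monthly_cost(
    stored_gb=500,
    request_count=100_000,
    price_per_gb_month=0.02,
    price_per_1000_requests=0.005,
)
print(round(estimate, 2))
```

With these placeholder values, storage accounts for most of the total, which is typical when the request volume is modest.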
91
S3 does not charge for data transfer:
1) Out to the internet, for up to 100 GB a month
2) In from the internet
3) Between S3 buckets in the same Region
4) From an S3 bucket to any AWS service within the same Region
5) Out to CloudFront
92
Additional cost for intelligent tiering
Monthly monitoring and automation charge for each object
93
S3 cost depends on
Object size, storage duration, storage class
94
What are ingest charges?
Costs associated with PUT, COPY, POST, or LIST requests, plus lifecycle operations
95
Encryption fees in S3
No fees for standard SSE-S3 or SSE-C. You pay for encryption when using AWS KMS. DSSE-KMS includes further charges for the second encryption layer.
96
Free tier in S3
GB of storage; 20,000 GET requests; 2,000 PUT, COPY, POST, or LIST requests; 100 GB of data transfer each month
97
Well-Architected best practices, Security pillar for S3
Enforce encryption at rest. Enforce access control.
98
Well-Architected best practices, Performance Efficiency pillar for S3
Learn about and understand available cloud services and features. Factor cost into architectural decisions.
99
Well-Architected best practices, Cost Optimization pillar for S3
Perform cost analysis for different usage over time
100
Well-Architected best practices, Reliability pillar for S3
Select the appropriate location, and use multi-location deployment if appropriate