AWS Data Storage Flashcards

1
Q

AWS DataSync

A
  • Fast and simple way to sync existing file systems into
    • Amazon EFS
    • Amazon S3
    • Amazon FSx for Windows File Server
  • Over the internet or Direct Connect
2
Q

AWS Athena

A
  • Queries data in S3 using SQL
  • Can be connected to other data sources using Lambda
3
Q

AWS Glue

A
  • Fully managed ETL service
  • Used to prepare data for analytics
4
Q

Transfer Acceleration

A
  • Used for S3
  • Leverages Amazon CloudFront edge locations
  • Delivers fast, easy, and secure transfer of files
  • Used to accelerate transfers over long distances
5
Q

S3

A
  • Object storage with unlimited space
  • Universal namespace (bucket names are globally unique)
  • Buckets are created in a Region
  • Objects from 0 bytes up to 5 TB
6
Q

S3 as a Static Website

A
  • Website endpoints accept HTTP connections only (no HTTPS)
7
Q

Kinesis Data Firehose

A
  • Captures, Transforms, and loads streaming data
  • Data sent to it by producers
  • Data is sent to other AWS Services from it
  • Data can be transformed by Lambda
  • Enables near real time analytics using BI tools
8
Q

FSx for Lustre File System Options

A
  • Scratch
    • Temp Storage, data not replicated, high burst rate
  • Persistent
    • Long term, data replicated
9
Q

Storage Gateway - File Gateway

A
  • Virtual On-Prem file server
  • Stores and retrieves files in S3
  • Used with on-prem apps that need file storage in S3
  • Used with EC2 apps that need file storage in S3
  • SMB or NFS
10
Q

EFS - Elastic File System

A
  • Fully managed NFS file system
  • Mounts in one or many AZs
  • Uses VPN or Direct Connect for On-prem Mounts
  • Data stored across many AZs
  • Scales to PBs
11
Q

FSx for Lustre

A
  • High performance File System for Fast processing
  • ML, HPC, Video, Financial Models
  • Unix based
  • Integrates natively with S3
  • Objects are presented as files in the FS
12
Q

Kinesis Data Analytics

A
  • Real-time SQL processing for streaming data from
    • Kinesis Data Streams
    • Kinesis Data Firehose
  • Sends to
    • Kinesis Data Streams
    • Kinesis Data Firehose
    • Lambda
13
Q

Kinesis Data Streams

A
  • Real time processing of streaming data
  • Rapidly moves data off producers
  • Stores data for 24 hours by default; retention can be extended up to 365 days
  • Stores data for later processing
14
Q

Amazon EMR

A
  • Managed service for Hadoop or Spark
  • Commonly used for log analysis, financial analysis
  • Used to ETL data for big data
15
Q

Data Lifecycle Manager

A
  • Automates the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs
  • Enforces a regular backup schedule
  • Creates standardized AMIs
  • Retains backups for audit and compliance
16
Q

Storage Gateway - Volume Gateway

A
  • virtual appliance for block based storage
  • Cached mode - Stored on S3 with cache of frequent data on site
  • Stored mode - Stored on Site with async backup to S3
17
Q

Storage Gateway

A
  • Virtual appliance / machine on prem
  • Enables hybrid storage between onprem and aws
  • low latency with data cached on prem
  • data stored securely and durably in AWS
  • Local storage backed by S3 and Glacier
  • Cloud migrations and DR prep
18
Q

Storage Gateway - Tape Gateway

A
  • Virtual appliance in support of Tape/VTL storage
  • Works with backup software such as NetBackup, Backup Exec, Veeam
19
Q

S3 Multi-part Upload

A
  • Recommended for objects over 100 MB
  • Can be used for objects from 5 MB up to 5 TB
  • Improves throughput by uploading parts in parallel
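As a rough illustration of the sizes on this card, the sketch below picks a part size for a multipart upload. The 5 MB minimum part size, 5 GB maximum part size, and 10,000-part cap are S3's published multipart limits; the function name `plan_parts` and the 100 MB default part size are assumptions made for this example only.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3
MIN_PART = 5 * MB     # smallest allowed part (the last part may be smaller)
MAX_PART = 5 * GB     # largest allowed part
MAX_PARTS = 10_000    # maximum number of parts in one upload


def plan_parts(object_size, preferred_part=100 * MB):
    """Return (part_size, part_count) for an object of object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when the preferred size would need more than 10,000 parts.
    part_size = max(preferred_part, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object is too large for a single multipart upload")
    return part_size, math.ceil(object_size / part_size)
```

For a 1 GB object this keeps the 100 MB part size; for a 5 TB object the part size has to grow so the upload stays within 10,000 parts.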
20
Q

S3 buckets

A
  • Private by default
  • Object ACLs can make individual objects public
  • Bucket policies can make the entire bucket public
21
Q

S3 - Encryption - SSE

A
  • Server-side encryption with S3-managed keys (SSE-S3)
  • AWS encrypts the objects on our behalf
  • AWS manages the keys
  • Keys are stored in AWS
22
Q

S3 - Encryption - SSE - KMS

A
  • Server side encryption with KMS
  • Uses KMS system
  • Keys are stored in AWS
23
Q

S3 - Encryption - SSE - C

A
  • Server-side encryption with customer-provided keys
  • Customer handles the keys
  • Not stored in AWS
24
Q

S3 - Encryption - Client side

A
  • Encryption before uploading the object
25
S3 Performance
- Use multi-part on upload - multiple threads
- Use folders (prefixes) - multiple consumers
- Use byte-range fetches - split the file
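The byte-range idea on this card can be sketched as a helper that splits an object into HTTP `Range` header values, which parallel workers could then fetch independently. The function name `byte_ranges` is hypothetical; only the `bytes=start-end` header format (with an inclusive end) is standard.

```python
def byte_ranges(object_size, chunk_size):
    """Yield HTTP Range header values that cover object_size bytes."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")
    for start in range(0, object_size, chunk_size):
        # Range ends are inclusive, so subtract one from the exclusive end.
        end = min(start + chunk_size, object_size) - 1
        yield f"bytes={start}-{end}"
```

Each yielded value would be passed as the `Range` of a separate GET request, and the pieces reassembled in order.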
26
S3 Replication
- Delete markers are not replicated by default
- Existing objects are not replicated
- Versioning must be enabled
27
Snowball
- 80TBs
28
Snowball Edge
- 100TBs with compute
29
Snowmobile
- 100 PBs
30
Can Fargate connect to FSx for Lustre?
No... must use EFS in this case.
31
Can a Linux instance connect to FSx for Windows File Server?
Yes... using the cifs-utils package, Linux can mount an SMB/CIFS share
32
RAID used for I/O ... allows for increased IOPS
RAID 0 - Striped
33
RAID used for durability
RAID 1 - Mirrored
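The trade-off between the two RAID cards above can be put into numbers. This is an idealized sketch that ignores controller and network overhead; the function name `raid_summary` and its fields are made up for illustration.

```python
def raid_summary(level, disks, size_gib, iops_per_disk):
    """Idealized capacity and write IOPS for RAID 0 or RAID 1 over identical disks."""
    if disks < 2:
        raise ValueError("RAID 0 and RAID 1 need at least two disks")
    if level == 0:
        # Striping: capacity and IOPS add up, but losing one disk loses the array.
        return {"usable_gib": disks * size_gib,
                "write_iops": disks * iops_per_disk,
                "survives_disk_loss": False}
    if level == 1:
        # Mirroring: every disk holds a full copy, so capacity and write IOPS stay flat.
        return {"usable_gib": size_gib,
                "write_iops": iops_per_disk,
                "survives_disk_loss": True}
    raise ValueError("only RAID 0 and RAID 1 are modeled here")
```

Two 500 GiB volumes at 3,000 IOPS each give roughly 1,000 GiB and 6,000 IOPS in RAID 0, versus 500 GiB and 3,000 IOPS with redundancy in RAID 1.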
34
What is the magic number to use io1 vs gp2?
16,000 IOPS (the gp2 maximum; above that, use io1)
35
What NAS works with MS Active Directory?
FSx for Windows File Server
36
Does FSx support multi-AZ
Yes
37
EFS Storage Classes
- Standard - multiple AZs
- Standard-IA
- One Zone - redundant within a single AZ - 47% less
- One Zone-IA
38
EFS LifeCycle parameter used
age-off policy for files
39
Used to move files automatically between EFS storage classes based on access
EFS Intelligent-Tiering
40
How can you restrict direct access to an S3 bucket so that only a website can access the data?
Use a bucket policy that allows requests referred from the website URL (an aws:Referer condition)... Should not hardcode the IPs of the EC2 instances running the website
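A minimal sketch of the kind of bucket policy this card describes, built as a Python dict. The bucket name and site URL are placeholders, and `referer_policy` is a hypothetical helper; `aws:Referer` is the IAM condition key for the HTTP Referer header. The header can be spoofed by clients, so this is a soft restriction rather than real authentication.

```python
import json


def referer_policy(bucket, site_url):
    """Build a bucket policy allowing GetObject only for requests referred by site_url."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGetFromWebsiteReferer",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Match any page under the site; placeholder values, not real resources.
            "Condition": {"StringLike": {"aws:Referer": [f"{site_url}/*"]}},
        }],
    }


if __name__ == "__main__":
    print(json.dumps(referer_policy("example-bucket", "https://www.example.com"), indent=2))
```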
41
How can you store a backup of an EBS volume on S3?
Take a snapshot... snapshots are stored in S3
42
Encryption is supported on all ebs volume types? True or false
True
43
Can you have both encrypted and non-encrypted volumes on an instance
Yes
44
S3 object lock mode where users can't overwrite or delete an object version or alter its lock settings unless they have special permissions.
Governance mode
45
S3 object lock mode where a protected object version can't be overwritten or deleted by any user, including the root user in your AWS account.
Compliance mode
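The difference between the two Object Lock modes above can be modeled as a small decision function. This is a simplified sketch, not S3's actual authorization logic; `can_delete_version` and its parameters are hypothetical, and `has_bypass_permission` stands in for a caller holding `s3:BypassGovernanceRetention`.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now, has_bypass_permission=False):
    """Return True if a locked object version may be deleted at time `now`."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    if now >= retain_until:
        return True  # retention period is over in either mode
    if mode == "GOVERNANCE":
        return has_bypass_permission  # special permission can override
    return False  # compliance mode: nobody, not even root, can delete early
```

In governance mode a privileged user can still remove the version early; in compliance mode the only way out is for the retention date to pass.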
46
allows you to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
S3 Glacier vault lock
47
S3 Object Lock is enabled when?
Originally only at bucket creation; AWS has since added support for enabling it on existing buckets that have versioning turned on
48
Can you use S3 Object Lock and lifecycle policies?
Yes. S3 Object Lock protection is maintained regardless of which storage class the object resides in and throughout S3 Lifecycle transitions between storage classes.
49
S3 Glacier Deep Archive minimum storage duration period
180 days
50
S3 Glacier Flexible Retrieval minimum storage duration period
90 days
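To make the minimum storage duration concrete: deleting an object early is still billed as if it had stayed for the full minimum. The sketch below uses the 180-day and 90-day figures from these cards, plus the "no minimum" for S3 Standard noted elsewhere in this deck; the names `MIN_STORAGE_DAYS` and `billable_days` are illustrative only.

```python
# Minimum storage durations in days, as listed on the cards in this deck.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "GLACIER": 90,        # S3 Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,  # S3 Glacier Deep Archive
}


def billable_days(storage_class, days_stored):
    """Days of storage charged for an object deleted after days_stored days."""
    if days_stored < 0:
        raise ValueError("days_stored must be non-negative")
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])
```

An object removed from Glacier Flexible Retrieval after 30 days is therefore charged for 90 days under this model.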
51
S3 Transitions
- S3 Standard storage class -> any other storage class.
- Any storage class -> S3 Glacier or S3 Glacier Deep Archive storage classes.
- S3 Standard-IA -> S3 Intelligent-Tiering or S3 One Zone-IA.
- S3 Intelligent-Tiering storage class -> S3 One Zone-IA storage class.
- S3 Glacier storage class -> S3 Glacier Deep Archive storage class.
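The transition rules on this card form a one-way "waterfall", which can be encoded as a lookup table. This sketch covers only the classes the card names; the identifiers and the helper `can_transition` are assumptions for illustration, not an AWS API.

```python
ARCHIVE = {"GLACIER", "DEEP_ARCHIVE"}

# Allowed lifecycle transitions, transcribed from the card above.
ALLOWED = {
    "STANDARD": {"STANDARD_IA", "INTELLIGENT_TIERING", "ONEZONE_IA"} | ARCHIVE,
    "STANDARD_IA": {"INTELLIGENT_TIERING", "ONEZONE_IA"} | ARCHIVE,
    "INTELLIGENT_TIERING": {"ONEZONE_IA"} | ARCHIVE,
    "ONEZONE_IA": set(ARCHIVE),
    "GLACIER": {"DEEP_ARCHIVE"},
    "DEEP_ARCHIVE": set(),  # end of the waterfall
}


def can_transition(src, dst):
    """Return True if a lifecycle rule may move objects from src to dst."""
    return dst in ALLOWED.get(src, set())
```

Lifecycle rules only move objects down the waterfall, so a check such as `can_transition("GLACIER", "STANDARD")` comes back false.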
52
Object size that requires multi-part upload?
Over 5 GB
53
S3 Object size multi-part upload recommended
100 MB or larger
54
Can you use S3 Object lock with Glacier?
Yes. S3 Object Lock is a new feature that prevents data from being deleted during a customer-defined retention period. You can use Object Lock with any S3 storage class, including S3 Glacier.
55
What can you use for added security for EFS connections?
Add a rule to the mount target security group to allow inbound access from the EC2 security group
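The rule described on this card can be written down as data. The sketch below builds an ingress entry for NFS (TCP port 2049) that admits traffic only from another security group; it is shaped like an entry in boto3's `IpPermissions` list, but the helper `nfs_ingress_rule` and the group ID are placeholders, and nothing here calls AWS.

```python
NFS_PORT = 2049  # EFS mount targets listen on the standard NFS port


def nfs_ingress_rule(source_sg_id):
    """Ingress rule allowing NFS traffic from instances in source_sg_id only."""
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        # Reference the EC2 instances' security group instead of IP ranges.
        "UserIdGroupPairs": [{"GroupId": source_sg_id}],
    }
```

Referencing the EC2 security group rather than CIDR blocks keeps the rule valid as instances are replaced or scaled.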
56
Uses simple SQL expressions to query S3 for analysis
Amazon S3 Select - Select is a lightweight solution designed to let you use SQL to perform simple SELECT clauses on a maximum of one file. Amazon Athena is an analytics workhorse that allows you to perform SQL on extremely large datasets spanning many files with great performance
57
Can S3 Transfer Acceleration be used for Downloads (GETs)?
Yes... the accelerate endpoint handles both uploads (PUTs) and downloads (GETs)
58
How to restrict S3 access to folders by folder name
Create IAM policies for folder level permissions | Create Groups and attach the policies
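A folder-level policy of the kind this card mentions usually has two parts: listing limited to the folder's prefix, and object access limited to keys under it. The sketch below assembles such a policy as a dict; the bucket and folder names are placeholders and `folder_policy` is a hypothetical helper. `s3:prefix` is the condition key that applies to `s3:ListBucket`.

```python
def folder_policy(bucket, folder):
    """IAM policy granting list/read/write access to one folder (prefix) in a bucket."""
    prefix = folder.strip("/") + "/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing is a bucket-level action, narrowed by prefix.
                "Sid": "ListFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {   # Object actions are narrowed by the resource ARN itself.
                "Sid": "ReadWriteFolder",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }
```

One such policy per folder can then be attached to the matching IAM group.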
59
A storage solution that can scale as data volumes increase with the LEAST amount of management and configuration for EC2 instances...
EFS... it grows and shrinks automatically, while EBS volumes must be sized and resized manually.
60
S3 is strongly consistent for all GET, PUT, and LIST operations
True
61
EFS mode for high frequency reads and writes
Provisioned throughput mode
62
EFS mode recommended for most applications
Bursting throughput mode
63
EFS mode that scales throughput based on spikes
Bursting
64
Which S3 storage class supports encryption by default for both data at rest and in flight?
S3 Glacier
65
allow you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archives (250 MB+), data accessed using Expedited retrievals are typically made available within 1–5 minutes. Provisioned Capacity ensures that retrieval capacity for Expedited retrievals is available when you need it.
Glacier Expedited retrievals
66
allow you to access any of your archives within several hours. They typically complete within 3–5 hours. This is the default option for retrieval requests that do not specify the retrieval option.
Glacier Standard retrievals
67
are S3 Glacier’s lowest-cost retrieval option, which you can use to retrieve large amounts, even petabytes, of data inexpensively in a day. They typically complete within 5–12 hours.
Glacier Bulk retrievals
68
Can an EFS file system be accessed from another Region?
Yes via inter-region vpc peering
69
io2 Block Express up to ____ IOPS
256,000
70
Are there transfer charges for data sent into S3 from the internet?
No
71
S3 standard has what min duration charge
None
72
EFS performance modes
General Purpose (default) and Max I/O
73
S3 glacier flexible retrieval times
Mins to hours
74
S3 glacier instant retrieval times
Milliseconds
75
S3 glacier deep archive retrieval times
Hours