DynamoDB Flashcards

Question

What is a DDB stream?

Answer 1

An ordered list of item-level modifications in a table

Answer 2

- Send to Kinesis Data Streams
- Read with Lambda
- Read with Kinesis Client Library applications

Answer 3

- React to changes in real time
- Analytics
- Insert into derivative tables
- Insert into Elasticsearch
- Implement cross-region replication

Answer 4

- KEYS_ONLY: only the key attributes of the modified item
- NEW_IMAGE: the entire item after modification
- OLD_IMAGE: the entire item before modification
- NEW_AND_OLD_IMAGES: both the new and the old images of the item
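To see how the view type shapes what a consumer receives, here is a minimal sketch that unpacks one stream record. The `sample` record and the `user_id` / `score` attributes are made up for illustration; the field names (`eventName`, `Keys`, `OldImage`, `NewImage`) follow the stream record format.

```python
def summarize_record(record):
    """Return (event_name, keys, old_image, new_image) for one stream record."""
    change = record["dynamodb"]
    return (
        record["eventName"],     # INSERT, MODIFY, or REMOVE
        change["Keys"],          # key attributes of the changed item
        change.get("OldImage"),  # absent unless the view type includes old images
        change.get("NewImage"),  # absent unless the view type includes new images
    )


# Hypothetical MODIFY record from a stream using NEW_AND_OLD_IMAGES.
sample = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"user_id": {"S": "u1"}},
        "OldImage": {"user_id": {"S": "u1"}, "score": {"N": "1"}},
        "NewImage": {"user_id": {"S": "u1"}, "score": {"N": "2"}},
    },
}

event_name, keys, old_image, new_image = summarize_record(sample)
print(event_name, keys, old_image, new_image)
```

With KEYS_ONLY, the same function would return `None` for both images.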

Answer 5

Records are not retroactively populated in a stream after enabling it.

Answer 6

- Doesn't consume WCUs
- Expired items are typically deleted within 48 hours of expiration, so they may still appear in reads unless explicitly filtered out
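Because expired items can linger until they are deleted, a reader may want to filter them out itself. A minimal sketch, assuming items are already plain Python dicts and the TTL attribute is a hypothetical `expires_at` holding epoch seconds:

```python
import time


def drop_expired(items, ttl_attribute="expires_at", now=None):
    """Keep items that have no TTL attribute or whose TTL is still in the future."""
    now = time.time() if now is None else now
    return [
        item for item in items
        if ttl_attribute not in item or item[ttl_attribute] > now
    ]


# Hypothetical session items; the middle one expired at t=100.
sessions = [
    {"id": "a", "expires_at": 200},
    {"id": "b", "expires_at": 100},
    {"id": "c"},  # no TTL attribute, so it never expires
]
live = drop_expired(sessions, now=150)
print([item["id"] for item in live])
```

The same condition can also be pushed to the server side with a filter expression on the TTL attribute.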

Answer 7

- Reduce stored data by keeping only current items
- Adhere to regulatory obligations

Answer 8

--projection-expression

Answer 9

--filter-expression

Answer 10

--page-size: still retrieves the full list of items, but over more API calls, each returning fewer items (page size is the max number of items per API call)
--max-items: max number of items to show in the CLI output (returns a NextToken)
--starting-token: pass the last NextToken to retrieve the next set of items
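The interplay of the three options can be sketched with a toy in-memory model. This is not the CLI's implementation: `paginate` is a hypothetical helper, and an integer offset stands in for the opaque NextToken string.

```python
def paginate(items, page_size, max_items=None, starting_token=0):
    """Toy model: return (shown_items, api_calls, next_token)."""
    # max_items caps how many items are shown, counted from the starting token.
    stop = len(items) if max_items is None else min(len(items), starting_token + max_items)
    shown, api_calls = [], 0
    for offset in range(starting_token, stop, page_size):
        # Each loop iteration models one API call returning at most page_size items.
        shown.extend(items[offset:min(offset + page_size, stop)])
        api_calls += 1
    # A token is returned only when items remain beyond what was shown.
    next_token = stop if stop < len(items) else None
    return shown, api_calls, next_token


table = list(range(10))
print(paginate(table, page_size=3))                                # all items, several calls
print(paginate(table, page_size=3, max_items=4))                   # first 4 items plus a token
print(paginate(table, page_size=3, max_items=4, starting_token=4)) # resume from the token
```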

Answer 11

Coordinated, all-or-nothing operations on multiple items across one or more tables
- If one read/write fails, all operations are rolled back
- Consumes 2x WCUs and RCUs (DDB performs 2 operations for each item: prepare and commit)
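The 2x cost can be worked through with a small calculator. This is a sketch assuming the standard units of 1 WCU per 1 KB written and 1 RCU per 4 KB for a strongly consistent read; the function names are hypothetical.

```python
import math


def write_capacity_units(item_size_bytes, transactional=False):
    """1 WCU per 1 KB (rounded up); transactional writes cost double."""
    units = max(1, math.ceil(item_size_bytes / 1024))
    return units * 2 if transactional else units


def read_capacity_units(item_size_bytes, transactional=False):
    """1 RCU per 4 KB strongly consistent read (rounded up); transactional reads cost double."""
    units = max(1, math.ceil(item_size_bytes / 4096))
    return units * 2 if transactional else units


# A 2,500-byte item rounds up to 3 KB for writes and 4 KB for reads.
print(write_capacity_units(2500), write_capacity_units(2500, transactional=True))
print(read_capacity_units(2500), read_capacity_units(2500, transactional=True))
```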

Answer 12

- TransactGetItems: one or more GetItem operations
- TransactWriteItems: one or more PutItem, UpdateItem, DeleteItem operations
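As an illustration, a TransactWriteItems request is a list of per-item actions that succeed or fail together. The sketch below only builds the request payload; the `Accounts` table, the `account_id` key, and the `balance` attribute are hypothetical.

```python
def build_transfer(source_id, destination_id, amount, table_name="Accounts"):
    """Build a TransactWriteItems payload that moves `amount` between two items."""
    values = {":amt": {"N": str(amount)}}
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": table_name,
                    "Key": {"account_id": {"S": source_id}},
                    "UpdateExpression": "SET balance = balance - :amt",
                    # If this condition fails, the whole transaction is rolled back.
                    "ConditionExpression": "balance >= :amt",
                    "ExpressionAttributeValues": values,
                }
            },
            {
                "Update": {
                    "TableName": table_name,
                    "Key": {"account_id": {"S": destination_id}},
                    "UpdateExpression": "SET balance = balance + :amt",
                    "ExpressionAttributeValues": values,
                }
            },
        ]
    }


request = build_transfer("acct-1", "acct-2", 10)
print(len(request["TransactItems"]), "actions in one transaction")
```

With boto3, a payload of this shape could be passed as `client.transact_write_items(**request)`.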

Answer 13

- ElastiCache and DDB are both key/value stores; ElastiCache is in-memory, DDB is serverless (and so offers options such as auto-scaling)
- EFS is a file system, whereas DDB is a database; EFS must be mounted as a network file system (on EC2 instances, or on Lambda functions attached to a VPC)
- EBS & Instance Store can only be used for local caching, not shared caching
- S3 has higher latency and is not meant for small objects

Answer 14

Add a suffix to the partition key value
- Random or calculated suffix
- This will distribute items evenly across partitions
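Both suffix strategies can be sketched in a few lines. The shard count of 10, the `#` separator, and the helper names are assumptions chosen for illustration.

```python
import hashlib
import random

SHARD_COUNT = 10  # assumed number of suffixes


def random_suffix_key(base_key, shards=SHARD_COUNT):
    """Random suffix: spreads writes, but reads must query every shard."""
    return f"{base_key}#{random.randrange(shards)}"


def calculated_suffix_key(base_key, attribute, shards=SHARD_COUNT):
    """Calculated suffix: derived from another attribute, so it can be recomputed on read."""
    digest = hashlib.sha256(attribute.encode("utf-8")).digest()
    return f"{base_key}#{int.from_bytes(digest[:4], 'big') % shards}"


def all_shard_keys(base_key, shards=SHARD_COUNT):
    """Every partition key value a reader has to query to gather all items."""
    return [f"{base_key}#{n}" for n in range(shards)]


print(random_suffix_key("2024-01-01"))
print(calculated_suffix_key("2024-01-01", "order-42"))
print(all_shard_keys("2024-01-01", shards=3))
```

A calculated suffix lets a reader that knows the attribute go straight to the right shard, whereas a random suffix requires reading all shards and merging the results.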

Answer 15

Concurrent writes occur when two or more writes hit the same item at once and overwrite each other
- Use conditional writes (update value = 1 only if value equals 0; known as Optimistic Locking)
- Use atomic writes (increase the value by 1)
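The difference between the two approaches can be sketched with an in-memory dict standing in for a table. `ConditionFailed`, `conditional_write`, and `atomic_increment` are hypothetical names, not DynamoDB APIs.

```python
class ConditionFailed(Exception):
    """Raised when the stored value no longer matches what the writer expected."""


def conditional_write(table, key, new_value, expected_value):
    """Optimistic locking: write only if the value is still what the writer last read."""
    if table[key] != expected_value:
        raise ConditionFailed(f"{key}: expected {expected_value}, found {table[key]}")
    table[key] = new_value


def atomic_increment(table, key, amount=1):
    """Atomic write: apply a relative change, so concurrent updates all count."""
    table[key] += amount


table = {"value": 0}

# Two writers both read value == 0; only the first conditional write succeeds.
conditional_write(table, "value", 1, expected_value=0)
try:
    conditional_write(table, "value", 2, expected_value=0)
except ConditionFailed as error:
    print("second writer rejected:", error)

# Atomic increments never conflict: each one is applied on top of the current value.
atomic_increment(table, "value")
atomic_increment(table, "value")
print(table)
```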

Answer 16

Can't store items larger than 400 KB in DDB
- Store the object in S3, then capture the S3 metadata (including URL) in DDB
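A minimal sketch of the pointer pattern: check the payload against the 400 KB limit and build the metadata item that would go into DDB. The bucket name, object key, and attribute names are hypothetical, and the size check ignores attribute-name overhead.

```python
import hashlib

MAX_ITEM_BYTES = 400 * 1024  # DynamoDB item size limit


def needs_s3(payload):
    """True when the payload alone already exceeds the item size limit."""
    return len(payload) > MAX_ITEM_BYTES


def build_metadata_item(bucket, object_key, payload):
    """Metadata to store in DDB once the payload itself has been uploaded to S3."""
    return {
        "object_key": object_key,
        "s3_url": f"s3://{bucket}/{object_key}",
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


video = b"x" * 500_000  # stand-in for a large object
if needs_s3(video):
    item = build_metadata_item("example-bucket", "videos/a.mp4", video)
    print(item["s3_url"], item["size_bytes"])
```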

Answer 17

- Upload to S3
- Invoke Lambda function to store the metadata of the object within DDB
- Query DDB for the specific information around the object without having to access S3

Answer 18

- Drop Table and Recreate: fast, efficient and cheap
- Scan & DeleteItem: very slow, consumes RCU and WCU, expensive

Answer 19

- Use AWS Data Pipeline (this will launch an EMR cluster to read the table and write it into S3, and then read from S3 and write to a new table)
- Backup and restore into a new table
- Scan & PutItem or BatchWriteItem (can write own code to do some transformations as well)

Answer 20

- VPC endpoints available to access DDB without using the internet
- Access fully controlled by IAM
- Encryption at rest using AWS KMS and in-transit using SSL/TLS

Answer 21

For multi-region, multi-active, fully replicated, high performance tables.

Answer 22

AWS Database Migration Service (DMS)

Answer 23

Do not give the user direct IAM access to the table
- Use an identity provider (Cognito, Google, SAML etc.) to provide temporary credentials and a restricted IAM role
- Restrict the role to only the data the user owns: the LeadingKeys condition limits row-level access based on the Primary Key (e.g. the Cognito user ID), and the Attributes condition limits access to specific attributes
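A restricted role policy along these lines can be sketched as a Python dict. The table ARN, the attribute names, and the choice of read-only actions are assumptions for illustration; the condition keys `dynamodb:LeadingKeys` and `dynamodb:Attributes` are the ones that enforce row-level and attribute-level limits.

```python
import json


def fine_grained_policy(table_arn, allowed_attributes):
    """Read-only policy limited to the caller's own partition key and listed attributes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # Partition key must equal the caller's Cognito identity ID.
                        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"],
                        # Only these attributes may be requested.
                        "dynamodb:Attributes": list(allowed_attributes),
                    },
                    # Force callers to name attributes instead of reading whole items.
                    "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
                },
            }
        ],
    }


policy = fine_grained_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/UserData",  # hypothetical table
    ["user_id", "display_name"],
)
print(json.dumps(policy, indent=2))
```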

Answer 24

Parallel scans
- Default behaviour is sequential, as a Scan operation can only read from one partition at a time
- To speed this up, the Scan operation can logically divide a table or secondary index into multiple segments, with multiple application workers scanning the segments in parallel
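The segment split can be sketched as one Scan request per worker, each carrying its own `Segment` index and the shared `TotalSegments`. `fake_scan` is a stand-in for a real Scan call and `ExampleTable` is a hypothetical table name.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_requests(table_name, total_segments):
    """One Scan request per segment; segments are numbered 0..total_segments-1."""
    return [
        {"TableName": table_name, "Segment": segment, "TotalSegments": total_segments}
        for segment in range(total_segments)
    ]


def fake_scan(request):
    """Stand-in for a real Scan call: returns one placeholder item per segment."""
    return [f"item-from-segment-{request['Segment']}"]


requests = segment_requests("ExampleTable", total_segments=4)

# Each worker scans its own segment; results are merged afterwards.
with ThreadPoolExecutor(max_workers=4) as pool:
    pages = list(pool.map(fake_scan, requests))

items = [item for page in pages for item in page]
print(items)
```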

(49 cards)