Amazon DynamoDB | Amazon DynamoDB Accelerator (DAX) Flashcards

1
Q

If I query for an expired item, does it use up my read capacity?

Amazon DynamoDB Accelerator (DAX)

Amazon DynamoDB | Database

A

Yes. This behavior is the same as when you query for an item that does not exist in the table.

2
Q

What is DynamoDB Accelerator (DAX)?

A

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that enables you to benefit from fast in-memory performance for demanding applications. DAX improves the performance of read-intensive DynamoDB workloads so repeat reads of cached data can be served immediately with extremely low latency, without needing to be re-queried from DynamoDB. DAX will automatically retrieve data from DynamoDB tables upon a cache miss. Writes are designated as write-through (data is written to DynamoDB first and then updated in the DAX cache).

Just like DynamoDB, DAX is fault-tolerant and scalable. A DAX cluster has a primary node and zero or more read-replica nodes. Upon a failure for a primary node, DAX will automatically fail over and elect a new primary. For scaling, you may add or remove read replicas.

To get started, create a DAX cluster, download a DAX SDK such as the one for Java or Node.js (compatible with the DynamoDB APIs), rebuild your application to use the DAX client instead of the DynamoDB client, and point the DAX client to the DAX cluster endpoint. You do not need to add caching logic to your application, because the DAX client implements the same API calls as DynamoDB.

3
Q

What does “DynamoDB-compatible” mean?

A

It means that most of the code, applications, and tools you already use today with DynamoDB can be used with DAX with little or no change. The DAX engine is designed to support the DynamoDB APIs for reading and modifying data in DynamoDB. Operations for table management such as CreateTable/DescribeTable/UpdateTable/DeleteTable are not supported.

4
Q

What is in-memory caching, and how does it help my application?

A

Caching improves application performance by storing critical pieces of data in memory for low-latency, high-throughput access. In the case of DAX, the results of DynamoDB operations are cached. When an application requests data that is stored in the cache, DAX can serve that data immediately without needing to run a query against the regular DynamoDB tables. Data leaves the cache in one of two ways: it ages out when the Time-to-Live (TTL) value specified for it elapses, or, once all available memory is exhausted, it is evicted by the Least Recently Used (LRU) algorithm.
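The two eviction paths in this answer (TTL expiry and LRU eviction when memory runs out) can be sketched with a toy in-process cache. This is an illustration only, not how DAX is implemented: the class name `TtlLruCache`, the entry-count `capacity` (standing in for memory), and the injected `clock` are all assumptions made for the example.

```python
from collections import OrderedDict


class TtlLruCache:
    """Toy cache: entries expire after ttl_seconds; the least recently
    used entry is evicted when the cache holds more than capacity items."""

    def __init__(self, capacity, ttl_seconds, clock):
        self.capacity = capacity          # entry count stands in for memory
        self.ttl_seconds = ttl_seconds
        self.clock = clock                # callable returning time in seconds
        self._entries = OrderedDict()     # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None                   # miss: never cached or evicted
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]        # TTL has elapsed
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl_seconds)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # evict least recently used


now = [0.0]                               # fake clock keeps the demo deterministic
cache = TtlLruCache(capacity=2, ttl_seconds=300, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
hit_before_expiry = cache.get("a")        # "a" becomes more recent than "b"
cache.put("c", 3)                         # cache is full, so "b" is evicted
lru_result = cache.get("b")
now[0] = 301.0                            # move past the 300-second TTL
ttl_result = cache.get("a")
```

After the run, `lru_result` and `ttl_result` are both `None`: the first because "b" was the least recently used entry when the cache filled up, the second because the TTL for "a" had elapsed.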

5
Q

What is the consistency model of DAX?

A

When reading data from DAX, users can specify whether they want the read to be eventually consistent or strongly consistent:

Eventually Consistent Reads (Default): The eventual consistency option maximizes your read throughput and minimizes latency. On a cache hit, the DAX client returns the result directly from the cache. On a cache miss, DAX queries DynamoDB, updates the cache, and returns the result set. An eventually consistent read might not reflect the results of a recently completed write. If your application requires full consistency, use strongly consistent reads.

Strongly Consistent Reads: DAX also lets you request a strongly consistent read if your application, or an element of your application, requires it. A strongly consistent read passes through DAX to DynamoDB, is not cached in DAX, and returns a result that reflects all writes that received a successful response in DynamoDB prior to the read.
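The difference between the two read modes can be sketched with a small simulation. It assumes a plain dict standing in for the DynamoDB table; `ConsistencyDemoCache` and its `consistent_read` flag are names invented for this example and are not the DAX client API.

```python
class ConsistencyDemoCache:
    """Toy model of the two read modes described above."""

    def __init__(self, table):
        self.table = table        # dict standing in for a DynamoDB table
        self.items = {}           # cached items

    def get_item(self, key, consistent_read=False):
        if consistent_read:
            # Strongly consistent: pass through, and do not cache the result.
            return self.table.get(key)
        if key in self.items:
            return self.items[key]            # cache hit
        value = self.table.get(key)           # cache miss: read the table
        if value is not None:
            self.items[key] = value
        return value


table = {"k": "v1"}
cache = ConsistencyDemoCache(table)

first_read = cache.get_item("k")              # miss, then cached as "v1"
table["k"] = "v2"                             # write that bypasses the cache
stale = cache.get_item("k")                   # eventually consistent read
strong = cache.get_item("k", consistent_read=True)
still_stale = cache.get_item("k")             # strong read did not refresh cache
```

In this run `stale` and `still_stale` are "v1" while `strong` is "v2", which mirrors the statement that an eventually consistent read might not reflect a recent write and that a strongly consistent read is not cached.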

6
Q

What are the common use cases for DAX?

A

DAX has a number of use cases that are not mutually exclusive:

Applications that require the fastest possible response times for reads. Some examples include real-time bidding, social gaming, and trading applications. DAX delivers fast, in-memory read performance for these use cases.

Applications that read a small number of items more frequently than others. For example, consider an e-commerce system that has a one-day sale on a popular product. During the sale, demand for that product (and its data in DynamoDB) would sharply increase, compared to all of the other products. To mitigate the impacts of a “hot” key and a non-uniform data distribution, you could offload the read activity to a DAX cache until the one-day sale is over.

Applications that are read-intensive, but are also cost-sensitive. With DynamoDB, you provision the number of reads per second that your application requires. If read activity increases, you can increase your table’s provisioned read throughput (at an additional cost). Alternatively, you can offload the activity from your application to a DAX cluster, and reduce the number of read capacity units you would otherwise need to purchase.

Applications that require repeated reads against a large set of data. Such an application could potentially divert database resources from other applications. For example, a long-running analysis of regional weather data could temporarily consume all of the read capacity in a DynamoDB table, which would negatively impact other applications that need to access the same data. With DAX, the weather analysis could be performed against cached data instead.

How It Works

7
Q

What does DAX manage on my behalf?

A

DAX is a fully managed cache for DynamoDB. It manages the work involved in setting up dedicated caching nodes, from provisioning the server resources to installing the DAX software. Once your DAX cache cluster is set up and running, the service automates common administrative tasks such as failure detection and recovery, and software patching. DAX provides detailed CloudWatch monitoring metrics associated with your cluster, enabling you to diagnose and react to issues quickly. Using these metrics, you can set up thresholds to receive CloudWatch alarms. DAX handles all of the data caching, retrieval, and eviction so your application does not have to. You can use the DynamoDB API to write and retrieve data, and DAX handles all of the caching logic behind the scenes to deliver improved performance.

8
Q

What kinds of data does DAX cache?

A

DAX caches the results of eventually consistent read API calls and serves them from the cache if the item is available. Strongly consistent requests are read directly from DynamoDB. Write API calls are write-through (a synchronous write to DynamoDB, after which the cache is updated upon a successful write).

The following API calls will result in examining the cache. Upon a hit, the item will be returned. Upon a miss, the request will pass through, and upon a successful retrieval the item will be cached and returned.

  • GetItem
  • BatchGetItem
  • Query
  • Scan

The following API calls are write-through operations.

  • BatchWriteItem
  • UpdateItem
  • DeleteItem
  • PutItem
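The read-through and write-through paths listed above can be sketched as follows. This is a simplified model rather than DAX itself: the backing table is a plain dict, and `WriteThroughCache` and its method names are hypothetical stand-ins that only loosely mirror GetItem, PutItem, and DeleteItem.

```python
class WriteThroughCache:
    """Toy read-through / write-through cache in front of a dict 'table'."""

    def __init__(self, table):
        self.table = table        # dict standing in for a DynamoDB table
        self.items = {}           # cached items
        self.table_reads = 0      # counts reads that reached the table

    def get_item(self, key):
        if key in self.items:
            return self.items[key]            # hit: served from the cache
        self.table_reads += 1                 # miss: pass through to the table
        value = self.table.get(key)
        if value is not None:
            self.items[key] = value           # cache on successful retrieval
        return value

    def put_item(self, key, value):
        self.table[key] = value               # write to the table first
        self.items[key] = value               # then update the cache

    def delete_item(self, key):
        self.table.pop(key, None)             # delete from the table first
        self.items.pop(key, None)             # then drop the cached copy


table = {"user#1": "Ada"}
cache = WriteThroughCache(table)

cache.get_item("user#1")                      # miss: one table read
cache.get_item("user#1")                      # hit: no table read
cache.put_item("user#2", "Grace")             # write-through
read_after_write = cache.get_item("user#2")   # hit: populated by the write
cache.delete_item("user#1")
read_after_delete = cache.get_item("user#1")  # miss again, item is gone
```

Here `table_reads` ends at 2: one for the first read of "user#1" and one for the read after the delete. The read of "user#2" never reaches the table because the write already populated the cache.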
9
Q

How does DAX handle data eviction?

A

DAX handles cache eviction in three different ways. First, it uses a Time-to-Live (TTL) value that denotes the absolute period of time that an item is available in the cache. Second, when the cache is full, a DAX cluster uses a Least Recently Used (LRU) algorithm to decide which items to evict. Third, with the write-through functionality, DAX evicts older values as new values are written through DAX. This helps keep the DAX item cache consistent with the underlying data store using a single API call.

10
Q

Does DAX work with DynamoDB GSIs and LSIs?

A

Yes. As with DynamoDB tables, DAX caches the result sets from both Query and Scan operations against DynamoDB GSIs and LSIs.

11
Q

How does DAX handle Query and Scan result sets?

A

Within a DAX cluster, there are two different caches: 1) item cache and 2) query cache. The item cache manages GetItem, PutItem, and DeleteItem requests for individual key-value pairs. The query cache manages the result sets from Scan and Query requests. In this regard, the Scan/Query text is the “key” and the result set is the “value”. While both the item cache and the query cache are managed in the same cluster (and you can specify different TTL values for each cache), they do not overlap. For example, a scan of a table does not populate the item cache, but instead records an entry in the query cache that stores the result set of the scan.

12
Q

Does an update to the item cache either update or invalidate result sets in my query cache?

A

No. The best way to mitigate inconsistencies between the item cache and result sets in the query cache is to set the query cache TTL to a period for which your application can tolerate such inconsistencies.
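The separation between the item cache and the query cache can be illustrated with a toy model. It assumes a dict as the table and uses the literal string "Scan" as the query-cache key; `TwoCacheDemo` and its methods are hypothetical names, and TTL handling is left out to keep the sketch short.

```python
class TwoCacheDemo:
    """Toy model with an item cache and a separate query cache."""

    def __init__(self, table):
        self.table = table        # dict standing in for a DynamoDB table
        self.item_cache = {}      # key -> item
        self.query_cache = {}     # request text -> result set

    def put_item(self, key, item):
        self.table[key] = item
        self.item_cache[key] = item           # query cache is left untouched

    def get_item(self, key):
        if key not in self.item_cache:
            self.item_cache[key] = self.table[key]
        return self.item_cache[key]

    def scan(self):
        request = "Scan"                      # the request text is the key
        if request not in self.query_cache:
            # Store a snapshot of the result set; the item cache is not filled.
            self.query_cache[request] = list(self.table.items())
        return self.query_cache[request]


demo = TwoCacheDemo({"p1": 10})
first_scan = demo.scan()
item_cache_after_scan = dict(demo.item_cache)  # still empty after the scan

demo.put_item("p1", 20)                        # updates table and item cache
fresh_item = demo.get_item("p1")
stale_scan = demo.scan()                       # still the cached result set
```

In this run `fresh_item` is 20 while `stale_scan` still contains `("p1", 10)`. In the real service the stale entry would age out once the query cache TTL elapses, which is why that TTL is the setting to tune.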

13
Q

Can I connect to my DAX cluster from outside of my VPC?

A

The only way to connect to your DAX cluster from outside of your VPC is through a VPN connection.

14
Q

When using DAX, what happens if my underlying DynamoDB tables are throttled?

A

If DAX is either reading from or writing to a DynamoDB table and receives a throttling exception, DAX returns the exception to the DAX client. The DAX service does not attempt server-side retries.
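Because the service does not retry on the server side, any retry has to happen in the client or in application code. The sketch below shows one common pattern, exponential backoff with jitter. `ThrottlingError` and `call_with_backoff` are hypothetical names for this example; a real DAX client raises its own exception type and may already retry on its own, which is worth checking in the SDK documentation.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling exception from the client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      sleep=time.sleep):
    """Call operation(), retrying on ThrottlingError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise                          # out of attempts: surface it
            # Wait a random time up to base_delay * 2^attempt before retrying.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


attempts = {"count": 0}


def flaky_read():
    """Fails twice with a throttling error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise ThrottlingError("table is throttled")
    return "ok"


delays = []                                    # record delays instead of sleeping
result = call_with_backoff(flaky_read, sleep=delays.append)
```

Passing `sleep=delays.append` keeps the demo instant; in application code the default `time.sleep` would be used.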

15
Q

Does DAX support pre-warming of the cache?

A

DAX uses lazy loading to populate the cache. This means that on the first read of an item, DAX fetches the item from DynamoDB and then populates the cache. While DAX does not support cache pre-warming as a feature, the DAX cache can be pre-warmed for an application by running an external script or application that reads the desired data.
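An external pre-warming script of the kind mentioned above can be as simple as reading each desired key once. The sketch assumes a generic `read_item` callable; here it is a local read-through function over dicts, whereas in practice it would wrap a read call on a DAX client. All names are illustrative.

```python
def prewarm(read_item, keys):
    """Read each key once so that later reads can be served from the cache.

    Returns the number of keys for which an item was found.
    """
    warmed = 0
    for key in keys:
        if read_item(key) is not None:
            warmed += 1
    return warmed


# Stand-ins for the table and the lazily loaded cache in front of it.
table = {"sku#1": "widget", "sku#2": "gadget"}
cache = {}


def read_through(key):
    """Lazy loading: fetch from the table on first read, then cache."""
    if key not in cache:
        value = table.get(key)
        if value is None:
            return None
        cache[key] = value
    return cache[key]


warmed_count = prewarm(read_through, ["sku#1", "sku#2", "sku#404"])
```

After the call, both existing items are in `cache` and `warmed_count` is 2; the missing key "sku#404" is skipped.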

16
Q

How does DAX work with the DynamoDB TTL feature?

A

Both DynamoDB and DAX have the concept of a “TTL” (or Time to Live) feature. In the context of DynamoDB, TTL is a feature that enables customers to age out their data by tagging the data with a particular attribute and corresponding timestamp. For example, if customers wanted data to be deleted after the data has aged for one month, they would use the DynamoDB TTL feature to accomplish this task as opposed to managing the aging workflow themselves.

In the context of DAX, TTL specifies the duration of time for which an item in the cache is valid. For instance, if the TTL is set to 5 minutes, once an item has been populated in the cache it continues to be valid and served from the cache until the 5-minute period has elapsed. Although not central to this discussion, TTL can be preempted by a write to the cache for the same item, or by LRU eviction when there is memory pressure on the DAX node and the item is the least recently used.

While TTL for DynamoDB and DAX will typically operate on very different time scales (i.e., DAX TTL in the scope of minutes or hours and DynamoDB TTL in the scope of weeks, months, or years), customers need to be aware of how these two features can affect each other. For example, imagine a scenario in which the TTL value for DynamoDB is less than the TTL value for DAX. In this scenario, an item could be cached in DAX and subsequently deleted from DynamoDB via the DynamoDB TTL feature. The result would be an inconsistent cache. While we don’t expect this scenario to happen often, as the time scales for the two features are typically orders of magnitude apart, it is good to be aware of how the two features relate to each other.
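The scenario in which the DynamoDB TTL is shorter than the DAX TTL can be walked through on a simulated timeline. Everything here is an assumption made for illustration: the table and cache are dicts, times are plain seconds, and the DynamoDB TTL deletion is simplified to happen exactly at the expiry time.

```python
DAX_TTL_SECONDS = 3600        # cache entries stay valid for one hour
DYNAMODB_EXPIRY = 600         # the item expires in the table after 10 minutes

table = {"session#1": {"expires_at": DYNAMODB_EXPIRY}}
cache = {}                    # key -> {"item": ..., "cached_until": ...}


def cached_read(key, now):
    """Serve from the cache while the cache TTL holds, else read the table."""
    entry = cache.get(key)
    if entry is not None and now < entry["cached_until"]:
        return entry["item"]
    item = table.get(key)
    if item is not None:
        cache[key] = {"item": item, "cached_until": now + DAX_TTL_SECONDS}
    else:
        cache.pop(key, None)
    return item


def dynamodb_ttl_sweep(now):
    """Simplified table-side TTL: delete every item whose expiry has passed."""
    expired = [k for k, v in table.items() if v["expires_at"] <= now]
    for key in expired:
        del table[key]


read_at_0 = cached_read("session#1", now=0)         # cached until t=3600
dynamodb_ttl_sweep(now=600)                         # item leaves the table
in_table_at_900 = "session#1" in table
read_at_900 = cached_read("session#1", now=900)     # still served from cache
read_at_3601 = cached_read("session#1", now=3601)   # cache TTL has elapsed
```

At t=900 the item is gone from the table (`in_table_at_900` is `False`) yet `read_at_900` still returns it, which is the inconsistent-cache window described above. The window closes at t=3600, after which the read returns `None`.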

17
Q

Does DAX support cross-region replication?

A

Currently DAX only supports DynamoDB tables in the same AWS region as the DAX cluster.

18
Q

Is DAX supported as a resource type in AWS CloudFormation?

A

Yes. You can create, update and delete DAX clusters, parameter groups, and subnet groups using AWS CloudFormation.

Getting Started

19
Q

How do I get started with DAX?

A

You can create a new DAX cluster through the AWS console or AWS SDK to obtain the DAX cluster endpoint. A DAX-compatible client will need to be downloaded and used in the application with the new DAX endpoint.

20
Q

How do I create a DAX cluster?

A

You can create a DAX cluster using the AWS Management Console or the DAX CLI. DAX clusters range from a 13 GiB cache (dax.r3.large) to 216 GiB (dax.r3.8xlarge) in the R3 instance types, a 15.25 GiB cache (dax.r4.large) to 488 GiB (dax.r4.16xlarge) in the R4 instance types, and 2 GiB (dax.t2.small) to 4 GiB (dax.t2.medium) for the smaller T2 instance types. With a few clicks in the console or a single API call, you can add up to 10 replicas to your cluster for increased throughput.

The single node configuration enables you to get started with DAX quickly and cost-effectively, and then scale out to a multi-node configuration as your needs grow. The multi-node configuration consists of a primary node that manages writes, and up to nine read replica nodes. The primary node is provisioned for you automatically.

Specify your preferred subnet groups and Availability Zones (optional), the number of nodes, node types, VPC subnet group, and other system settings. After you’ve chosen your desired configuration, DAX will provision the required resources and set up your caching cluster specifically for DynamoDB.

21
Q

Does all my data need to fit in memory to use DAX?

A

No. DAX uses the available memory on the node. When the memory space is exhausted, items are expunged through TTL expiry, LRU eviction, or both to make space for new data.

22
Q

Which languages does DAX support?

A

DAX provides DAX SDKs for Java, Node.js, Python, and .NET that you can download. We are actively working on adding support for additional client SDKs.

23
Q

Can I use DAX and DynamoDB at the same time?

A

Yes, you can access the DAX endpoint and DynamoDB at the same time through different clients. However, DAX cannot detect changes in data written directly to DynamoDB. Such a change reaches DAX only when a read operation populates it into the cache after the update was made directly to DynamoDB.

24
Q

Can I utilize multiple DAX clusters for the same DynamoDB table?

A

Yes, you can provision multiple DAX clusters for the same DynamoDB table. These clusters will provide different endpoints that can be used for different use cases, ensuring optimal caching for each scenario. Two DAX clusters will be independent of each other and will not share state or updates, so users are best served using these for completely different tables.

25
Q

How will I know what DAX node type I’ll need for my workload?

A

Sizing a DAX cluster is an iterative process. We recommend provisioning a three-node cluster (for high availability) with enough memory to fit the application’s working set. Based on the performance and throughput of the application, the utilization of the DAX cluster, and the cache hit/miss ratio, you may need to scale your DAX cluster to achieve the desired results.

26
Q

On what kinds of Amazon EC2 instances can DAX run?

A

See the Amazon DynamoDB Pricing page for the latest instance types supported by DAX.

27
Q

Does DAX support Reserved Instances or the AWS Free Usage Tier?

A

Currently DAX only supports on-demand instances.

28
Q

How is DAX priced?

A

DAX is priced per node-hour consumed, from the time a node is launched until it is terminated. Each partial node-hour consumed will be billed as a full hour. Pricing applies to all individual nodes in the DAX cluster. For example, if you have a three node DAX cluster, you will be billed for each of the separate nodes (three nodes in total) on an hourly basis.
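The billing rule above (every node billed per hour, with partial hours rounded up) works out as in the following sketch. The hourly price is a made-up figure, not an actual DAX rate, and the function assumes all nodes run for the same duration.

```python
import math


def dax_cluster_cost(node_count, hours_running, price_per_node_hour):
    """Estimate cluster cost when each partial node-hour bills as a full hour."""
    billable_hours = math.ceil(hours_running)      # 10.5 hours bills as 11
    return node_count * billable_hours * price_per_node_hour


# Three nodes running for 10.5 hours at a hypothetical 0.25 per node-hour:
# 3 nodes * 11 billable hours * 0.25 = 8.25
cost = dax_cluster_cost(node_count=3, hours_running=10.5,
                        price_per_node_hour=0.25)
```

See the pricing page for real per-node rates, which vary by node type and region.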

Availability

29
Q

How can I achieve high availability with my DAX cluster?

A

DAX provides built-in multi-AZ support, letting you choose the preferred availability zones for the nodes in your DAX cluster. DAX uses asynchronous replication to provide consistency between the nodes, so that in the event of a failure, there will be additional nodes that can service requests. To achieve high availability for your DAX cluster, for both planned and unplanned outages, we recommend that you deploy at least three nodes in three separate availability zones. Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable.

30
Q

What happens if a DAX node fails?

A

If the primary node fails, DAX automatically detects the failure, selects one of the available read replicas, and promotes it to become the new primary. In addition, DAX provisions a new node in the same availability zone as the failed primary; this new node replaces the newly promoted read replica. If the primary fails due to a temporary availability zone disruption, the new replica will be launched as soon as the AZ has recovered. If a single-node cluster fails, DAX launches a new node in the same availability zone.

Scalability

31
Q

What type of scaling does DAX support?

A

DAX supports two scaling options today. The first option is read scaling to gain additional throughput by adding read replicas to a cluster. A single DAX cluster supports up to 10 nodes, offering millions of requests per second. Adding or removing replicas is an online operation. The second way to scale a cluster is to scale up or down by selecting larger or smaller r3 instance types. Larger nodes enable the cluster to store more of the application’s data set in memory and thus reduce cache misses and improve overall performance of the application. When creating a DAX cluster, all nodes in the cluster must be of the same instance type. Additionally, if you want to change the instance type for your DAX cluster (e.g., scale up from r3.large to r3.2xlarge), you must create a new DAX cluster with the desired instance type. DAX does not currently support online scale-up or scale-down operations.

32
Q

How do I write-scale my application?

A

Within a DAX cluster, only the primary node handles write operations to DynamoDB. Thus, adding more nodes to the DAX cluster will increase the read throughput, but not the write throughput. To increase write throughput for your application, you will need to either scale up to a larger instance size or provision multiple DAX clusters and shard your key space in the application layer.
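Sharding the key space across several clusters in the application layer could look roughly like this. It is one possible scheme (a stable hash of the partition key, modulo the number of clusters), not a prescribed method, and the endpoint names are placeholders rather than real cluster endpoints.

```python
import hashlib

# Hypothetical endpoints, one per DAX cluster.
CLUSTER_ENDPOINTS = [
    "dax-shard-0.example.internal",
    "dax-shard-1.example.internal",
    "dax-shard-2.example.internal",
]


def cluster_for_key(partition_key, endpoints=CLUSTER_ENDPOINTS):
    """Map a partition key to one cluster endpoint using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(endpoints)
    return endpoints[index]


endpoint_a = cluster_for_key("customer#1001")
endpoint_b = cluster_for_key("customer#1001")   # same key, same cluster
```

The application would keep one client per endpoint and route every read and write for a key to the cluster returned here, so that each key is always cached in the same place. With simple modulo hashing, changing the number of clusters remaps most keys, so the shard count is best fixed up front.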

Monitoring

33
Q

How do I monitor the performance of my DAX cluster?

A

Metrics for CPU utilization, cache hit/miss counts and read/write traffic to your DAX cluster are available via the AWS Management Console or Amazon CloudWatch APIs. You can also add additional, user-defined metrics via Amazon CloudWatch’s custom metric functionality. In addition to CloudWatch metrics, DAX also provides information on cache hit, miss, query and cluster performance via the AWS Management Console.

Maintenance