High Availability Practices Flashcards

Question 1

Q

What can affect availability?

Answer

A

System maintenance, software updates, infrastructure issues, malicious attacks, system load, and dependencies. In the cloud, latency and provider issues can also affect availability.

Question 2

Q

How is availability measured?

Answer

A

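Availability is typically expressed as a percentage of uptime, committed to in a service level agreement (SLA) and described in "nines". For example, five nines means 99.999% availability, which leaves roughly five minutes of downtime per year.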
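
To make the "nines" concrete, here is a minimal sketch that converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions chosen for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


# Print the budget for a few common targets.
for label, pct in [("two nines", 99.0), ("three nines", 99.9), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {allowed_downtime_minutes(pct):.2f} minutes/year")
```

Each extra nine cuts the downtime budget by a factor of ten.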

Question 3

Q

How do you monitor availability?

Answer

A

Create a health check endpoint that a monitoring tool can call at regular intervals.

Question 4

Q

What should a health check endpoint monitor?

Answer

A

The subsystems the application depends on, such as storage, databases, and third-party dependencies.

Question 5

Q

What should a health check endpoint return, and should you secure it?

Answer

A

A status code plus content describing the state of the checked components. Yes, the endpoint should be secured.
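
As a rough illustration of the last two cards, the sketch below builds a health check response from a set of subsystem checks. The function name, the shared-token scheme, and the choice of 200/503/401 status codes are assumptions, not a prescribed design; a real endpoint would sit behind a web framework.

```python
import hmac
import json
from typing import Callable, Dict, Tuple


def health_response(checks: Dict[str, Callable[[], bool]],
                    presented_token: str,
                    expected_token: str) -> Tuple[int, str]:
    """Return (status code, JSON body) for a health check request."""
    # Secure the endpoint: reject callers that do not present the shared token.
    if not hmac.compare_digest(presented_token.encode(), expected_token.encode()):
        return 401, json.dumps({"error": "unauthorized"})

    # Run every subsystem check; a check that raises counts as failing.
    components = {}
    for name, check in checks.items():
        try:
            components[name] = "ok" if check() else "failing"
        except Exception:
            components[name] = "failing"

    healthy = all(state == "ok" for state in components.values())
    body = {"status": "healthy" if healthy else "unhealthy", "components": components}
    return (200 if healthy else 503), json.dumps(body)
```

A monitoring tool can alert on the status code alone and use the body to see which subsystem failed.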

Question 6

Q

What are some methods that can be employed to ensure high availability?

Answer

A

Queues or streams, and throttling.

Question 7

Q

How can throttling be employed?

Answer

A

Limit each user's access: monitor usage metrics and reject requests once the limit is exceeded.

Disable or degrade nonessential services so that critical services can keep functioning. For example, a video call can switch to audio only when bandwidth is constrained.

Prioritize certain users so that the requirements of high-impact customers are still met.
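
The first strategy above (a per-user limit) can be sketched as a fixed-window counter. The class name, the window scheme, and the injectable clock are assumptions made for illustration; production systems often use token buckets or a shared store instead.

```python
import time
from typing import Callable, Dict, Tuple


class PerUserThrottle:
    """Allow at most `limit` requests per user in each time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # user_id -> (window start time, requests counted in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window expired; start a new one
        if count >= self.limit:
            self._windows[user_id] = (start, count)
            return False  # over the limit: reject the request
        self._windows[user_id] = (start, count + 1)
        return True
```

Prioritizing high-impact customers could be approximated by giving them a throttle with a larger `limit`.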

Question 8

Q

How can a queue be employed?

Answer

A

Introduce a queue between the task and the service.

The tasks are placed in the queue, and the service works through them at its own pace.

In some advanced implementations, the service can be autoscaled based on queue size.

If a response is expected, the service must provide a suitable mechanism for returning it. However, this pattern isn't suitable when a low-latency response is required.
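
A minimal in-process sketch of the idea, using Python's standard `queue` and `threading` modules. The function names and the doubling "work" are placeholders; a real deployment would use a message broker or a cloud queue service between separate processes.

```python
import queue
import threading
from typing import List


def run_service(tasks: "queue.Queue", results: List[int]) -> None:
    """Service side: pull tasks off the queue and process them one at a time."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel value: no more work
            tasks.task_done()
            break
        results.append(item * 2)  # placeholder for the real work
        tasks.task_done()


def process_all(items: List[int]) -> List[int]:
    """Task side: enqueue the work and wait until the service has drained it."""
    tasks: "queue.Queue" = queue.Queue()
    results: List[int] = []
    worker = threading.Thread(target=run_service, args=(tasks, results))
    worker.start()
    for item in items:
        tasks.put(item)  # returns immediately; the queue absorbs bursts
    tasks.put(None)
    tasks.join()
    worker.join()
    return results
```

The producer never calls the service directly, so a burst of tasks lengthens the queue instead of overloading the service. `tasks.qsize()` is the kind of signal an autoscaler could watch.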

Question 9

Q

What are some resiliency patterns?

Answer

A

Bulkhead, Circuit Breaker, Compensating Transaction, Retry, Leader Election, and Scheduler Agent Supervisor. On AWS: the Multi-Server pattern, the Multi-Datacenter pattern, and Floating IP.

Question 10

Q

What is the bulkhead resiliency pattern?

Answer

A

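Partition services into groups and limit each group to its own pool of resources, so that a failure or overload in one group cannot exhaust the others. Define the partitions according to business and technical requirements; for example, high-priority customers get more resources. Libraries such as Polly or Hystrix can enforce the limits in code, and container resource limits can serve the same purpose.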
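
A small sketch of the partitioning idea, assuming each partition is capped by a count of concurrent calls. The class and exception names are hypothetical, and this is not how Polly or Hystrix are implemented; it only illustrates the isolation.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class BulkheadFull(Exception):
    """Raised when a partition has no free slots."""


class Bulkhead:
    def __init__(self, limits: Dict[str, int]):
        # One semaphore per partition caps its concurrent calls.
        self._slots = {name: threading.BoundedSemaphore(n) for name, n in limits.items()}

    def run(self, partition: str, work: Callable[[], T]) -> T:
        slots = self._slots[partition]
        if not slots.acquire(blocking=False):
            raise BulkheadFull(partition)  # this partition is saturated
        try:
            return work()
        finally:
            slots.release()


# Hypothetical partitions: high-priority customers get more slots.
bulkhead = Bulkhead({"high_priority": 8, "standard": 2})
```

Because each partition has its own semaphore, exhausting the `standard` slots leaves `high_priority` calls unaffected.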

Question 11

Q

What is the circuit breaker resiliency pattern?

Answer

A

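If continued calls to a failing service would negatively affect the applications that depend on it, the circuit "opens" and further calls fail immediately instead of being attempted. After a timeout, a trial call is let through to check whether the service has recovered.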
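
The following is a simplified sketch of a circuit breaker with closed, open, and half-open behavior. The class name, the consecutive-failure threshold, and the injectable clock are assumptions for illustration; it is not thread-safe.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call not attempted")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Callers wrap each remote call in `breaker.call(...)` and treat `CircuitOpenError` as a signal to fall back or fail fast.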

Question 12

Q

What is the compensating transaction resiliency pattern?

Answer

A

Record every step of a workflow and, if a later step fails, undo the steps that already completed.
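
One way to sketch this is to pair each step with an action that undoes it. The `run_workflow` name and the (action, compensation) tuple shape are assumptions for illustration.

```python
from typing import Callable, List, Tuple

# Each step is a pair: (action, compensation that undoes the action).
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_workflow(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)  # record the step so it can be undone later
    return True
```

Undoing in reverse order matters when later steps depend on earlier ones, for example refunding a payment before releasing the reservation it paid for.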

Question 13

Q

What is the retry resiliency pattern?

Answer

A

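Intelligently attempt to reestablish contact with a failing service: retry only when the fault is likely to be transient, limit the number of attempts, and wait between them.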
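
A minimal sketch of retrying with exponential backoff. The `TransientError` type, the default values, and the injectable `sleep` are assumptions for illustration; real code would decide which exceptions count as transient and would often add random jitter.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for faults that are worth retrying."""


def retry(operation: Callable[[], T], max_attempts: int = 3,
          base_delay: float = 0.5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation`, retrying transient failures with doubling delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: the fault does not look transient after all
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")
```

Other exception types propagate immediately, so non-transient faults are not retried.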

Question 14

Q

What is the leader election resiliency pattern?

Answer

A

A single task instance is elected as leader. The leader coordinates the actions of the other, subordinate instances.
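
One common way to elect a leader is to have instances compete for a lease that expires unless renewed. The sketch below uses an in-memory stand-in for the shared lease; the `LeaseStore` name and its method are hypothetical, and a real system would need an external store with an atomic compare-and-set.

```python
from typing import Optional


class LeaseStore:
    """In-memory stand-in for a shared lease that instances compete for."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None
        self.expires_at: float = 0.0

    def try_acquire(self, instance_id: str, now: float, ttl: float) -> bool:
        """Acquire or renew the lease; return True if `instance_id` is leader."""
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == instance_id:
            self.holder = instance_id
            self.expires_at = now + ttl
            return True
        return False  # another instance holds a live lease; stay subordinate
```

Each instance calls `try_acquire` periodically. The leader keeps renewing, and if it crashes the lease expires and another instance takes over.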

To enhance knowledge of providing high availability of applications (14 cards)