Architecting to scale Flashcards

1
Q

What is a loosely coupled architecture?

A

Where components can stand independently and require little or no knowledge of the inner workings of the other components

2
Q

Why use a loosely coupled architecture for scalability?

4 points

A

1) provides abstraction
2) Interchangeable components
3) More atomic functional units
4) you can scale components independently

3
Q

What is horizontal scaling? (4 points)

A

1) Where you add instances as demand increases
2) no downtime required to scale up or down
3) You can do this automatically using auto-scaling groups
4) theoretically unlimited

4
Q

What is vertical scaling (4 points)?

A

1) Where you add more CPU and/or RAM to an existing instance as demand increases
2) Requires restart to scale up or down
3) Would require scripting to automate
4) limited by instance size

5
Q

Define scale out…

A

Where you add another instance

6
Q

Define scale up…

A

Where you increase resources of an instance

7
Q

Define scale in…

A

Where you remove an instance

8
Q

Define scale down…

A

Where you decrease the resources of an instance

9
Q

Why would you scale out over scaling up?

A

Because demand is never constant! So you will be wasting resources when scaling up…

10
Q

What is the key benefit of scaling out?

A

Cost savings!

11
Q

What are the 2 types of autoscaling offered by AWS?

A

1) EC2 autoscaling

2) Application autoscaling

12
Q

What is AWS auto scaling? and why would you use it?

A

What- Provides a centralised way to manage scalability for whole stacks and can provide predictive scaling.

why- Gives you the ability to manage EC2 and application autoscaling from a unified standpoint

13
Q

What are the 4 scaling options with EC2 autoscaling groups/types?

A

1) Maintain- keep a specific or minimum number of instances running
2) Manual- use max and min or a specified number of instances
3) Scheduled- increase or decrease instances based on a schedule
4) Dynamic- scale based on real-time metrics of the system

14
Q

What is a launch configuration? and what 7 things do you include in this?

A

A launch configuration is an instance configuration template that an Auto Scaling group uses to launch EC2 instances. When you create a launch configuration, you specify information for the instances.

1) Include the ID of the Amazon Machine Image (AMI)
2) the instance type
3) a key pair
4) one or more security groups
5) a block device mapping
6) define a health check grace period
7) the scale type (how we want to scale)

15
Q

What is a health check grace period?

A

A period of time that allows a new instance to spin up before its health is checked.

16
Q

Which use case would be the most appropriate for a maintain scaling type?

A

When you always need X number of instances running

e.g. 3

17
Q

Which use case would be the most appropriate for a manual scaling type?

A

My needs change so rarely that I can just manually add and remove instances

18
Q

Which use case would be the most appropriate for a scheduled scaling type?

A

Every Monday morning we get a rush on our website

19
Q

Which use case would be the most appropriate for a dynamic scaling type?

A

When CPU utilisation gets to 70% on current instances, scale out

20
Q

Within the dynamic scaling type, we have EC2 autoscaling policies. Name and describe the 3 policies…

A

1) Target tracking policy- scales based on a pre-defined or custom metric in relation to a target value
2) Simple scaling policy- waits until health checks complete and the cooldown period expires before evaluating new need
3) Step scaling policy- responds to scaling needs with more sophistication and logic

21
Q

Which use case would be the most appropriate for a target tracking policy…

A

When CPU utilization gets to 70% on current instances, scale out

22
Q

Which use case would be the most appropriate for a simple scaling policy…

A

Let’s add new instances slowly and steadily

23
Q

Which use case would be the most appropriate for a step scaling policy…

A

AGG add all the instances!

24
Q

What is a scaling cooldown?

A

Configurable duration that gives your scaling a chance to “come up to speed” and absorb load.

Different than a health check!

25
How long is the default cooldown period?
300 seconds
26
Which scaling policy does cooldown period get applied to by default?
Dynamic scaling
27
Can you override the default cooldown period?
Yes
28
What is the benefit of a cool down period?
Sanity check to see if adding the resource was enough to absorb the load.
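The cooldown idea from the cards above can be sketched as a simple gate on scaling activity. This is a toy model, not how AWS implements it: the function name `may_scale` and the use of plain second counters are assumptions for illustration, and only the 300-second default comes from the cards.

```python
from typing import Optional

DEFAULT_COOLDOWN_SECONDS = 300  # default cooldown quoted in the cards


def may_scale(now: float,
              last_scaled_at: Optional[float],
              cooldown: float = DEFAULT_COOLDOWN_SECONDS) -> bool:
    """Return True when a new scaling activity is allowed to start."""
    if last_scaled_at is None:  # nothing has scaled yet, so no cooldown applies
        return True
    # Block further scaling until the previous change has had time to absorb load.
    return now - last_scaled_at >= cooldown


print(may_scale(now=400, last_scaled_at=200))               # False: only 200s elapsed
print(may_scale(now=500, last_scaled_at=200))               # True: 300s elapsed
print(may_scale(now=260, last_scaled_at=200, cooldown=60))  # True: overridden cooldown
```

Passing a smaller `cooldown` mirrors overriding the default, which makes scaling react sooner at the risk of adding capacity before the last change has taken effect.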
29
Name 3 types of scaling policies available with application autoscaling...
1) Target tracking policy- initiates scaling to track a given metric as closely as possible
2) Step scaling policy- adjusts capacity based on a metric crossing defined thresholds
3) Scheduled scaling policy- initiates scaling events based on a pre-defined time, day or date
30
Which use case would be the most appropriate for a target tracking policy...
I want my ECS (container) hosts to stay at or below 70% CPU utilization
31
Which use case would be the most appropriate for a step scaling policy...
I want to increase my EC2 spot fleet by 20% every time I add another 10,000 connections on my ELB
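A step scaling policy like the one in this card can be pictured as a lookup from the size of the metric to a percentage change in capacity. The sketch below is a local model only: the three thresholds in `STEPS`, the function names, and the choice to round up are assumptions for illustration, not values or behaviour taken from AWS.

```python
# (lower bound on ELB connections, percent increase in capacity): hypothetical steps
STEPS = [(10_000, 20), (20_000, 40), (30_000, 60)]


def percent_increase(connections: int) -> int:
    """Pick the adjustment of the highest step whose lower bound is reached."""
    pct = 0
    for lower_bound, increase in STEPS:
        if connections >= lower_bound:
            pct = increase
    return pct


def new_capacity(current: int, connections: int) -> int:
    """Apply the percentage change, rounding the added instances up."""
    pct = percent_increase(connections)
    added = (current * pct + 99) // 100  # integer ceiling of current * pct / 100
    return current + added


print(new_capacity(10, 9_999))   # 10: below the first step, no change
print(new_capacity(10, 10_000))  # 12: first step, +20%
print(new_capacity(10, 25_000))  # 14: second step, +40%
```

The larger the breach, the larger the adjustment, which is what distinguishes step scaling from a simple policy that always applies one fixed change.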
32
Which use case would be the most appropriate for a scheduled scaling policy...
Every Monday at 08:00 I want to increase the read capacity units of my DynamoDB table to 20,000
33
What is a shard?
Shard is the base throughput unit of an Amazon Kinesis data stream. One shard provides a capacity of 1MB/sec data input and 2MB/sec data output.
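The per-shard figures above lend themselves to a quick sizing calculation. This is a rough sketch using only the two limits quoted in the card (1MB/sec in, 2MB/sec out); the function name and the sample workloads are illustrative, and a real stream may be constrained by other limits not modelled here.

```python
import math

SHARD_INPUT_MB_PER_SEC = 1.0   # write capacity per shard, from the card
SHARD_OUTPUT_MB_PER_SEC = 2.0  # read capacity per shard, from the card


def shards_needed(ingest_mb_per_sec: float, egress_mb_per_sec: float) -> int:
    """Return the number of shards needed to cover both input and output."""
    for_input = math.ceil(ingest_mb_per_sec / SHARD_INPUT_MB_PER_SEC)
    for_output = math.ceil(egress_mb_per_sec / SHARD_OUTPUT_MB_PER_SEC)
    # Whichever direction needs more shards decides; a stream has at least one.
    return max(1, for_input, for_output)


print(shards_needed(5, 6))    # 5: input is the bottleneck
print(shards_needed(2, 10))   # 5: output is the bottleneck
```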
34
What are the 3 parts of a record in a shard?
1) partition key 2) sequence number (order of the record within the shard) 3) data
35
What are the two dimensions of DynamoDB scaling?
1) Throughput- read capacity units and write capacity units | 2) Size- max item size is 400KB, but the table scales because you can store as many items as you like
36
What is a partition in DynamoDB?
A physical space where DynamoDB data is stored
37
What is a partition key in DynamoDB?
A unique identifier for each record sometimes called a hash key
38
What is a sort key in DynamoDB?
An optional key that defines storage order on the partition
39
How does DynamoDB scale out?
DynamoDB adds additional partitions to scale out
40
How do you work out the number of partitions you get?
You work out how many partitions you need by capacity (how many RCUs and WCUs you have provisioned) and by size. Then take the larger of the two results and round up to get the total number of partitions
41
What is the formula for calculating the number of partitions of your DynamoDB table by capacity?
(total RCU/3000) + (total WCU/1000)
42
What is the formula for calculating the number of partitions of your DynamoDB table by size?
Total size in GB / 10GB
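Putting the capacity formula, the size formula and the "take the larger and round up" rule together gives a short worked calculation. The function name and sample figures below are illustrative, and the 3000 RCU / 1000 WCU / 10GB constants are the rule-of-thumb values quoted in these cards rather than a guaranteed description of DynamoDB internals.

```python
import math


def estimate_partitions(rcu: int, wcu: int, size_gb: float) -> int:
    """Estimate the partition count using the formulas from these cards."""
    by_capacity = rcu / 3000 + wcu / 1000  # partitions needed for throughput
    by_size = size_gb / 10                 # partitions needed for storage
    # Take the larger dimension and round up; a table has at least one partition.
    return max(1, math.ceil(max(by_capacity, by_size)))


# 7500 RCU and 3000 WCU give 2.5 + 3 = 5.5 by capacity; 45GB gives 4.5 by size.
print(estimate_partitions(rcu=7500, wcu=3000, size_gb=45))  # 6
# A lightly provisioned but large table is driven by size instead: 85GB gives 8.5.
print(estimate_partitions(rcu=100, wcu=100, size_gb=85))    # 9
```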
43
How are the read and write capacity allocated across partitions?
Splits equally across partitions
44
What happens if we increase our RCU/WCU or reach our 10GB size limit?
The data is divided down the middle, based on the partition key hash, and another partition is created. This keeps happening as the table scales out.
45
What is a hot key issue? Provide an example...
When reads and writes are concentrated in the same partition. For example, if you use a date as the partition key and store lots of different data under the same date, queries for that data will access the same partition over and over...
46
How do you avoid a hotkey issue?
Choose a different attribute for the partition key, e.g. one that is not the date, such as sensor type, and use the date as a sort key
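The effect of the key choice on partition load can be seen with a small simulation. This is a sketch under stated assumptions: MD5 stands in for DynamoDB's internal hash function, the four partitions and the `sensor-N` naming are made up for illustration, and nothing here calls AWS.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partitions: int) -> int:
    """Map a partition key to a partition index (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def spread(keys, partitions: int = 4) -> Counter:
    """Count how many writes land on each partition."""
    return Counter(partition_for(k, partitions) for k in keys)


# 1,000 writes on one day: date as partition key vs sensor ID as partition key.
by_date = spread(["2024-01-15"] * 1000)
by_sensor = spread([f"sensor-{i % 50}" for i in range(1000)])

print("date key:  ", dict(by_date))    # every write hits a single partition
print("sensor key:", dict(by_sensor))  # writes are spread over several partitions
```

With the date as the partition key every write hashes to the same partition, while keying on the sensor spreads the same writes around; the date can then serve as the sort key.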
47
What is the issue with using a target tracking method to scale a DynamoDB table?
It will not scale down. There are some workarounds, like sending dummy requests at a reducing frequency or reducing the max capacity to equal the min capacity
48
What is a global secondary index?
Like a copy of the table
49
What is an alternative to autoscaling in DynamoDB?
Using the 'on-demand' setting for DynamoDB. It costs more, but is useful when you are not sure whether an app will be super popular
50
What is DynamoDB Accelerator (DAX)?
An in-memory cache that sits in front of your table
51
What is a good use case for DAX? (3 points)
1) When you require the fastest possible reads from a database, such as live auctions or securities trading
2) Read-intense scenarios where you want to offload the reads from DynamoDB
3) Repeated reads against a large set of DynamoDB data
52
What are bad use cases for DAX? (2 points)
1) Write-intense applications that don't have many reads | 2) Applications that already use client-side caching
53
What types of data content can you cache at edge locations using CloudFront?
Static and dynamic content
54
How is dynamic content delivered using CloudFront?
Delivered using HTTP cookies forwarded from your origin
55
What protocol do you use for media streaming and live media streaming?
HTTP and HTTPS
56
Which services can be used as origins to CloudFront?
S3, EC2, ELB or another webserver
57
How can a behaviour be used to configure serving content via CloudFront?
You can use behaviours to configure how origin content is served based on URL paths. Users are routed to different origins depending on the path, e.g. wp-content/* is served as static content and wp-admin/ directs to the ELB...
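Path-based behaviours can be pictured as an ordered list of patterns where the first match decides the origin and a catch-all default comes last. The sketch below models that locally with `fnmatch`; the origin names and the exact patterns are hypothetical, and it does not call CloudFront or reproduce its full pattern syntax.

```python
from fnmatch import fnmatchcase

# Ordered (path pattern, origin) pairs; origin names are hypothetical.
BEHAVIOURS = [
    ("wp-content/*", "s3-static-origin"),  # static assets
    ("wp-admin/*", "elb-origin"),          # admin pages go to the load balancer
    ("*", "elb-origin"),                   # default behaviour, evaluated last
]


def origin_for(path: str) -> str:
    """Return the origin of the first behaviour whose pattern matches the path."""
    relative = path.lstrip("/")
    for pattern, origin in BEHAVIOURS:
        if fnmatchcase(relative, pattern):
            return origin
    raise LookupError("no behaviour matched")  # unreachable while "*" is present


print(origin_for("/wp-content/themes/site.css"))  # s3-static-origin
print(origin_for("/wp-admin/index.php"))          # elb-origin
print(origin_for("/index.php"))                   # elb-origin (default)
```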
58
What is an invalidation request?
A way of invalidating a CloudFront cache
59
What are the 4 methods you can use to invalidate content from a CloudFront cache?
1) Simply delete the file from the origin and wait for the TTL (time to live) to expire
2) Use the AWS console to request invalidation for all content or a specific path such as /images/*
3) Use the CloudFront API to submit an invalidation request
4) Use a 3rd party tool to perform a CloudFront invalidation e.g. cloudberry, ylastic....
60
What is a zone apex, and does CloudFront support it?
A domain without a www. or other subdomain in front. Yes, CloudFront supports it.
61
Can you add geo-restrictions in CloudFront?
Yes, you can whitelist (allow) or blacklist (block) content based on location
62
What is SQS?
Simple Queue Service. A scalable, hosted queuing service. It is integrated with KMS for encrypted messaging
63
What is the data storage type in SQS and how long is data persisted for?
Transient. 4 days default, max 14 days
64
What is the max size of messages in SQS?
256KB, or up to 2GB using the SQS Extended Client Library (which stores the payload in S3)
65
What are the key benefits of using SQS?
Allows the creation of a loosely coupled architecture
66
What is meant by standard and FIFO queuing?
Standard- no assurance that messages will leave the queue in the order they arrived
FIFO- maintains the order of the queue
67
What is the risk with Standard queueing?
There is a risk that order will be lost for the process
68
What is the risk with FIFO queueing?
If a message fails it will hold all the other messages behind it- causes delay or latency
69
What is Amazon MQ?
An implementation of Apache ActiveMQ. A message broker. Usually used to replace an on-prem message broker.
70
What is a lambda fan-out model?
Where a Lambda function call sets off multiple Lambda calls in parallel
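The shape of a fan-out can be illustrated locally: one dispatcher splits the work and starts several workers at once. In this sketch a thread pool stands in for parallel Lambda invocations, and `worker`, `dispatcher` and the chunk size are hypothetical names and values; a real implementation would invoke functions through AWS rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(chunk):
    """Stand-in for one downstream function call: process a single chunk."""
    return sum(chunk)


def dispatcher(items, chunk_size=3):
    """Stand-in for the fan-out function: split the work and run it in parallel."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    # One thread per chunk models the parallel invocations; at least one worker.
    with ThreadPoolExecutor(max_workers=len(chunks) or 1) as pool:
        return list(pool.map(worker, chunks))  # results keep the chunk order


print(dispatcher(list(range(10))))  # [3, 12, 21, 9]
```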
71
What is the AWS serverless application model (SAM)?
An opensource framework for building a serverless app on AWS
72
Which language does AWS serverless application model use as its configuration language?
YAML
73
What are the 3 steps of an AWS SAM workflow?
1) Create your YAML file 2) Convert this to a CloudFormation template 3) Create the AWS infrastructure
74
What are the 3 key features of AWS SAM?
1) uses YAML for templates 2) Purpose built to help make developing serverless apps as efficient as possible 3) Generates CloudFormation scripts
75
What are the 4 key features of the Serverless Framework?
1) Uses YAML for templates 2) Purpose-built to help with developing and deploying serverless apps 3) Generates CloudFormation scripts 4) Supports many other cloud providers such as Azure...
76
What is Amazon EventBridge?
Designed to link a variety of AWS and 3rd party apps | e.g. integrate ZenDesk with your application
77
What is simple workflow service?
Creates distributed, asynchronous workflows. It supports sequential as well as parallel workflows. Uses an activity worker and a decider worker
78
What is SWF best suited for?
Best suited for human-enabled workflows like order fulfilment or procedure requests
79
Would AWS recommend SWF or step function?
Step function
80
What is an AWS step function?
A way to manage workflows. An orchestration platform. You define your app as a state machine. Each object can assume a different state throughout a process. Creates tasks, sequential steps, parallel steps etc...
81
What language do you use to define step functions?
JSON
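To make the state machine idea concrete, here is a minimal sketch of a definition with one sequential step, one parallel step and a final step, built as a Python dict and printed as JSON. The state names and the placeholder ARNs (with `REGION` and `ACCOUNT_ID` left unfilled) are hypothetical; only the general `StartAt` / `States` / `Type` layout of the Amazon States Language is assumed.

```python
import json


def task(function_name: str, **transition) -> dict:
    """Build a Task state pointing at a placeholder (hypothetical) Lambda ARN."""
    arn = f"arn:aws:lambda:REGION:ACCOUNT_ID:function:{function_name}"
    return {"Type": "Task", "Resource": arn, **transition}


definition = {
    "Comment": "Hypothetical order-processing flow",
    "StartAt": "ValidateOrder",
    "States": {
        # Sequential step: runs first, then hands over to the parallel step.
        "ValidateOrder": task("validate-order", Next="Fulfil"),
        # Parallel step: both branches run at the same time.
        "Fulfil": {
            "Type": "Parallel",
            "Branches": [
                {"StartAt": "ChargeCard",
                 "States": {"ChargeCard": task("charge-card", End=True)}},
                {"StartAt": "ReserveStock",
                 "States": {"ReserveStock": task("reserve-stock", End=True)}},
            ],
            "Next": "NotifyCustomer",
        },
        # Final step: End marks the end of the execution.
        "NotifyCustomer": task("notify-customer", End=True),
    },
}

print(json.dumps(definition, indent=2))
```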
82
What is AWS batch?
A management tool for creating and executing batch orientated tasks using EC2 instances
83
What are the 4 steps to running a batch using AWS batch?
1) Create a compute environment 2) Create a job queue with a priority and assign it to a compute environment 3) Create a job definition: script or JSON, env vars, IAM roles etc. 4) Schedule the job
84
When would you use a step function and provide a use case?
Out-of-the-box coordination of AWS service components. Use case- order processing flows
85
When would you use a simple workflow service? and provide a use case...
When you need to support external processes or specialised execution logic. Use case- loan application process with manual review steps
86
When would you use a simple queue service? and provide a use case...
Messaging queue; store-and-forward patterns. Use case- image resize process
87
When would you use AWS Batch? and provide a use case...
Scheduled or recurring tasks that do not require heavy logic. Use case- rotate logs daily on a firewall appliance
88
What is Elastic MapReduce?
Designed for big data processing and analysis. It is built on the Hadoop framework and is a collection of services for processing large data sets. "The Zoo"
89
What is hadoop map reduce?
A tool used for distributed processing
90
What is Hadoop HDFS?
A Hadoop distributed file system. A persistent data store
91
What is Zookeeper?
A tool to ensure resources are coordinated in a hadoop framework
92
What is Oozie?
Hadoop workflow framework
93
What is pig?
A hadoop scripting framework
94
What is Hive?
A SQL interface into a hadoop landscape
95
What is Mahout?
A machine learning component in the hadoop landscape
96
What is HBase?
A columnar database for storing hadoop data
97
What is Flume?
A log collection system for a hadoop landscape
98
What is Sqoop?
Facilitates input of data from other data stores into a hadoop landscape
99
What is Ambari?
A tool used to manage and monitor a hadoop landscape
100
What is meant by the term green field?
When an application/software is developed from scratch
101
What is meant by the term brownfield?
When an application/software is developed or built from an existing program
102
What 3 pieces of information do you need to determine the number of partitions in a DynamoDB?
1) Size of the table 2) Number of RCUs 3) Number of WCUs
103
How can you make scaling more dramatic and responsive?
Reduce the cooldown time to allow scaling to be more dramatic and responsive
104
Which Kinesis service can stream into S3?
Kinesis Firehose
105
What are the 2 main uses from Kinesis data streams?
1) They can enable real-time reporting and analysis of streamed data 2) They can accept data as soon as it has been produced without the need for batching
106
What is the most cost effective way to scale based on sometimes getting spikes on a Monday morning?
Dynamic based on a metric like connections or CPU. If using scheduled you would be scaling even when there is no spike.
107
What is the main benefit of a loosely coupled architecture?
More atomic functional units
108
What is the Kinesis Client Library (KCL) used for?
A method of reading data from a shard
109
What is the best practice way for storing time-series data in a DynamoDB?
Use one table per application period. If all time-series data were in one table, the last partition would get all the read and write actions. Note that general DynamoDB best practice is otherwise to keep the number of tables to a minimum.