L5 - Autoscaling 1/2 Flashcards

1
Q

Why make an application elastic?

A
  • reduce over/under-provisioning
  • reduce cost + increase customer satisfaction
2
Q

What are 4 typical resources applications use?

A
  • CPU
  • Memory
  • Disk
  • Network
3
Q

Dynamism for desktop apps on the laptop

A

Within seconds, through thread scheduling

4
Q

Dynamism for HPC with a cluster as a shared resource

A

Within hours or days, through job scheduling

5
Q

Dynamism for banking with mainframe

A

Periodically (every day), through processor allocation

6
Q

Dynamism for web and server clusters

A

highly dynamic, limited predictability

7
Q

What is increased throughput?

A

Ability to handle more workload (requests) in the same time

8
Q

What is decreased latency?

A

Individual requests are handled faster

9
Q

Can we normally decrease latency or increase throughput for web-applications?

A

Normally we can only increase throughput

10
Q

Is there a scalability limit for throughput?

A

Yes, the throughput curve converges to a certain limit in the long run

11
Q

Why is there a scalability limit for throughput?

A
  • parallelization adds overhead
  • bottleneck: initiating the parallel work is itself a sequential process, so at a certain point the sequential part dominates the execution time (Amdahl's law)
  • shared databases limit the load that can be processed
  • the way an application is programmed influences whether it can scale
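
The Amdahl's law bullet above can be made concrete with a small sketch. It assumes that a fraction `parallel_fraction` of the work parallelizes perfectly and it ignores the parallelization overhead named in the first bullet. The function names and the 0.9 value are illustrative assumptions, not part of the deck.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Speedup approached as the number of processors grows without bound."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


# Illustrative run: 90% of the work is assumed to be parallelizable.
for p in (1, 10, 100, 1000):
    print(p, round(amdahl_speedup(0.9, p), 2))
print("limit", round(amdahl_limit(0.9), 2))
```

With 90% parallel work the speedup stays below 10 however many processors are added, which is the convergence described on the previous card.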
12
Q

What is scalability of applications?

A

The ability of an application to increase its capacity (throughput)

13
Q

What does the capacity of an application depend on?

A
  • available resource capacities
  • application design (whether the app is programmed for scalability)
14
Q

What are scalability limits?

A
  • maximum application capacity
  • throughput can be limited by max resource capacities or application design
15
Q

What happens when applications with poor scalability are scaled?

A
  • significant drop in efficiency
16
Q

What is speedup?

A

speedup (p processors) = performance (p processors) / performance (1 processor)

e.g. for CPU-bound work, performance can be measured in transactions per second

17
Q

What is efficiency?

A

efficiency (p processors) = speedup (p processors) / p

18
Q

Does speedup scale linearly?

A

No. With one processor the efficiency is 1, but it drops as more processors are added, so speedup grows more slowly than the number of processors
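
The speedup and efficiency definitions from the two preceding cards can be checked with a short sketch. The throughput numbers below are made-up measurements (transactions per second) chosen only to illustrate sub-linear scaling; they are not from the course.

```python
def speedup(perf_p: float, perf_1: float) -> float:
    """speedup(p) = performance with p processors / performance with 1 processor."""
    return perf_p / perf_1


def efficiency(perf_p: float, perf_1: float, p: int) -> float:
    """efficiency(p) = speedup(p) / p."""
    return speedup(perf_p, perf_1) / p


# Hypothetical measurements: processors -> transactions per second.
measured_tps = {1: 100.0, 2: 190.0, 4: 340.0, 8: 520.0}

for p, tps in measured_tps.items():
    s = speedup(tps, measured_tps[1])
    e = efficiency(tps, measured_tps[1], p)
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

In this invented data set the speedup keeps rising while the efficiency falls from 1 toward 0.65, which is the pattern the card describes.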

19
Q

What is parallel computing?

A

A form of computing in which many processors work simultaneously to provide more computational power and significantly reduce the total computation time.

20
Q

What is elasticity?

A
  • dynamic adaptation of the capacity to a change in the workload
  • no shutdown/restart required
  • shrink capacity, if workload decreases
  • increase capacity, if workload increases
21
Q

What is autoscaling?

A

Cloud computing feature that enables organizations to scale cloud services such as server capacities or VMs up or down automatically, based on defined situations such as traffic or utilization levels.

22
Q

What is a backend service?

A

A service that is needed to answer the requests that arrive at the frontend

23
Q

What is vertical scaling (scaling up)?

A

Scale the server on which the service is running.
You can increase the capacity of a single service instance by increasing its resources:

  • increase CPU time percentage
  • increase clock frequency
  • add more cores
  • replace existing resources with more powerful ones
24
Q

Pros of vertical scaling

A
  • easy to replace a resource with a more powerful one
  • it does not require a re-design of the application
25
Q

Cons of vertical scaling

A
  • more powerful resources might be too expensive
  • resource capacity is limited
  • replacing resources causes service interruption
26
Q

What is horizontal scaling (scaling out)?

A
  • increasing the capacity of a service by creating more instances (assumption: each service instance comes with its own resources)
27
Q

What are the pros of horizontal scaling?

A
  • no requirement for more powerful hardware
  • provides a long term solution for scaling
28
Q

What are the cons of horizontal scaling?

A
  • increased amount of resources comes with more management overhead
  • horizontal scaling requires a distributed software architecture
29
Q

What is an auto-scaler?

A

A system that decides how many servers (resources) are provided to the application. A monitor (e.g. Amazon CloudWatch) measures metrics from the servers and passes them to the auto-scaler.

30
Q

What is the autoscaling policy about?

A

The set of rules that the autoscaling system uses to adapt the amount of resources

31
Q

3 autoscaling approaches

A
  1. Reactive
  2. Scheduled
  3. Predictive
32
Q

What is reactive autoscaling?

A
  • detect under/overloaded service
  • scale in/out or down/up according to policy
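
The two steps above can be sketched as a small decision function. The thresholds (0.8 and 0.3), the instance bounds, and the one-instance step are assumptions chosen for illustration; the utilization samples are treated as given and do not react to the instance count.

```python
def reactive_step(utilization: float, instances: int, *,
                  upper: float = 0.8, lower: float = 0.3,
                  min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count after one monitoring interval."""
    if utilization > upper:   # overloaded service detected: scale out
        return min(instances + 1, max_instances)
    if utilization < lower:   # underloaded service detected: scale in
        return max(instances - 1, min_instances)
    return instances          # within bounds: no scaling action


def run(utilizations, instances=2):
    """Apply the policy to a sequence of monitored utilization samples."""
    trace = []
    for u in utilizations:
        instances = reactive_step(u, instances)
        trace.append(instances)
    return trace


print(run([0.9, 0.85, 0.5, 0.2, 0.1, 0.1]))
```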
33
Q

What is scheduled autoscaling?

A

  • policy specifies scaling events (time-stamped scaling actions)
  • apply scaling actions at the appropriate time
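
A minimal sketch of such a policy, assuming each scaling event simply sets a desired instance count at a timestamp. The `ScalingEvent` type, the dates, and the counts are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScalingEvent:
    at: datetime            # when the scaling action takes effect
    desired_instances: int  # capacity to set at that time


def capacity_at(now: datetime, events: list, default: int) -> int:
    """Capacity set by the most recent event that is not later than `now`."""
    desired = default
    for event in sorted(events, key=lambda e: e.at):
        if event.at <= now:
            desired = event.desired_instances
        else:
            break
    return desired


# Hypothetical policy: scale out for business hours, scale in at night.
policy = [
    ScalingEvent(datetime(2024, 1, 8, 8, 0), 6),
    ScalingEvent(datetime(2024, 1, 8, 20, 0), 2),
]
print(capacity_at(datetime(2024, 1, 8, 9, 30), policy, default=1))
```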

34
Q

What is predictive autoscaling?

A
  • continuously predict future workloads
  • if workloads will change, schedule scaling actions ahead in time
  • lets you circumvent scaling latency and leaves room for more time-consuming scaling decisions
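
As a rough illustration of the bullets above, the sketch below predicts the next workload by naive linear extrapolation and derives the number of instances to schedule ahead of time. The forecasting method, the request rates, and the per-instance capacity of 60 requests per second are all assumptions; real predictive scalers typically use more robust forecasting models.

```python
import math


def forecast_next(history: list) -> float:
    """Naive forecast: continue the trend of the last two observations."""
    if not history:
        raise ValueError("history must not be empty")
    if len(history) < 2:
        return float(history[-1])
    return max(0.0, history[-1] + (history[-1] - history[-2]))


def instances_needed(workload: float, capacity_per_instance: float) -> int:
    """Smallest instance count whose combined capacity covers the workload."""
    return max(1, math.ceil(workload / capacity_per_instance))


# Hypothetical request rates (requests per second) from past intervals.
history = [100.0, 150.0, 200.0]
predicted = forecast_next(history)
print(predicted, instances_needed(predicted, capacity_per_instance=60.0))
```

Because the instance count is computed for the next interval rather than the current one, the new instances can be started before the predicted load arrives.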
35
Q

Two types of auto-scalers

A
  • resource-centric
  • service-centric
36
Q

What is a resource-centric auto-scaler?

A
  • scaling actions modify resources
  • services are implicitly adapted
37
Q

What is a service-centric auto-scaler?

A
  • scaling actions modify the number of service instances
  • resources are implicitly adapted
38
Q

What is AWS reactive autoscaling?

A

Resource-centric: it scales the number of VMs

39
Q

What is the AWS Auto Scaling Group?

A
  • set of VMs with the same launch template
  • contains a collection of EC2 instances (virtual servers) that are treated as a logical grouping for the purpose of automatic scaling and management
  • can optionally have a load balancer, and scales out by creating more instances from the launch template
40
Q

AWS Scaling Policies

A
  • target tracking scaling
  • simple scaling
  • step scaling
41
Q

What is target tracking scaling?

A
  • automatically adjusts resources to keep a metric at a target value
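
One way to picture target tracking is a proportional rule: resize the group so that the average metric would return to the target. This is an assumed simplification for illustration, not AWS's actual algorithm; the function name, bounds, and numbers are hypothetical.

```python
import math


def target_tracking(current_instances: int, metric: float, target: float, *,
                    min_instances: int = 1, max_instances: int = 20) -> int:
    """Instance count that would bring the average metric back to the target,
    assuming the load spreads evenly across instances."""
    if target <= 0:
        raise ValueError("target must be positive")
    desired = math.ceil(current_instances * metric / target)
    return max(min_instances, min(desired, max_instances))


# 4 instances at 75% average CPU with a 50% target.
print(target_tracking(4, metric=75.0, target=50.0))
```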
42
Q

What is simple scaling?

A
  • trigger based on a metric, a threshold, and a condition
    e.g. metric > threshold: adjust #VMs

e.g. we target a CPU load of 50%: if the load is higher, we scale out; if it falls below 50%, we scale in. Once the threshold is passed, the number of VMs changes by a fixed count or a fixed percentage.
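
The 50% CPU example can be sketched as follows. The function name, the default step of one VM, and the lower bound of one VM are assumptions made for the sketch.

```python
import math
from typing import Optional


def simple_scaling(current_vms: int, cpu: float, threshold: float = 50.0,
                   step: int = 1, percent: Optional[float] = None) -> int:
    """Change the VM count by a fixed number, or by a fixed percentage of the
    current count if `percent` is given, once the threshold is passed."""
    if percent is not None:
        delta = max(1, math.ceil(current_vms * percent / 100))
    else:
        delta = step
    if cpu > threshold:   # load too high: scale out
        return current_vms + delta
    if cpu < threshold:   # load too low: scale in, keeping at least one VM
        return max(1, current_vms - delta)
    return current_vms


print(simple_scaling(4, cpu=70.0))               # fixed step of one VM
print(simple_scaling(10, cpu=70.0, percent=20))  # fixed percentage of 20%
```

Unlike step scaling, the size of the adjustment here does not depend on how far the metric is from the threshold.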

43
Q

What is step scaling?

A
  • the adjustment depends on the amount by which the threshold is breached
    specify a metric, a threshold, and steps based on the size of the breach
    0 to 10%: 0%
    10 to 20%: 10%
    20 to infinity%: 30%
    0 to minus infinity%: 10%
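
The upper three steps listed above can be sketched as a lookup table. The sketch assumes the ranges are measured in percentage points above the threshold and that the right-hand values are capacity increases in percent of the current size. The scale-in step on the last line is left out because the card does not state its direction unambiguously.

```python
import math

# (breach lower bound, breach upper bound, capacity change in percent)
STEPS = [
    (0.0, 10.0, 0.0),
    (10.0, 20.0, 10.0),
    (20.0, math.inf, 30.0),
]


def step_adjustment(metric: float, threshold: float, steps=STEPS) -> float:
    """Capacity change in percent for the step that contains the breach."""
    breach = metric - threshold
    for lower, upper, change in steps:
        if lower <= breach < upper:
            return change
    return 0.0  # no matching step (e.g. metric below threshold): no change


def apply_step(current_vms: int, metric: float, threshold: float) -> int:
    """New VM count after applying the matching step, rounded up."""
    change = step_adjustment(metric, threshold)
    return current_vms + math.ceil(current_vms * change / 100)


print(apply_step(10, metric=65.0, threshold=50.0))
print(apply_step(10, metric=90.0, threshold=50.0))
```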