L8 - Cluster Management Flashcards

Question 1

Q

What is resource allocation?

Answer

A

how much CPU/DRAM/disk/net to allocate to each app

Question 2

Q

What is resource assignment?

Answer

A

What should run on which physical nodes?

Question 3

Q

What is private resource allocation? What is its other name?

Answer

A

each app receives a private, static set of resources

static partitioning

Question 4

Q

Advantages of static partitioning?

Answer

A

simplicity
performance isolation
allows specialised HW (e.g. not everyone needs a GPU)

Question 5

Q

Disadvantages of static partitioning?

Answer

A

low utilisation
hard to handle failures
hard to maintain

about 2 and 3: it is not clear how to migrate an app off a machine

Question 6

Q

What 3 properties do we want the scheduler to fulfil in case of shared resource assignment?

Answer

A

Fairness
Efficient resource usage
Isolation

Question 7

Q

List the algorithms from the lecture for shared resource assignment.

Answer

A

Fair queueing (extends 1) (for a single resource)
Weighted max-min fair queueing (extends 2)
Dominant resource fairness
Token bucket

Question 8

Q

What does work conserving mean? Which property implies that?

Answer

A

Resources should not remain idle while there are users whose demand is not fully satisfied.

This is implied by “Efficient resource usage”.

Question 9

Q

Why do we want work conserving schedulers?

Answer

A

It keeps resources well-utilised.

It maximises overall throughput across different users.

Question 10

Q

Name a strategy that is not work conserving.

Answer

A

time division multiplexing
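
extra: a minimal Python sketch of why static time slots are not work conserving. The function name and the numbers are made up for illustration; each user owns an equal slot and cannot borrow from an idle one.

```python
def tdm_allocate(capacity, demands):
    """Static time slots: each user owns capacity/n and cannot use more."""
    slot = capacity / len(demands)
    return [min(d, slot) for d in demands]


capacity, demands = 10, [1, 9]
alloc = tdm_allocate(capacity, demands)
idle = capacity - sum(alloc)                        # capacity left unused
unmet = sum(d - a for d, a in zip(demands, alloc))  # demand left unserved
print(alloc, idle, unmet)
```

Here capacity sits idle while the second user still has unmet demand, which is exactly what a work conserving scheduler must avoid.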

Question 11

Q

What are the different notions of fairness?

Answer

A

Max-min fairness
Dominant resource fairness

Question 12

Q

What are the properties of max-min fairness?

Answer

A

share guarantee: each user gets at least 1/n of the resource, unless their demand is less

strategy-proof: users are not better off by asking for more than they need
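
extra: a minimal Python sketch of max-min fairness for a single resource, using the usual water-filling idea (satisfy the smallest demands first, split what is left equally). The function name and the example demands are assumptions for illustration, not taken from the lecture.

```python
def max_min_fair(capacity, demands):
    """Water-filling: repeatedly split the remaining capacity equally
    among the users whose demand is not yet satisfied."""
    alloc = [0.0] * len(demands)
    active = [i for i, d in enumerate(demands) if d > 0]
    remaining = float(capacity)
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # Users whose leftover demand fits into the equal share get it in full.
        satisfied = [i for i in active if demands[i] - alloc[i] <= share]
        if not satisfied:
            # Nobody can be fully satisfied: everyone gets the equal share.
            for i in active:
                alloc[i] += share
            remaining = 0.0
            break
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = demands[i]
        active = [i for i in active if i not in satisfied]
    return alloc


print(max_min_fair(10, [2, 2.6, 4, 5]))
print(max_min_fair(10, [2, 2.6, 100, 5]))  # third user inflates their demand
```

In the second call the third user asks for far more than they need but ends up with the same allocation, which illustrates the strategy-proof property.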

Question 13

Q

What does DRF try to achieve?

Answer

A

identify the dominant resource share of each user and maximise the minimum dominant share across all users
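
extra: a minimal Python sketch of DRF by progressive filling, assuming each user runs identical tasks with a fixed per-task demand vector. The function name, the `max_tasks` guard, and the example cluster (9 CPUs, 18 GB; tasks of <1 CPU, 4 GB> and <3 CPU, 1 GB>) are illustrative assumptions.

```python
def drf(capacity, demands, max_tasks=1000):
    """Repeatedly launch one task for the user with the smallest dominant
    share; stop as soon as that user's next task no longer fits."""
    n_res = len(capacity)
    used = [0.0] * n_res
    tasks = [0] * len(demands)
    dominant = [0.0] * len(demands)
    for _ in range(max_tasks):
        u = min(range(len(demands)), key=lambda i: dominant[i])
        need = demands[u]
        if any(used[r] + need[r] > capacity[r] for r in range(n_res)):
            break
        for r in range(n_res):
            used[r] += need[r]
        tasks[u] += 1
        # Dominant share: the largest fraction of any one resource the user holds.
        dominant[u] = max(tasks[u] * need[r] / capacity[r] for r in range(n_res))
    return tasks, dominant


tasks, dominant = drf(capacity=[9, 18], demands=[[1, 4], [3, 1]])
print(tasks, dominant)
```

The first user is memory-dominant and the second is CPU-dominant; the loop ends with both dominant shares equal, which is the max-min outcome on dominant shares.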

Question 14

Q

What is the drawback of DRF?

Answer

A

not work conserving

Question 15

Q

What is the issue with max-min fairness?

Answer

A

With max-min fairness, a user's allocation depends on the demands of the other users sharing the resource, so there is no performance predictability.

Question 16

Q

What is the goal of token buckets?

Answer

A

guarantee a baseline bandwidth, but also allow bounded bursts

Question 17

Q

How does the token bucket idea work?

Answer

A

Control traffic by delaying requests until they accumulate sufficient tokens.
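
extra: a minimal Python sketch of a token bucket, assuming requests arrive in time order and that time is passed in explicitly (so the example is deterministic). The class and parameter names are made up for illustration.

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate      # tokens added per second (baseline bandwidth)
        self.burst = burst    # bucket capacity (bounds the burst size)
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the last refill

    def _refill(self, now):
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def delay_for(self, now, cost=1):
        """Charge a request arriving at `now` and return how long it waits."""
        self._refill(now)
        self.tokens -= cost  # may go negative: the request waits for future tokens
        return 0.0 if self.tokens >= 0 else -self.tokens / self.rate


bucket = TokenBucket(rate=2, burst=3)
print([bucket.delay_for(0.0) for _ in range(4)])  # a burst of four requests
print(bucket.delay_for(10.0))                     # a request after a quiet period
```

The first three requests pass immediately (the bounded burst), the fourth is delayed until a token has accumulated, and after a quiet period the bucket is full again.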

Question 18

Q

What does resource assignment try to optimise?

Answer

A

performance
resource utilisation

Question 19

Q

Explain the first step of resource assignment.

Answer

A

Filter machines that satisfy hard constraints

e.g., VM may need a machine with a GPU

Question 20

Q

Explain the second step of resource assignment.

Answer

A

Rank candidate nodes to find the machine that best satisfies soft constraints

e.g., best-fit to avoid resource fragmentation
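
extra: a minimal Python sketch of the two steps (filter on hard constraints, then rank on a soft constraint). The machine records, field names, and the CPU-only best-fit score are assumptions for illustration.

```python
def assign(task, machines):
    # Step 1: keep only machines that satisfy the hard constraints.
    feasible = [
        m for m in machines
        if m["free_cpu"] >= task["cpu"] and (m["gpu"] or not task["needs_gpu"])
    ]
    if not feasible:
        return None
    # Step 2: best-fit ranking, i.e. pick the machine left with the least spare CPU.
    best = min(feasible, key=lambda m: m["free_cpu"] - task["cpu"])
    return best["name"]


machines = [
    {"name": "a", "free_cpu": 16, "gpu": False},
    {"name": "b", "free_cpu": 4, "gpu": True},
    {"name": "c", "free_cpu": 8, "gpu": True},
]
print(assign({"cpu": 3, "needs_gpu": True}, machines))
print(assign({"cpu": 6, "needs_gpu": False}, machines))
print(assign({"cpu": 32, "needs_gpu": False}, machines))
```

Best-fit keeps the large machine free for tasks that really need it, which is how it limits fragmentation.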

Question 21

Q

List different methods for cluster management system architecture.

Answer

A

centralised
distributed
hierarchical e.g. two-level

Question 22

Q

Next questions are about Borg. First, what is Borg?

Answer

A

Google’s centralised cluster manager

Question 23

Q

What does Borgmaster do?

Answer

A

It is the main scheduler.
It polls Borglets every few seconds

extra: 5 replicas

Question 24

Q

What does Borglet do?

Answer

A

Manages and monitors the tasks and resources on the machine it runs on.

extra: a Borg cell has about 10k heterogeneous machines, each running a Borglet

Question 25

Q

What strategies does Borg deploy to achieve high utilisation?

Answer

A

admission control
efficient task-packing
over-commitment
machine sharing

Question 26

Q

What is Kubernetes?

Answer

A

Cluster management for containerised applications.

It manages the complexity of the container lifecycle and of allocating and setting up hardware resources for the containers.
like an OS for your cloud cluster

Question 27

Q

List container orchestration primitives!

Answer

A

Resource scaling
Resource allocation
Load balancing
Lifecycle and health
Naming and discovery
Storage volumes
Logging and monitoring
Debugging and introspection
Identity and authorization

Question 28

Q

Resource scaling

Answer

A

make sets of containers bigger or smaller

Question 29

Q

Resource allocation

Answer

A

decide where my containers should run

Question 30

Q

Load balancing

Answer

A

distribute traffic across a set of containers

Question 31

Q

Lifecycle and health

Answer

A

keep my containers running despite failures

Question 32

Q

Naming and discovery

Answer

A

find where my containers are now

Question 33

Q

Storage volumes

Answer

A

provide data to containers

Question 34

Q

Logging and monitoring

Answer

A

track what’s happening with my containers

Question 35

Q

Debugging and introspection

Answer

A

enter or attach to containers

Question 36

Q

Identity and authorization

Answer

A

control who can do things to my containers

Question 37

Q

What do the Kubernetes containers do?

Answer

A

Handle package dependencies

Question 38

Q

What is a pod?

Answer

A

A pod is the unit of scheduling and migration in Kubernetes.

a group of containers that share the same properties

Question 39

Q

List those properties!

Answer

A

Lifecycle: live together, die together
Network: same IP address, same routes, iptables
Storage volumes: can share data
Intended to run a common task

Question 40

Q

Kubernetes service?

Answer

A

A group of pods that work together

extra: provides load balancing among pod replicas

Question 41

Q

How do you control pod placement in Kubernetes?

Answer

A

use labels and selectors
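
extra: a minimal Python sketch of equality-based label matching, the idea behind selecting nodes by label. This is not the Kubernetes API; the function name, node names, and labels are made up for illustration.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


nodes = {
    "node-a": {"disk": "ssd", "gpu": "true"},
    "node-b": {"disk": "hdd"},
    "node-c": {"disk": "ssd"},
}
selector = {"disk": "ssd"}
eligible = sorted(name for name, labels in nodes.items() if matches(selector, labels))
print(eligible)
```

A pod carrying this selector would only be eligible for the nodes labelled `disk=ssd`; an empty selector matches every node.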

Question 42

Q

How do you keep N pods running?

Answer

A

use ReplicaSets: a layer on top of the Pod API that ensures N copies of a pod are running
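
extra: a minimal Python sketch of the reconcile idea behind keeping N copies running: compare the desired count with what is running, then create or delete pods to close the gap. This is not the real controller; the function and the pod naming scheme are assumptions.

```python
def reconcile(desired, running):
    """Return the pod list after reconciliation and the actions taken."""
    pods = list(running)
    actions = []
    next_id = 0
    while len(pods) < desired:
        name = f"pod-{next_id}"
        next_id += 1
        if name in pods:
            continue  # name already taken, try the next one
        pods.append(name)
        actions.append(("create", name))
    while len(pods) > desired:
        actions.append(("delete", pods.pop()))
    return pods, actions


print(reconcile(3, ["pod-0"]))                    # two pods died: recreate them
print(reconcile(1, ["pod-0", "pod-1", "pod-2"]))  # scaled down: delete extras
```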

Question 43

Q

What does the Horizontal Pod Autoscaler do?

Answer

A

automatically scale pods as needed
- based on CPU utilisation (or custom metrics)
- can set user-defined min/max bounds
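
extra: a minimal Python sketch of the scaling rule, assuming the replica formula from the Kubernetes HPA documentation, desired = ceil(current * currentMetric / targetMetric), clamped to the user-defined bounds. It ignores details of the real controller such as its tolerance band and stabilisation window; the function and parameter names are made up.

```python
import math


def desired_replicas(current, current_util, target_util, min_r, max_r):
    """Scale the replica count in proportion to how far the metric is
    from its target, then clamp to the user-defined min/max bounds."""
    wanted = math.ceil(current * current_util / target_util)
    return max(min_r, min(max_r, wanted))


print(desired_replicas(4, 90, 60, min_r=1, max_r=10))   # overloaded: scale out
print(desired_replicas(4, 30, 60, min_r=1, max_r=10))   # underused: scale in
print(desired_replicas(4, 300, 60, min_r=1, max_r=10))  # capped by max bound
```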

Question 44

Q

What is a potential problem with relying only on CPU utilization as a scaling metric?

Answer

A

good for compute-bound apps, but the bottleneck may be elsewhere, e.g. I/O

Question 45

Q

What other metrics would you consider for auto-scaling besides CPU utilization?

Answer

A

memory capacity
memory BW
network BW

Question 46

Q

What properties does resource isolation try to achieve?

Answer

A

Applications must not be able to affect each other’s performance
Repeated runs of the same application should see similar behaviour

Question 47

Q

What are the resource allocation mechanisms in Kubernetes?

Answer

A

Request: How much of a resource (CPU, RAM) the container is asking to use, with a strong guarantee of availability

Limit: Max amount of a resource the container can access

Question 48

Q

Does the scheduler overcommit to requests?

Answer

A

No: the scheduler only places a pod on a node whose unreserved capacity covers the pod's requests, so requests are never overcommitted. Limits can be overcommitted.

Question 49

Q

List 3 Kubernetes Quality of Service classes.

Answer

A

Guaranteed: highest protection
Burstable: medium protection
Best effort: lowest protection

Question 50

Q

Relation of request and limit for Guaranteed class?

Answer

A

request > 0 && limit == request

Question 51

Q

Relation of request and limit for Burstable class?

Answer

A

request > 0 && limit > request

Question 52

Q

Relation of request and limit for Best effort class?

Answer

A

request == 0
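
extra: a minimal Python sketch of the three rules above for a single resource of a single container. Real Kubernetes derives the class from the CPU and memory settings of every container in the pod, so this is a simplification; the function name is made up, and `limit=None` stands for "no limit set".

```python
def qos_class(request, limit=None):
    """Classify one (request, limit) pair using the flashcard rules."""
    if request == 0:
        return "BestEffort"
    if limit == request:
        return "Guaranteed"
    return "Burstable"


print(qos_class(request=2, limit=2))
print(qos_class(request=1, limit=4))
print(qos_class(request=0))
```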

Question 53

Q

What are the advantages of centralised design?

Answer

A

can make globally optimal decisions

Question 54

Q

What are the drawbacks of centralised design?

Answer

A

scalability: hard to enforce consistency

Question 55

Q

Name 2 two-level cluster managers

Answer

A

Mesos and YARN

Question 56

Q

How does Mesos work?

Answer

A

Lecture on 05.04
Min: 3.5

Question 57

Q

List two distributed cluster management algorithms.

Answer

A

Omega and Sparrow

Question 58

Q

List two new challenges that serverless brings to cluster management, besides resource allocation and assignment.

Answer

A

resource scaling: How many containers (“slots”) to keep warm for a function?
request routing: To which node and “slot” do we send a particular invocation?

Question 59

Q

What does Quasar try to solve?

Answer

A

Over-provisioning

Question 60

Q

How does Quasar solve over-provisioning?

Answer

A

Don’t ask users for allocation request/resource demand.

They don’t really know it anyway.

Question 61

Q

What do the users specify in this case? (Quasar)

Answer

A

performance goals

Question 62

Q

What does the cluster manager do in this case? (Quasar)

Answer

A

profiles applications and dynamically adjusts resource allocations

Question 63

Q

How does the cluster manager understand resource/performance tradeoffs? (Quasar)

Answer

A

It combines the following:

Small signal from a short run of a new application
Large signal from previously run applications

Question 64

Q

What does the cluster manager do at the end? (still Quasar)

Answer

A

For each new application, it needs to recommend a resource allocation and assignment.

Answer 64

A

collaborative filtering

Answer 65

A

Predict preferences of new users given the preferences of other users, using SVD and PQ reconstruction.

Answer 66

A

scale-out
scale-up
HW heterogeneity
Interference

Answer 67

A

Use 4 nodes or a single node?

Answer 68

A

Use an 8-core VM or a single-core VM?

Answer 69

A

Step 1: short profiling runs produce initial performance data.

Step 2: collaborative filtering techniques fill in missing data

Step 3: Greedy scheduler uses output to find the number and type of resources that maximise utilisation and performance.

Answer 70

A

Resource allocation: how many resources should an app get?
Resource assignment: which specific resources does an app get?
Variability: within an app (different phases), within datasets, and load
