8 - Load Balancing Flashcards
Where might load balancing apply?
Master-worker parallel computation
Traffic management
Collision avoidance
Challenges of load balancing
Procedure complexity
Outdated info
Fault Tolerance
Scalability
Heterogeneous hardware - different speeds/capacity
Static load balancing
Doesn’t consider state
All set in advance
Static: Attributes
Do not change
Tasks needed to be regular
Round robin algorithm
Assign next ball to next bin then start again
Weighted round robin
A larger bin is given more balls/jobs
Round Robin General problem
State not being considered means that jobs with different durations/temporarily slow servers affect it.
Simple Randomised
For each ball, choose a random bin. No state info
Simple randomised: Examples
Client side load balancing - randomly choose ip
Hash tables - hash value can be used
Hash function with n buckets
Simple using modulo eg h(ID) = ID mod n
30 for 6 buckets = 0
11 = 5
Random vs Hash
Random cannot find same item again
Hashing; item can be found knowing ID and hash function
Dynamic Load Balancer
Based on current state.
Can move tasks where necessary
Allows greater modularity
Least loaded Bin
Server side balancer forwards request to backends.
Least time allocation
Allocation to bin that will serve at earliest time. Considers speed and queues
Multiple Dispatcher Problem
They will send their request to the same server (herd behaviour) and has opposite effect on performance.
Goren et al Multi dispatch solution
Round based algorithm. Allocate to bins based on probabilities in current round.
Need to know server queues, processing rates and job arrival rates
Dynamic Randomised
Choose multiple random bins, allocate the ball to the bin of lowest load
Power of two choices improvements
In terms of m balls and n bins
Case m = n: Exp improvement
Case m > n: Max load above avg is independent of num balls m
When does multiple-choice lose some power?
In parallel settings
Parallel Scenario
Balls arrive in batches of n
Allocated using GREEDY[d] but bin loads only updated between batches.
Decisions may be based on outdated load info
g-bounded process
Allocated using two choices, but if load difference is g or less then process makes mistake.
Gap between average load in parallel, 1 + β process and g-bounded scenario
Independent of m (number of balls)
1 + β process
(balls into bins)
Some balls allocated using 2 choices, the rest using 1.
2 bins queries, and with probability 1 - β, ball is allocated to random bin. 1+ β choice for each ball
Probability that none of m = n balls fall into particular bin
1/e