5.0 Branch prediction Flashcards

1
Q

What stage is branch prediction a part of?

A

Instruction fetch.
This is to allow the stage to fetch as many, and as useful, instructions as possible at the same time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What two things does branch prediction do?

A

Predict if branch is taken, and the target address

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How does pipeline depth affect cost of mispredicted branches?

A

Cost is proportional to pipeline depth

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are the two ways of doing branch direction prediction

A

Statically: Before program execution, done by compiler/programmer inserting a hint

Dynamically: during program execution, use global and local history to make predictions, done in hardware

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are some advantages with static branch prediction?

A

Easy to implement, little hardware needed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are a disadvantage with static branch prediction

A

Prediction is the same regardless of dynamic behaviour

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are the three types of static prediction?

A

Rule, program, profile-based

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Define rule-based static prediction

A

Implement a rule that is always used, in the pipeline

Rules:
- Not taken (sequential fetch)
- Always taken (requires compute branch target, a little more complex, branch target known at decode stage)
- Back taken forward not taken (combine these two, loops are typically taken)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Define program-based prediction

A

Typically more accurat then rule-based

requires a hint in opcode, specifying behaviour for different branch-types:

  • If branch is loop - take
  • if compare pointer to NULL - not take
  • if compare pointers - take branch to inequality
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is profile based prediction?

A

When you first compile and execute a binary with a certain set of training input data.

This run collects profile data that defines how branching is taken statistically

Use this to predict branches. over 50% - taken, under - not taken

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What context can dynamic branch prediction use to predict branches?

A

Local history: The branch’s own history

Global history: Other branches’ history

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

what is a bimodal predictor

A

The hash of the PC is used as a pointer into a buffer holding a counter that defines branch outcomes

Using a two bit predictor, predict taken if 11 or 10, and not taken if 01 or 00. if branch was actually taken +1, if not taken -1.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the components of 2-level predictors?
How does a global 2-level predictor work?

A

Uses two levels of branch history to make predictions (bimodal only uses the branch address)

Two level use local history or global history

History: Outcome of prior branch executions

History acts as shift register. Newest outcome is inserted, oldest is discarded

1st level for a global predictor is a Branch History Register containing h bits and the global history. h bits shows branch outcome of last h branches

Then take the hash of the branch address and concatenate with the global history and use this as index to a pattern history table containing counters.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Why does a two-level global predictor work?

A

Because there is correlation between branches.

If a branch is taken - it is probable that the next also will be taken

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a two level predictor using local history?

A

Have a 1st level branch history table containing local history.

Hash the PC and use this as index to the BHT.

Concat the PC hash with the entry from BHT and index into pattern history table that contains counter.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Why does a two level local predictor work?

A

Branches have local behaviour (alternating, loops)

17
Q

How does the number of bits used in the PC hash affect the branch predicor implementation-type?

A

m: number of bits retrieved from the PC, used to index pattern history table

if m is 0: all branches share the same pattern history table - this would be a global history table

if m is not 0: per-address pattern history table

18
Q

What is a gshare predictor

A

Instead of concatinating the h bits from the global history with the m bits from the branch address, these are XORed

19
Q

What are some causes of mispredictions when using dynamic prediction?

A

Some branches are just hard to predict

Can be caused by interference: predictor is to small for branches - branches with different behaviour may be mapped onto the same PHT entry

Branch behaviour doesn’t match branch predictor type. Global predictor, but branch doesn’t use global state

20
Q

What is hybrid branch prediction?

A

combine different type of branch predictors, and learn which ones are most accurate.

21
Q

What is a tournament predictor?

A

PC hash is used to index into the metapredictor table. This table contains a counter that records the behaviour of the predictors used. The metapredictor entry is used to choose what predictor result to use. If entry < 2 P1 is used, entry >= 2 P2 is used.

The metapredictor counter is updated on every prediction
- if both predictors correct/incorrect - do nothing
- if P1 correct - decrement
- if P2 correct - increment

Entries in P1 and P2 are updated every prediction, regardless of which one is used

22
Q

What is a BTB (Branch target buffer) or Branch Target Address Cache (BTAC)

A

Input: branch address (PC)
Output: Target address

Accessed in the same cycle as we access the instruction cache

If an instruction is a branch, we use the target address as PC in next cycle

Used in IF stage. If target is cached, we are able to continue fetching without loosing cycles

Target is used to speculatively fetch in next cycle

23
Q

What is a RAS (return address stack)

A

Have a circular buffer where you push the return address (PC + 4) when doing a function call. Pop address upon return.

It is easy to predict function calls, not return address. This is because multiple addresses can call a function - and therefore be a possible return address

If depth of calls exceeds the RAS-size, we will have problems

24
Q

What is speculative execution?

A

Predict branch target address and start execution along predicted path.

there might be multiple branches in flight along the predicted path - to keep track of this, tags are added to the instructions along the given paths.

25
Q

What happens when a branch is correctly predicted?

A

Tags are deallocated
instructions become non-speculative

26
Q

What happens when a branch is mispredicted?

A

Wrong path instructions are nullified
Start fetching instructions along correct path

27
Q

What is the structure of the branch target buffer (BTB)

A

PC is used as input(tag) to table

If we hit, take the target address and use it speculatively as the next PC to fetch in next cycle

28
Q

How does a BTB/RAS access work in hardware?

A

On function call, take branch-address + 4 and store into RAS.

In parallell, you access the BTB with the branch address to get the target address.

I the branch is a return, instead of taking the result from the BTB, pop the top most element of the RAS, and use this as the target address.

29
Q

How does return address prediction work in the pipeline?

A

Fetch stage:
- Use branch address (comes from PC) to access BTB/RAS
- If you hit in the BTB use this as target address, if not, continue fetching continually (PC + 4)
- (BTB hit is used as control signal in mux deciding between target address and PC +4)

Decode stage:
- Perform branch prediction - gives taken/not-taken
- Perform address calculation - gives the actual branch target
- If the branch is taken, and the predicted target address from the fetch stage is wrong - use the calculated address (causes stall for a cycle)

If BTB/RAS address != calculated address - causes stall and will need to insert a bubble

30
Q

What is speculative branch execution?

A

Predict target address and start fetching and executing along this path.

There might be multiple branches in flight during this.

To keep track of each branch, a tag is added to each speculatively taken execution path

31
Q

What happens when a branch evaluates to correctly predicted

A

Tags are deallocated
Instructions are set as non-speculative

32
Q

What happens when a branch evaluates to mispredicted

A

Instructions along the wrong path are nullified.
Start fetching along the correct path.