5.0 Branch prediction Flashcards by E Mi

What stage is branch prediction a part of?

Instruction fetch.
This is to allow the stage to fetch as many, and as useful, instructions as possible at the same time.

How well did you know this?

Not at all

Perfectly

What two things does branch prediction do?

Predict if branch is taken, and the target address

How well did you know this?

Not at all

Perfectly

How does pipeline depth affect cost of mispredicted branches?

Cost is proportional to pipeline depth

How well did you know this?

Not at all

Perfectly

What are the two ways of doing branch direction prediction

Statically: Before program execution, done by compiler/programmer inserting a hint

Dynamically: during program execution, use global and local history to make predictions, done in hardware

How well did you know this?

Not at all

Perfectly

What are some advantages with static branch prediction?

Easy to implement, little hardware needed.

How well did you know this?

Not at all

Perfectly

What are a disadvantage with static branch prediction

Prediction is the same regardless of dynamic behaviour

How well did you know this?

Not at all

Perfectly

What are the three types of static prediction?

Rule, program, profile-based

How well did you know this?

Not at all

Perfectly

Define rule-based static prediction

Implement a rule that is always used, in the pipeline

Rules:
- Not taken (sequential fetch)
- Always taken (requires compute branch target, a little more complex, branch target known at decode stage)
- Back taken forward not taken (combine these two, loops are typically taken)

How well did you know this?

Not at all

Perfectly

Define program-based prediction

Typically more accurat then rule-based

requires a hint in opcode, specifying behaviour for different branch-types:

If branch is loop - take
if compare pointer to NULL - not take
if compare pointers - take branch to inequality

How well did you know this?

Not at all

Perfectly

What is profile based prediction?

When you first compile and execute a binary with a certain set of training input data.

This run collects profile data that defines how branching is taken statistically

Use this to predict branches. over 50% - taken, under - not taken

How well did you know this?

Not at all

Perfectly

What context can dynamic branch prediction use to predict branches?

Local history: The branch’s own history

Global history: Other branches’ history

How well did you know this?

Not at all

Perfectly

what is a bimodal predictor

The hash of the PC is used as a pointer into a buffer holding a counter that defines branch outcomes

Using a two bit predictor, predict taken if 11 or 10, and not taken if 01 or 00. if branch was actually taken +1, if not taken -1.

How well did you know this?

Not at all

Perfectly

What are the components of 2-level predictors?
How does a global 2-level predictor work?

Uses two levels of branch history to make predictions (bimodal only uses the branch address)

Two level use local history or global history

History: Outcome of prior branch executions

History acts as shift register. Newest outcome is inserted, oldest is discarded

1st level for a global predictor is a Branch History Register containing h bits and the global history. h bits shows branch outcome of last h branches

Then take the hash of the branch address and concatenate with the global history and use this as index to a pattern history table containing counters.

How well did you know this?

Not at all

Perfectly

Why does a two-level global predictor work?

Because there is correlation between branches.

If a branch is taken - it is probable that the next also will be taken

How well did you know this?

Not at all

Perfectly

What is a two level predictor using local history?

Have a 1st level branch history table containing local history.

Hash the PC and use this as index to the BHT.

Concat the PC hash with the entry from BHT and index into pattern history table that contains counter.

How well did you know this?

Not at all

Perfectly

Why does a two level local predictor work?

Study These Flashcards

Branches have local behaviour (alternating, loops)

How does the number of bits used in the PC hash affect the branch predicor implementation-type?

Study These Flashcards

m: number of bits retrieved from the PC, used to index pattern history table

if m is 0: all branches share the same pattern history table - this would be a global history table

if m is not 0: per-address pattern history table

What is a gshare predictor

Study These Flashcards

Instead of concatinating the h bits from the global history with the m bits from the branch address, these are XORed

What are some causes of mispredictions when using dynamic prediction?

Study These Flashcards

Some branches are just hard to predict

Can be caused by interference: predictor is to small for branches - branches with different behaviour may be mapped onto the same PHT entry

Branch behaviour doesn’t match branch predictor type. Global predictor, but branch doesn’t use global state

What is hybrid branch prediction?

Study These Flashcards

combine different type of branch predictors, and learn which ones are most accurate.

What is a tournament predictor?

Study These Flashcards

PC hash is used to index into the metapredictor table. This table contains a counter that records the behaviour of the predictors used. The metapredictor entry is used to choose what predictor result to use. If entry < 2 P1 is used, entry >= 2 P2 is used.

The metapredictor counter is updated on every prediction
- if both predictors correct/incorrect - do nothing
- if P1 correct - decrement
- if P2 correct - increment

Entries in P1 and P2 are updated every prediction, regardless of which one is used

What is a BTB (Branch target buffer) or Branch Target Address Cache (BTAC)

Study These Flashcards

Input: branch address (PC)
Output: Target address

Accessed in the same cycle as we access the instruction cache

If an instruction is a branch, we use the target address as PC in next cycle

Used in IF stage. If target is cached, we are able to continue fetching without loosing cycles

Target is used to speculatively fetch in next cycle

What is a RAS (return address stack)

Study These Flashcards

Have a circular buffer where you push the return address (PC + 4) when doing a function call. Pop address upon return.

It is easy to predict function calls, not return address. This is because multiple addresses can call a function - and therefore be a possible return address

If depth of calls exceeds the RAS-size, we will have problems

What is speculative execution?

Study These Flashcards

Predict branch target address and start execution along predicted path.

there might be multiple branches in flight along the predicted path - to keep track of this, tags are added to the instructions along the given paths.

What happens when a branch is correctly predicted?

Tags are deallocated instructions become non-speculative

What happens when a branch is mispredicted?

Wrong path instructions are nullified Start fetching instructions along correct path

What is the structure of the branch target buffer (BTB)

PC is used as input(tag) to table If we hit, take the target address and use it speculatively as the next PC to fetch in next cycle

How does a BTB/RAS access work in hardware?

On function call, take branch-address + 4 and store into RAS. In parallell, you access the BTB with the branch address to get the target address. I the branch is a return, instead of taking the result from the BTB, pop the top most element of the RAS, and use this as the target address.

How does return address prediction work in the pipeline?

Fetch stage: - Use branch address (comes from PC) to access BTB/RAS - If you hit in the BTB use this as target address, if not, continue fetching continually (PC + 4) - (BTB hit is used as control signal in mux deciding between target address and PC +4) Decode stage: - Perform branch prediction - gives taken/not-taken - Perform address calculation - gives the actual branch target - If the branch is taken, and the predicted target address from the fetch stage is wrong - use the calculated address (causes stall for a cycle) If BTB/RAS address != calculated address - causes stall and will need to insert a bubble

What is speculative branch execution?

Predict target address and start fetching and executing along this path. There might be multiple branches in flight during this. To keep track of each branch, a tag is added to each speculatively taken execution path

What happens when a branch evaluates to correctly predicted

Tags are deallocated Instructions are set as non-speculative

What happens when a branch evaluates to mispredicted

Instructions along the wrong path are nullified. Start fetching along the correct path.

5.0 Branch prediction Flashcards

(32 cards)