Delayed branch
q Good news
Ø Just 1 cycle to figure out what the right branch address is
Ø So, not 2 or 3 cycles of potential NOP or stall
q Strange news
q Compiler
Ø Can improve the performance by coding the most
frequent case in the taken path.
1.
Pipeline status for predict-taken
Branch is taken: 1 stall
Instruction         1    2    3    4    5    6    7
44 BEQ R1, 24       IF   ID   EX   MEM  WB
48 AND R12, R2, R5       IF   idle idle idle idle
72 LW R4, 50(R7)              IF   ID   EX   MEM  WB
76                                 IF   ID   EX   MEM
80                                      IF   ID   EX
assume the branch to be taken and begin fetching and executing at the target.
Ø Only useful when the target is known before the branch outcome.
Ø No advantage at all for the MIPS 5-stage pipeline.
1.
Predict-taken
q Hardware
Ø Treat every branch as taken (evidence: more than 60% of branches are taken)
Ø As soon as the branch target address is computed,
1.
Recall: solve the hazard by inserting
[Figure: the instruction fetched after the branch is at address 48 or 72]
1.
The pipeline status
[Figure: pipeline timing for Branch instruction, Branch successor, Branch successor+1, Branch successor+2 and Branch successor+3, with idle and stall cycles after the branch]
q Compiler:
Ø Can improve the performance by coding the most frequent case in the untaken path.
Alternative is assuming the branch always
q Most branches(60%) are taken, so we should make
corrupt machine state
1.
Move the Branch Computation
1.
Move the Branch Computation more
store load
1.
Result: New & Improved MIPS
•Need just 1 extra cycle after the BEQ branch to know the right address
•On MIPS, it's called the branch delay slot
Instruction         1    2    3    4    5    6    7
40 ADD R30,R30,R30  IF   ID   EX   MEM  WB
44 BEQ R1, 24            IF   ID   EX   MEM  WB
48 AND R12, R2, R5            IF   ID   EX   MEM  WB
52 OR R13, R6, R2                  IF   ID   EX   MEM
Branch is taken: 1 stall
Ø Predict-taken
¡ Treat every branch as taken
Ø Delayed branch
q Note:
Ø Fixed hardware
Ø Compile time scheme using knowledge of hardware scheme and of branch behavior
half of the ideal performance.
1.
Always Stall Hurts the Not-taken
1.
How about assume Branch Not Taken
1.
What If Branch Was Taken…?
1.
How to deal with the taken branch?
q Consequences
Ø We mistakenly start executing the wrong
instructions
Ø To repair this, must make sure that they DO NOT
really execute
Ø In particular, must ensure they do not incorrectly corrupt machine state
1.
Control hazard
1.
Dealing with the control hazard
q Four simple solutions
Ø Freeze or flush the pipeline
Ø Predict-not-taken (Predict-untaken)
¡ Treat every branch as not taken
software scheduling, otherwise have to stall
Ø Control hazards
¡ Branch condition and the branch PC are not available in time to fetch an instruction on the next clock
Ø Data hazards
¡ Instruction depends on result of prior computation which
is not ready (computed or stored) yet
¡ OK, we did these, Double Bump, Forwarding path,
1.
Flushing: need only to insert one stall
Instruction         1    2    3    4    5    6    7
40 ADD R30,R30,R30  IF   ID   EX   MEM  WB
44 BEQ R1, 24            IF   ID   EX   MEM  WB
48 AND R12, R2, R5            IF   idle idle idle idle
48 or 72                           IF   ID   EX   MEM
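The flushing diagram above can be regenerated with a few lines of Python. This is only a sketch for checking such diagrams by hand: the names `STAGES`, `timeline` and `render` are made up for illustration, and a flushed instruction is modelled simply as keeping its IF and idling through the remaining four stages.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def timeline(start_cycle, flushed=False):
    """Map cycle -> stage for one instruction entering IF at start_cycle.

    Assumption for this sketch: a flushed instruction is fetched normally
    and then shows "idle" in every later stage.
    """
    cells = {}
    for offset, stage in enumerate(STAGES):
        keep = (offset == 0) or not flushed
        cells[start_cycle + offset] = stage if keep else "idle"
    return cells

def render(rows, n_cycles):
    """Render (label, start_cycle, flushed) rows as an aligned text diagram."""
    lines = []
    for label, start, flushed in rows:
        cells = timeline(start, flushed)
        stages = " ".join(f"{cells.get(c, ''):<4}" for c in range(1, n_cycles + 1))
        lines.append(f"{label:<18} {stages}".rstrip())
    return "\n".join(lines)

rows = [
    ("40 ADD R30,R30,R30", 1, False),
    ("44 BEQ R1, 24", 2, False),
    ("48 AND R12,R2,R5", 3, True),   # fetched behind the branch, then flushed
    ("48 or 72", 4, False),          # correct successor, fetched one cycle late
]
print(render(rows, n_cycles=7))
```

Changing the start cycle of the last row from 4 to 6 would model the slower case in which the successor is fetched three cycles late.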
Ø OK, it's always 1 cycle, and we always have to wait
Ø And on MIPS, this instruction always executes, no matter whether the branch is taken or not
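The "always executes" rule can be illustrated with a toy fetch trace. This is a hedged sketch, not a MIPS simulator: the function name `executed_addresses` is hypothetical, the program holds only the addresses that appear on the slides, and the target of `BEQ R1, 24` is assumed to be 48 + 24 = 72 (offset relative to the instruction after the branch), which matches the "48 or 72" label.

```python
def executed_addresses(program, branch_addr, target_addr, taken, max_steps=8):
    """Return the addresses executed, in order, with one delayed branch.

    Assumptions of this sketch: instructions are 4 bytes apart, there is a
    single branch, and the instruction right after it (the delay slot)
    executes whether or not the branch is taken.
    """
    trace = []
    pc = min(program)
    pending_target = None            # set while the delay slot is executing
    while pc in program and len(trace) < max_steps:
        trace.append(pc)
        next_pc = pc + 4
        if pending_target is not None:
            next_pc = pending_target  # delay slot done: now redirect
            pending_target = None
        elif pc == branch_addr and taken:
            pending_target = target_addr
        pc = next_pc
    return trace

program = {
    40: "ADD R30,R30,R30",
    44: "BEQ R1, 24",
    48: "AND R12, R2, R5",   # branch delay slot
    52: "OR R13, R6, R2",
    72: "LW R4, 50(R7)",
}
print(executed_addresses(program, branch_addr=44, target_addr=72, taken=True))
print(executed_addresses(program, branch_addr=44, target_addr=72, taken=False))
```

In both traces address 48 follows the branch; only the address after the delay slot differs (72 when taken, 52 when not).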
1.
Branch delay slot
1.
Stalls greatly hurt the
q Problem:
Ø With a 30% branch frequency and an ideal CPI of 1, how much is the performance hurt by inserting stalls?
q Answer:
Ø CPI = 1 + 30% × 3 = 1.9 (3 stall cycles per branch)
Ø This simple solution achieves only about half of the ideal performance.
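The arithmetic in the answer can be checked with a short Python sketch. The function name `cpi_with_branch_stalls` is made up for illustration; the inputs (30% branches, ideal CPI of 1, 3 stall cycles per branch) are the figures from the slide.

```python
def cpi_with_branch_stalls(ideal_cpi, branch_freq, stall_cycles):
    """Average CPI when every branch costs a fixed number of stall cycles."""
    return ideal_cpi + branch_freq * stall_cycles

cpi = cpi_with_branch_stalls(ideal_cpi=1.0, branch_freq=0.30, stall_cycles=3)
relative_performance = 1.0 / cpi   # throughput relative to the ideal CPI of 1
print(f"CPI = {cpi:.2f}, relative performance = {relative_performance:.2f}")
# prints: CPI = 1.90, relative performance = 0.53
```

A relative performance of about 0.53 is the "about half" quoted above.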
1.
Predict-not-taken
q Hardware:
Ø Treat every branch as not taken (or as a normal instruction)
¡ When the branch is not taken, the fetched instruction just continues to flow on. No stall at all.
¡ If the branch is taken, then restart the fetch at the branch target, which causes 3 stalls (the fetched instructions should be turned into no-ops).
q Control hazards can cause a greater performance loss for the MIPS pipeline than data hazards do.
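A rough estimate of what predict-not-taken costs on average can be sketched as follows. The function name is hypothetical, and combining the 30% branch frequency and the 60% taken fraction quoted elsewhere in these slides with the 3-stall penalty is an assumption made here for illustration, not a figure given on this slide.

```python
def cpi_predict_not_taken(ideal_cpi, branch_freq, taken_fraction, taken_penalty):
    """Average CPI when only taken branches pay a penalty.

    Not-taken branches cost nothing extra under predict-not-taken, so the
    penalty is weighted by the fraction of branches that are taken.
    """
    return ideal_cpi + branch_freq * taken_fraction * taken_penalty

estimate = cpi_predict_not_taken(
    ideal_cpi=1.0,
    branch_freq=0.30,      # assumed: same workload as the stall example
    taken_fraction=0.60,   # assumed: "60% of branches are taken"
    taken_penalty=3,       # stalls when the prediction is wrong
)
print(f"CPI = {estimate:.2f}")   # prints: CPI = 1.54
```

Under these assumptions the scheme improves on stalling for every branch, because the not-taken branches no longer pay anything.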
1.
Example: Branches
R3, 24
1.
Recall: Basic Pipelined
q Hence the name: branch delay slot
Branch is not taken: 3 stalls
Instruction         1    2    3    4    5    6    7
44 BEQ R1, 24       IF   ID   EX   MEM  WB
48 AND R12, R2, R5       IF   idle idle idle idle
72 LW R4, 50(R7)              IF   ID   idle idle idle
76                                 IF   idle idle idle
48 AND R12, R2, R5                      IF   ID   EX
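Reading the two timing diagrams as the predict-taken scheme (1 stall when the branch is taken, 3 stalls when it is not), the average cost can be estimated as below. This is a sketch under assumptions: the function name and dictionary layout are illustrative, and the 30% branch frequency and 60% taken fraction are borrowed from other slides rather than stated with these diagrams.

```python
def expected_cpi(ideal_cpi, branch_freq, outcomes):
    """Average CPI given (probability, stall_cycles) for each branch outcome."""
    total_probability = sum(p for p, _ in outcomes.values())
    assert abs(total_probability - 1.0) < 1e-9, "outcome probabilities must sum to 1"
    stalls_per_branch = sum(p * stalls for p, stalls in outcomes.values())
    return ideal_cpi + branch_freq * stalls_per_branch

predict_taken = {
    "taken":     (0.60, 1),   # prediction correct, but the target arrives one cycle late
    "not_taken": (0.40, 3),   # prediction wrong, three fetched instructions are squashed
}
cpi_taken_scheme = expected_cpi(ideal_cpi=1.0, branch_freq=0.30, outcomes=predict_taken)
print(f"CPI = {cpi_taken_scheme:.2f}")   # prints: CPI = 1.54
```

With these numbers the result equals 1 + 0.30 × 0.60 × 3, the cost of penalising only taken branches by 3 cycles, which is consistent with the remark that predict-taken brings no advantage on the 5-stage MIPS pipeline.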
1.
The Control hazard
qCause
Ø Branch condition and the branch PC are not available in time to fetch an instruction on the next clock