When a SPARC processor executes a branch or CALL, the target address is loaded into the nPC (the next program counter), not the PC.
Effect: The instruction following the branch is executed before the branch takes effect.
The position immediately following any branch or call instruction is called the "delay slot", and the instruction in that position is the "delay instruction".
Example:
    Addr  Code
    ----  ----------------------
    1000          addcc %g0,%g0,%g0
    1004          be    where
    1008          subcc %g0,123,%L1
    100C          st    %L1,[%i0+%i1]
    ...
    2000  where:  add   %L1,%G0,%L2
    2004          ld    [%i0+%i1],%L3
Effect:
    PC    nPC   What's happening
    ----  ----  -----------------------
    1000  1004  addcc executing, be being fetched
    1004  1008  be executing, subcc being fetched
    1008  2000  subcc executing, add being fetched
    2000  2004  add executing, ld being fetched
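The PC/nPC sequence in the table can be reproduced with a few lines of Python. This is only a sketch of the sequencing rule, not a real SPARC simulator: the `program` dictionary and the `trace` function are names made up for this illustration, and the `be` is simply assumed to be taken (in the example it is, because `addcc %g0,%g0,%g0` sets the zero flag).

```python
# Hypothetical model: each address maps to (mnemonic, branch target or None).
# The branch at 0x1004 is assumed to be taken.
program = {
    0x1000: ("addcc", None),
    0x1004: ("be", 0x2000),
    0x1008: ("subcc", None),   # the delay instruction
    0x100C: ("st", None),
    0x2000: ("add", None),
    0x2004: ("ld", None),
}

def trace(program, start, steps):
    """Return (PC, nPC, mnemonic) for each executed instruction."""
    pc, npc = start, start + 4
    rows = []
    for _ in range(steps):
        name, target = program[pc]
        rows.append((pc, npc, name))
        # A taken branch writes its target into nPC; otherwise nPC advances by 4.
        next_npc = target if target is not None else npc + 4
        # The PC always receives the old nPC, so the delay instruction still runs.
        pc, npc = npc, next_npc
    return rows

rows = trace(program, 0x1000, 4)
for pc, npc, name in rows:
    print(f"{pc:04X}  {npc:04X}  {name} executing")
```

Because the branch only changes the nPC, the instruction at 1008 is already committed to run next; the `st` at 100C is the one that gets skipped.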
***ALWAYS FILL THE DELAY SLOT!!***
What can be put into the delay slot?
What MUST NOT be put into the delay slot?
Example 1:
    Loop:   ...
            ...
            add   %L1,%L2,%L1   !Add %L2 to the sum in %L1
            add   %L3,1,%L3     !Increment the counter
            ba    Loop          !Back to the Loop again
            nop                 !The delay slot
should not be done that way. Instead, use:
    Loop:   ...
            ...
            add   %L1,%L2,%L1   !Add %L2 to the sum in %L1
            ba    Loop          !Back to the Loop again
            add   %L3,1,%L3     !Increment the counter (delay slot)
Example 2:
    Loop:   add   %L3,1,%L3     !Increment the counter
            ...
            subcc %L2,%L3,%G0   !Compare %L2 to %L3
            bne   Loop          !Loop until they are equal
            nop                 !The delay slot
should not be done that way. Instead, use:
            add   %L3,1,%L3     !Increment the counter
    Loop:   ...
            subcc %L2,%L3,%G0   !Compare %L2 to %L3
            bne   Loop          !Loop until they are equal
            add   %L3,1,%L3     !Increment the counter
This one is better, because the number of instructions inside the loop has been decreased by 1. However, this version increments %L3 once too often: the delay instruction is executed even on the final pass, when the branch is not taken. If the value of %L3 is not important after the loop is finished, then that is not a problem. If the value of %L3 is important, then you could add the instruction
            sub   %L3,1,%L3     !Undo the last addition.
This looks like a poor solution, but it is better to do an extra instruction once after the end of a loop than an extra instruction every time through the loop.
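The trade-off in Example 2 can be checked with a rough count. The sketch below models the two versions of the loop in Python; the function names `nop_version` and `filled_version` are invented for this illustration, the "..." loop body is ignored, and the limit in %L2 is assumed to be larger than the starting value of %L3.

```python
def nop_version(limit, start=0):
    """Loop with 'add' at the top and a nop in the delay slot."""
    counter, executed = start, 0
    while True:
        counter += 1              # add %L3,1,%L3 at the top of the loop
        executed += 4             # add, subcc, bne, nop
        if counter == limit:      # bne falls through when they are equal
            return counter, executed

def filled_version(limit, start=0, fix_up=True):
    """Loop with the 'add' moved into the delay slot."""
    counter, executed = start, 0
    counter += 1                  # the add placed before the Loop label
    executed += 1
    while True:
        equal = (counter == limit)  # subcc compares before the delay-slot add
        executed += 2               # subcc, bne
        counter += 1                # delay slot: runs whether or not bne is taken
        executed += 1
        if equal:
            break
    if fix_up:
        counter -= 1              # sub %L3,1,%L3 after the loop
        executed += 1
    return counter, executed

print(nop_version(100))                    # counter and instruction count
print(filled_version(100))
print(filled_version(100, fix_up=False))   # shows the extra increment
```

In this model the nop version costs 4 instructions per pass and the filled version 3, so the two extra instructions outside the loop are paid back after a few passes.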
(There is also another solution to this problem - the "bne,a" instruction - but we will not cover it in this course.)
Example 3: This program contains a bug:
    Loop:   set   Array,%L4     !Put Array's address in %L4
            ...
            subcc %L2,%L3,%G0   !Compare %L2 to %L3
            bne   Loop          !Loop until they are equal
    ! -------------------
    ! Another line has been finished.
    ! Increment the line counter in %G1
    !--------------------
            add   %G1,1,%G1
            ...
The correct code is:
    Loop:   set   Array,%L4     !Put Array's address in %L4
            ...
            subcc %L2,%L3,%G0   !Compare %L2 to %L3
            bne   Loop          !Loop until they are equal
            nop
    ! -------------------
    ! Another line has been finished.
    ! Increment the line counter in %G1
    !--------------------
            add   %G1,1,%G1
            ...
Without the "nop" in the delay slot, the "add" will be done every time through the loop (the add will be in the delay slot). Since we can't put the "subcc" in the delay slot (the branch depends on it), and we can't put the "set" in the delay slot either ("set" is a synthetic instruction that can expand into two machine instructions, and only one instruction fits in the delay slot), we'll have to settle for a "nop".
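The bug in Example 3 can be made visible by counting how often the "add" runs. The following Python sketch uses the same PC/nPC rule as the hardware but is otherwise a toy model: `count_add_executions` is a hypothetical helper, instructions are addressed by list index instead of by byte address, "set" is treated as a single instruction, and "bne" is taken while passes remain.

```python
def count_add_executions(program, iterations):
    """Run a list of mnemonics with delayed branching; count executed 'add's."""
    pc, npc = 0, 1
    remaining = iterations
    adds = 0
    while pc < len(program):
        op = program[pc]
        next_npc = npc + 1
        if op == "subcc":
            remaining -= 1            # one more pass through the loop is done
        elif op == "bne" and remaining > 0:
            next_npc = 0              # taken branch: target goes into nPC only
        elif op == "add":
            adds += 1                 # the line counter in %G1 is incremented
        pc, npc = npc, next_npc       # the instruction after bne always runs
    return adds

buggy = ["set", "subcc", "bne", "add"]          # add sits in the delay slot
fixed = ["set", "subcc", "bne", "nop", "add"]   # nop fills the delay slot

print(count_add_executions(buggy, 3))
print(count_add_executions(fixed, 3))
```

With three passes through the loop, the buggy layout increments the counter on every pass, while the layout with the "nop" increments it once, after the loop has finished.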