Predication using select-bits instruction

The select-bits (selb) instruction is the key to eliminating branches for simple control-flow statements, such as if and if-then-else constructs. An if-then-else statement can be made branchless by computing the results of both the then and else clauses and then using selb to choose between them as a function of the condition.

Removing the branch pays off whenever computing both results costs less than a mispredicted branch.
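The selection itself can be modeled in portable C. The following is a minimal sketch, not SPU code: the helper names select_bits, cmpgt_mask, max_branchless, and select_bits_self_check are hypothetical, and each function models a single 32-bit word rather than a full 128-bit register. It assumes the usual selb behavior, in which each result bit is taken from the second operand where the mask bit is 1 and from the first operand where it is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Model of selb on one 32-bit word: each result bit comes from b where the
   mask bit is 1 and from a where the mask bit is 0. */
static inline uint32_t select_bits(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & ~mask) | (b & mask);
}

/* Model of an unsigned greater-than compare: all ones if a > b, else zero. */
static inline uint32_t cmpgt_mask(uint32_t a, uint32_t b)
{
    return (uint32_t)0 - (uint32_t)(a > b);
}

/* Branch-free form of: if (a > b) r = a; else r = b; */
static inline uint32_t max_branchless(uint32_t a, uint32_t b)
{
    return select_bits(b, a, cmpgt_mask(a, b));
}

/* Hypothetical self-check exercising the helpers above. */
static inline void select_bits_self_check(void)
{
    assert(max_branchless(7u, 3u) == 7u);
    assert(max_branchless(3u, 7u) == 7u);
    /* The selection is bitwise, so a partial mask merges two fields. */
    assert(select_bits(0x12345678u, 0xAABBCCDDu, 0x0000FFFFu) == 0x1234CCDDu);
}
```

Because the compare produces an all-ones or all-zeros mask, the select picks one whole result or the other. Both candidate values are always computed, which is the cost being traded against the branch.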

For example, consider the following simple if-then-else statement:
unsigned int a, b, d;
 …
if (a > b)   d += a;
else         d += 1;
This code sequence, when directly converted to an SPU instruction sequence without branch optimizations, would look like:
	clgt		cc, a, b
	brz		cc, else
then:
	a		d, d, a
	br		done
else:
	ai		d, d, 1
done:
Using the select-bits instruction, this simple conditional becomes:
	clgt   cc, a, b                   /* compute the greater-than condition */
	a      d_plus_a, d, a             /* add d + a */
	ai     d_plus_1, d, 1             /* add d + 1 */
	selb   d, d_plus_1, d_plus_a, cc  /* select proper result */
This example shows:

Here is another example of using select bits, this time with C intrinsics. This code fragment shows how to use SPU intrinsics, including spu_cmpgt, spu_add, and spu_sel, to eliminate conditional branches.

The following sequence generates four instructions, assuming a, b, and c are already in registers. Because the scalars are promoted to, and extracted from, the preferred integer element, the spu_promote and spu_extract intrinsics produce no additional instructions:
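        unsigned int a, b, c;
        vector unsigned int vc1, vab, va, vb, vc;

        /* Promote the scalars into element 0, the preferred integer element. */
        va = spu_promote(a, 0);
        vb = spu_promote(b, 0);
        vc = spu_promote(c, 0);
        vc1 = spu_add(vc, 1);      /* c + 1 */
        vab = spu_add(va, vb);     /* a + b */
        /* Select c + 1 where a > b, otherwise a + b. */
        vc  = spu_sel(vab, vc1, spu_cmpgt(va, vb));
        c = spu_extract(vc, 0);    /* copy element 0 back to the scalar */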
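To check what the intrinsic sequence computes without SPU hardware, its data flow can be mirrored in portable C. This is a sketch under stated assumptions: the vec4u type and the helpers promote0, splat4, add4, cmpgt4, sel4, and update_c are hypothetical stand-ins, not part of the SPU intrinsics. The elements other than element 0, which spu_promote is understood to leave undefined, are simply zeroed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit register holding four 32-bit words. */
typedef struct { uint32_t w[4]; } vec4u;

/* Stand-in for spu_promote(x, 0): x goes to element 0; the rest are zeroed
   here (an assumption made only to keep the model deterministic). */
static inline vec4u promote0(uint32_t x)
{
    vec4u v = {{x, 0, 0, 0}};
    return v;
}

/* Replicate one scalar across all four elements. */
static inline vec4u splat4(uint32_t x)
{
    vec4u v = {{x, x, x, x}};
    return v;
}

/* Element-wise add, modeling spu_add. */
static inline vec4u add4(vec4u a, vec4u b)
{
    vec4u r;
    for (int i = 0; i < 4; i++) r.w[i] = a.w[i] + b.w[i];
    return r;
}

/* Element-wise unsigned compare, modeling spu_cmpgt: all ones where a > b. */
static inline vec4u cmpgt4(vec4u a, vec4u b)
{
    vec4u r;
    for (int i = 0; i < 4; i++) r.w[i] = (a.w[i] > b.w[i]) ? 0xFFFFFFFFu : 0u;
    return r;
}

/* Bitwise select, modeling spu_sel: bits of b where m is 1, else bits of a. */
static inline vec4u sel4(vec4u a, vec4u b, vec4u m)
{
    vec4u r;
    for (int i = 0; i < 4; i++) r.w[i] = (a.w[i] & ~m.w[i]) | (b.w[i] & m.w[i]);
    return r;
}

/* Same data flow as the intrinsic sequence; equivalent to
   c = (a > b) ? c + 1 : a + b; */
static inline uint32_t update_c(uint32_t a, uint32_t b, uint32_t c)
{
    vec4u va = promote0(a), vb = promote0(b), vc = promote0(c);
    vec4u vc1 = add4(vc, splat4(1));
    vec4u vab = add4(va, vb);
    vc = sel4(vab, vc1, cmpgt4(va, vb));
    return vc.w[0];   /* stand-in for spu_extract(vc, 0) */
}
```

For example, update_c(7, 3, 10) yields 11 because 7 > 3 selects c + 1, while update_c(3, 7, 10) yields 10 because the else result a + b is selected.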