The select-bits (selb) instruction is the key to eliminating branches for simple control-flow statements (for example, if and if-then-else constructs). An if-then-else statement can be made branchless by computing the results of both the then and else clauses and using select bits (selb) to choose the result as a function of the conditional.
If computing both results costs less than a mispredicted branch, then there are additional savings.
For example, consider the following if-then-else statement:

    unsigned int a, b, d;
    …
    if (a > b) d += a;
    else d += 1;
Translated directly, the statement becomes an assembly sequence with two branches:

          clgt  cc, a, b
          brz   cc, else
    then: a     d, d, a
          br    done
    else: ai    d, d, 1
    done:
Using selb, the same statement needs no branches:

    clgt  cc, a, b                   /* compute the greater-than condition */
    a     d_plus_a, d, a             /* add d + a */
    ai    d_plus_1, d, 1             /* add d + 1 */
    selb  d, d_plus_1, d_plus_a, cc  /* select proper result */
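The branchless sequence above can be modeled in portable C as a rough sketch. The helper names `selb_model`, `clgt_model`, and `update_d` are hypothetical and are not SPU intrinsics; they only mimic the bitwise behavior of the instructions on one 32-bit word, assuming selb takes bits from its second source wherever the mask bit is 1.

```c
#include <stdint.h>

/* Model of selb on one word: for each bit, take rb where mask is 1, else ra. */
static uint32_t selb_model(uint32_t ra, uint32_t rb, uint32_t mask)
{
    return (ra & ~mask) | (rb & mask);
}

/* Model of clgt on one word: all ones if a > b (unsigned), else all zeros. */
static uint32_t clgt_model(uint32_t a, uint32_t b)
{
    return (uint32_t)0 - (uint32_t)(a > b);
}

/* Branchless equivalent of: if (a > b) d += a; else d += 1; */
static uint32_t update_d(uint32_t d, uint32_t a, uint32_t b)
{
    uint32_t cc       = clgt_model(a, b);  /* condition mask         */
    uint32_t d_plus_a = d + a;             /* result of then clause  */
    uint32_t d_plus_1 = d + 1u;            /* result of else clause  */
    return selb_model(d_plus_1, d_plus_a, cc);
}
```

Both clause results are always computed; the mask only decides which one is kept. Because the selection is per bit, `selb_model` also works with masks that are not all ones or all zeros.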
Here is another example of using select bits, this time with C intrinsics. This code fragment shows how to use SPU intrinsics, including spu_cmpgt, spu_add, and spu_sel, to eliminate conditional branches. It is the branchless form of: if (a > b) c += 1; else c = a + b;
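    #include <spu_intrinsics.h>

    unsigned int a, b, c;
    vector unsigned int vc1, vab, va, vb, vc;

    va = spu_promote(a, 0);                     /* place scalars in element 0 */
    vb = spu_promote(b, 0);
    vc = spu_promote(c, 0);
    vc1 = spu_add(vc, 1);                       /* c + 1 */
    vab = spu_add(va, vb);                      /* a + b */
    vc = spu_sel(vab, vc1, spu_cmpgt(va, vb));  /* c + 1 if a > b, else a + b */
    c = spu_extract(vc, 0);                     /* read the result from element 0 */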
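The fragment above uses only element 0 of each vector, but the same compare, add, and select operations act on all four 32-bit elements at once. The following portable C sketch models that per-element behavior. The type `vec4` and the functions `cmpgt4`, `add4`, `sel4`, and `update_c4` are hypothetical stand-ins written for illustration; they are not part of the SPU intrinsics.

```c
#include <stdint.h>

/* Hypothetical stand-in for a vector of four unsigned 32-bit elements. */
typedef struct { uint32_t lane[4]; } vec4;

/* Models a per-element unsigned greater-than compare: all ones where a > b. */
static vec4 cmpgt4(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (uint32_t)0 - (uint32_t)(a.lane[i] > b.lane[i]);
    return r;
}

/* Models a per-element add. */
static vec4 add4(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Models a bitwise select: bits of b where pattern is 1, else bits of a. */
static vec4 sel4(vec4 a, vec4 b, vec4 pattern)
{
    vec4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] & ~pattern.lane[i]) | (b.lane[i] & pattern.lane[i]);
    return r;
}

/* Per element: c = (a > b) ? c + 1 : a + b, with no data-dependent branch. */
static vec4 update_c4(vec4 a, vec4 b, vec4 c)
{
    vec4 one = {{1u, 1u, 1u, 1u}};
    vec4 vc1 = add4(c, one);  /* result if a > b  */
    vec4 vab = add4(a, b);    /* result otherwise */
    return sel4(vab, vc1, cmpgt4(a, b));
}
```

Each element is selected independently, so a single call can resolve four different conditions without any branch on the data.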