The topics in this section describe several coding methods, with examples
in SPU assembly language, C, SPU C-language intrinsics, MFC commands,
or a combination of these.
These instruction and command sets are summarized in:
DMA transfers
DMA commands transfer data between the local store (LS) and main storage.
DMA-list transfers
A DMA list is a sequence of transfer elements (or list elements) that, together with an initiating DMA-list command, specifies a sequence of DMA transfers between a single area of LS and possibly discontinuous areas in main storage.
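As a rough illustration of the DMA-list idea, the following portable C sketch models a list whose elements each give a size and a main-storage offset, and gathers the listed areas into one contiguous buffer that stands in for the LS. The names `list_element` and `gather_list` are hypothetical, and `memcpy` is only a stand-in for the transfers; the sketch does not use the real MFC interface or its list-element layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list element: one transfer, described by its size and by the
   offset of its source area in a simulated main storage. */
struct list_element {
    size_t size;    /* bytes to transfer for this element */
    size_t offset;  /* where the source area starts in main storage */
};

/* Gather every listed area of main_storage into one contiguous LS area.
   memcpy stands in for the DMA transfers (an assumption of this sketch).
   Returns the total number of bytes placed in the LS buffer. */
size_t gather_list(unsigned char *ls, const unsigned char *main_storage,
                   const struct list_element *list, size_t count)
{
    size_t ls_pos = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(ls + ls_pos, main_storage + list[i].offset, list[i].size);
        ls_pos += list[i].size;  /* the next element lands right after this one */
    }
    return ls_pos;
}
```

The single running position `ls_pos` reflects the "single area of LS" in the description, while each element is free to point anywhere in main storage.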
Moving double-buffered data
SPE programs use DMA transfers to move data and instructions between main storage and the local store (LS) in the SPE.
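The double-buffering pattern can be sketched in portable C as follows. This is a simulation under stated assumptions, not SPE code: `fetch_block` and `sum_double_buffered` are hypothetical names, `BUF_WORDS` is an arbitrary buffer size, and `memcpy` replaces the asynchronous DMA transfer, so the overlap of transfer and computation is only indicated by the order of the calls.

```c
#include <stddef.h>
#include <string.h>

#define BUF_WORDS 4  /* words per buffer; arbitrary size chosen for this sketch */

/* Stand-in for a DMA get: copy one block from main storage into an LS buffer.
   A real transfer would be asynchronous; memcpy completes immediately. */
void fetch_block(int *ls_buf, const int *main_storage, size_t block)
{
    memcpy(ls_buf, main_storage + block * BUF_WORDS, BUF_WORDS * sizeof(int));
}

/* Sum nblocks blocks using two buffers: before buffer `cur` is processed,
   the transfer of the following block into the other buffer is started. */
long sum_double_buffered(const int *main_storage, size_t nblocks)
{
    int buf[2][BUF_WORDS];
    long total = 0;
    size_t cur = 0;

    if (nblocks == 0)
        return 0;

    fetch_block(buf[cur], main_storage, 0);  /* prime the first buffer */
    for (size_t b = 0; b < nblocks; b++) {
        size_t next = cur ^ 1;
        if (b + 1 < nblocks)
            fetch_block(buf[next], main_storage, b + 1);  /* start next transfer */
        /* On real hardware, wait here until the transfer into buf[cur] is done. */
        for (size_t i = 0; i < BUF_WORDS; i++)
            total += buf[cur][i];
        cur = next;  /* swap the roles of the two buffers */
    }
    return total;
}
```

With real asynchronous transfers, the work on `buf[cur]` would proceed while the data for `buf[next]` is still arriving, which is the point of keeping two buffers.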
Vectorizing a loop
A compiler that automatically merges scalar data into a parallel-packed SIMD data structure is called an auto-vectorizing compiler. Such compilers must handle all the high-level language constructs, and therefore do not always produce optimal code.
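To make the idea of packing scalar data into SIMD form concrete, here is a minimal hand-vectorized loop in portable C. The `vec4f` type and `vec4f_add` function are hypothetical stand-ins for a 128-bit vector register and a SIMD add; real SPU code would use vector types and intrinsics instead, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 4-lane vector type standing in for a 128-bit SIMD register. */
typedef struct { float lane[4]; } vec4f;

/* Lane-wise add; on a SIMD machine this would be a single instruction. */
vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Scalar loop: one element per iteration. */
void add_scalar(float *out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Hand-vectorized loop: four elements per iteration, then a scalar loop
   for the leftover elements when n is not a multiple of four. */
void add_vectorized(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        vec4f va, vb, vr;
        memcpy(va.lane, a + i, sizeof va.lane);   /* "load" four floats */
        memcpy(vb.lane, b + i, sizeof vb.lane);
        vr = vec4f_add(va, vb);
        memcpy(out + i, vr.lane, sizeof vr.lane); /* "store" four results */
    }
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results; the vectorized form simply exposes the four-at-a-time structure that an auto-vectorizing compiler would otherwise have to discover on its own.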
Reducing the impact of branches
The SPU hardware assumes linear instruction flow and incurs no stall penalties during sequential instruction execution. A branch instruction can disrupt this assumed sequential flow.
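One common way to reduce the impact of branches is to replace a short conditional with a compare-and-select sequence. The portable C sketch below shows the idea for an unsigned maximum; the function names are hypothetical, and whether a given compiler actually emits a branch for either version depends on the target and optimization level.

```c
#include <stdint.h>

/* Branching version: the compiler may emit a conditional branch here. */
uint32_t max_branch(uint32_t a, uint32_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Branch-free version: turn the comparison into an all-ones or all-zeros
   mask, then use the mask to select the bits of a or of b. */
uint32_t max_select(uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(a > b);  /* all ones if a > b, else 0 */
    return (a & mask) | (b & ~mask);
}
```

The select form always executes the same straight-line instruction sequence, so it keeps the sequential flow that the hardware assumes, at the cost of computing both candidates.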