SPU performance and semantics

The simulator collects several statistics related to SPU performance.

Table 1 lists the performance statistics that are available in the public SDK.
Table 1. Simulator Performance Statistics for the SPU
Statistic Name                     Meaning
performance_inst_count             Instruction count (profile checkpoint sensitive), reported both including and excluding no-ops.
performance_cycle_count            Cycle count (profile checkpoint sensitive).
branch_taken                       Count of branch instructions taken.
branch_not_taken                   Count of branch instructions not taken.
hint_instructions                  Count of branch hint instructions.
hint_instruction_hits              Number of times a hint instruction predicted correctly.
ls_contention                      Number of cycles in which local store load/store instructions prevented prefetch.
sbi_contention                     Number of cycles in which Synergistic Bus Interface (SBI) DMA operations prevented SPU local store access.
single_cycle                       Number of cycles in which only one pipeline executed an instruction.
dual_cycle                         Number of cycles in which both pipelines executed an instruction.
sp_issue_block                     Number of cycles in which dual-issue was prevented, due to an SP-class instruction not being available to issue.
dp_issue_block                     Number of cycles in which dual-issue was prevented, due to a DP-class instruction not being available to issue.
cross_issue_cycle                  Number of cycles in which issue pipe{0,1} sent an instruction to the opposite issue pipe{1,0}.
nop_inst_count                     Number of NOP instructions executed (NOP, LNOP, HBR, and HBC).
src0_dep_cycle                     Number of cycles in which dual-issue was prevented, due to operand dependencies between the two instructions that were ready to issue simultaneously.
nop_cycle                          Number of cycles in which a NOP was executed in either pipeline.
branch_stall_cycles                Number of cycles stalled due to a branch miss.
prefetch_miss_stall_cycles         Number of cycles instruction issue stalled due to a prefetch miss.
pipe_dep_stall_cycles              Number of cycles instruction issue stalled, due to source operand dependencies on target operands in any execution pipeline.
pipe_busy_cycles                   Number of cycles all execution pipelines were expected to be busy processing in-flight instructions (unaffected by flush).
fp_resource_conflict_stall_cycles  Number of cycles stalled due to a floating-point unit resource conflict.
hint_stall_cycles                  Number of cycles stalled waiting for a hint target.
siss_stall_cycles                  Number of cycles stalled due to structural execution pipe dependencies.
channel_stall_cycles               Number of cycles stalled waiting for a channel operation to complete.
XXX_inst_count (see below)         Number of XXX instructions executed.
XXX_dep_stall_cycles (see below)   Number of cycles stalled due to a source operand dependency on a target operand of an in-flight instruction in the XXX execution pipeline.
XXX_iss_stall_cycles (see below)   Number of cycles stalled due to a structural dependency on an XXX class instruction.
XXX_busy_cycle (see below)         Total cycles the XXX execution pipeline was expected to be busy processing in-flight instructions (unaffected by flush).
Where XXX (above) is one of:
FX2   SFX fixed-point unit (fixed [FX] class) instructions.
SHUF  SFS shuffle and quad-rotate fixed-point unit (shuffle [SH] class) instructions.
FX3   SFX 4-cycle fixed-point unit (word rotate and shift [WS] class) instructions.
LS    SLS load and store unit (load and store [LS] class) instructions.
BR    SCN branch and control unit and sequencer (branch resolution [BR] class) instructions.
SPR   SSC Channel and DMA unit (channel interface [CH] class) instructions.
LNOP  Odd pipeline (load no operation [LNOP] class) no-ops.
NOP   Even pipeline (NOP class) no-ops.
FXB   SFP byte operations (byte operations [BO] class) instructions.
FP6   SFP FPU single-precision (single-precision floating-point [SP] class) instructions.
FP7   SFP integer (floating-point integer [FI] class) instructions.
FPD   SFP FPU double-precision (double-precision floating-point [DP] class) instructions.
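
The XXX substitution and a few ratios that follow naturally from Table 1 can be sketched in Python. This is an illustrative sketch only, not part of the SDK: it assumes the statistics have already been collected into a dictionary keyed by statistic name (the simulator's actual output format is not described here), and the helper names per_unit_stat_names and derived_metrics, the ratio definitions, and the sample values are all hypothetical.

```python
# Unit names listed under "Where XXX (above) is one of" in Table 1.
UNITS = ["FX2", "SHUF", "FX3", "LS", "BR", "SPR",
         "LNOP", "NOP", "FXB", "FP6", "FP7", "FPD"]

# Per-unit statistic name patterns from Table 1 (XXX replaced by {u}).
TEMPLATES = ["{u}_inst_count", "{u}_dep_stall_cycles",
             "{u}_iss_stall_cycles", "{u}_busy_cycle"]


def per_unit_stat_names(units=UNITS, templates=TEMPLATES):
    """Expand every XXX_* pattern for every unit name."""
    return [t.format(u=u) for u in units for t in templates]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def derived_metrics(stats):
    """Compute illustrative ratios from a {statistic name: count} dict.

    The input format and the ratio definitions are assumptions made for
    this sketch; they are not defined by the simulator.
    """
    get = lambda name: stats.get(name, 0)
    branches = get("branch_taken") + get("branch_not_taken")
    issue_cycles = get("single_cycle") + get("dual_cycle")
    return {
        # Cycles per instruction over the profiled region.
        "cpi": _ratio(get("performance_cycle_count"),
                      get("performance_inst_count")),
        # Fraction of executed branches that were taken.
        "branch_taken_rate": _ratio(get("branch_taken"), branches),
        # Fraction of branch hints that predicted correctly.
        "hint_hit_rate": _ratio(get("hint_instruction_hits"),
                                get("hint_instructions")),
        # Fraction of issuing cycles in which both pipelines executed.
        "dual_issue_rate": _ratio(get("dual_cycle"), issue_cycles),
    }


if __name__ == "__main__":
    # Made-up sample counts, for illustration only.
    sample = {
        "performance_cycle_count": 2000,
        "performance_inst_count": 1000,
        "branch_taken": 30,
        "branch_not_taken": 10,
        "hint_instructions": 8,
        "hint_instruction_hits": 6,
        "single_cycle": 600,
        "dual_cycle": 200,
    }
    print(per_unit_stat_names()[:4])
    print(derived_metrics(sample))
```

Returning None for a zero denominator keeps a run with no branches or no hint instructions from raising a division error; a caller can then choose how to display the missing ratio.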