Figure: Bins generated for a synthetic input span the entire range with similar sample counts
|
Idea: preserve time in compressed traces
- Encode time deltas instead of timestamps
- Create delta histograms automatically
- Dynamically balance histograms
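The three bullets above can be sketched as follows. This is only an illustration under stated assumptions: the names `to_deltas` and `DeltaHistogram`, the `max_bins` limit, and the "merge the lightest adjacent pair" balancing rule are hypothetical and are not taken from the slides. Timestamps are integers (e.g. microseconds) to keep the arithmetic exact.

```python
def to_deltas(timestamps):
    """Replace absolute timestamps with the time elapsed since the previous event."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


class DeltaHistogram:
    """Bounded histogram of deltas (hypothetical sketch, not the authors' code).

    Bins are [lo, hi, count] and never overlap. When the bin limit is
    exceeded, the adjacent pair with the fewest combined samples is merged,
    which keeps the sample counts per bin roughly similar.
    """

    def __init__(self, max_bins=3):
        self.max_bins = max_bins  # assumed limit, chosen for illustration
        self.bins = []            # sorted list of [lo, hi, count]

    def add(self, delta):
        # A delta that falls inside an existing bin only bumps its count.
        for b in self.bins:
            if b[0] <= delta <= b[1]:
                b[2] += 1
                return
        # Otherwise open a new single-point bin and rebalance if needed.
        self.bins.append([delta, delta, 1])
        self.bins.sort()
        if len(self.bins) > self.max_bins:
            self._merge_lightest_pair()

    def _merge_lightest_pair(self):
        i = min(range(len(self.bins) - 1),
                key=lambda k: self.bins[k][2] + self.bins[k + 1][2])
        lo, _, c1 = self.bins[i]
        _, hi, c2 = self.bins[i + 1]
        self.bins[i:i + 2] = [[lo, hi, c1 + c2]]


timestamps = [0, 10, 21, 30, 130, 141, 240]
hist = DeltaHistogram(max_bins=3)
for d in to_deltas(timestamps):   # deltas: 10, 11, 9, 100, 11, 99
    hist.add(d)
print(hist.bins)                  # [[9, 10, 2], [11, 11, 2], [99, 100, 2]]
```

The trace then stores one small histogram instead of one timestamp per event, at the cost of keeping only the distribution of deltas, not their order.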
|
Number of histograms per record depends on the number of possible call paths
|
Path-sensitive histograms
- Time depends on path taken
- Distinguish histograms by path
Sample:
    MPI_Allreduce (..);
    for (..) {
        for (..) {
            MPI_Send (..);
            MPI_Recv (..);
        }
        MPI_Barrier (..);
    }
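In the sample above, MPI_Send can be reached from MPI_Allreduce (first iteration), from MPI_Recv (inner loop), or from MPI_Barrier (next outer iteration), and the elapsed time differs in each case. A minimal sketch of keeping those deltas apart follows. It assumes the path is identified by the immediately preceding call, which is a simplification made for illustration; the function name and the timestamps are hypothetical.

```python
from collections import defaultdict


def path_sensitive_deltas(events):
    """Group each time delta by (previous call, current call).

    `events` is a list of (call name, integer timestamp) pairs in trace order.
    Using only the previous call as the path key is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for (prev_name, prev_t), (name, t) in zip(events, events[1:]):
        groups[(prev_name, name)].append(t - prev_t)
    return dict(groups)


# Made-up trace of the sample: two outer and two inner iterations.
trace = [
    ("MPI_Allreduce", 0),
    ("MPI_Send", 50), ("MPI_Recv", 52),
    ("MPI_Send", 55), ("MPI_Recv", 57),
    ("MPI_Barrier", 60),
    ("MPI_Send", 100), ("MPI_Recv", 102),
    ("MPI_Send", 105), ("MPI_Recv", 107),
    ("MPI_Barrier", 110),
]

for path, deltas in path_sensitive_deltas(trace).items():
    print(path, deltas)
```

With a single histogram for MPI_Send, the deltas 50, 3, and 40 would be mixed together; keyed by path, each histogram stays narrow.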
|
Sample bimodal distribution from UMT2k collectives
|
- Histograms detect imbalances
- Variable bin sizes capture variance
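To illustrate why variable-width bins suit a bimodal distribution, the sketch below compares equal-count bins with fixed-width bins on a made-up two-mode sample. Both helper functions and the data are hypothetical; the values are not UMT2k measurements.

```python
def equal_count_bins(samples, n_bins):
    """Split sorted samples into bins holding (nearly) the same number of samples.

    Returns (lo, hi, count) per bin; bin widths vary with the data.
    """
    data = sorted(samples)
    size, extra = divmod(len(data), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        chunk = data[start:end]
        if chunk:
            bins.append((chunk[0], chunk[-1], len(chunk)))
        start = end
    return bins


def fixed_width_counts(samples, n_bins):
    """Count samples per bin when the range is cut into equal widths.

    Assumes the samples are not all identical (width would be zero).
    """
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


deltas = [10, 90, 11, 91, 12, 92, 13, 93]   # two modes, near 10 and near 90
print(equal_count_bins(deltas, 4))    # [(10, 11, 2), (12, 13, 2), (90, 91, 2), (92, 93, 2)]
print(fixed_width_counts(deltas, 4))  # [4, 0, 0, 4]
```

The fixed-width version wastes two bins on the empty gap, while the equal-count bins stay narrow around each mode.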
|
Trace sizes (NAS Benchmarks and UMT2k)
|
The benchmarks fall into three categories:
- near-constant trace sizes, e.g. DT, EP, LU
- sub-linear trace sizes, e.g. CG, MG, FT
- non-scalable trace sizes, e.g. BT, IS, UMT2k
|
Replay accuracy (NAS Benchmarks and UMT2k)
|
The benchmarks fall into three categories:
- accurate replay: DT, EP, FT, LU, IS, UMT2k
- replay inaccurate in MPI time: CG, MG
- replay inaccurate in compute time: BT
|