SPAN: Shared-Memory Performance Analysis

funded by: Lawrence Livermore National Laboratory
funding level: $76,999
duration: 01/31/2002 - 01/31/2003 (extended to 02/28/2003)
PI: Frank Mueller

This work addresses problems in exploiting the memory bandwidth of shared-memory multiprocessors (SMPs) for scientific applications. On contemporary high-performance clusters of SMPs, a number of scientific applications that use a mixed MPI+OpenMP mode have been found to perform worse than when they rely on MPI alone. Because the architectural model of SMPs appears to be a close fit for the OpenMP threading model, this performance gap is particularly surprising. The objective of this proposal is to determine the sources of inefficiency in the use of memory hierarchies by threaded programs compared with parallel processes, and to assist the programmer in alleviating these problems. The analysis methodology relies on binary rewriting.

Preliminary results are reported on the binary rewriting framework to extract Partial Data Traces from running programs.
Preliminary results on optimizations are reported in Detecting Memory Performance Bottlenecks via Binary Rewriting.
A more comprehensive write-up is given in METRIC: Tracking Down Inefficiencies in the Memory Hierarchy via Binary Rewriting.