For our final project, we plan to modify Stanford's SUIF compiler framework to generate parallel code using MPI, the de-facto standard for distributed-memory multiprocessing. While SUIF's optimizations are currently designed for shared-memory multiprocessors, we would like to show that they can be retargeted for distributed memory while maintaining acceptable performance.
In order to achieve our project goal, we plan on selecting a simple tiling example, either from [1] or [2], and studying it in both its optimized and unoptimized forms. After becoming familiar with SUIF's runtime parallel library and how well it maps to MPI, we will attempt to get the aforementioned example working as a SUIF Pass with a modified runtime library. If we find that the MPI version is inefficient due to fine-grained parallelism, we will attempt to modify SUIF's analysis passes to detect coarser-grained parallelism and compare performance.
Based on feedback from Dr. Mueller, our goals for this phase of the project were to:
For the first goal, we looked into the NAS Parallel Benchmarks [4, 5]. For most of the NAS benchmarks, OpenMP, MPI, and serial code are provided, which made it well-suited for our purposes. Among the benchmarks, MG (a multigrid solver) was selected to work from. It contains four key routines that each perform some computation, then synchronize data values between the memories of different processors with communication routines.
We also managed to successfully compile and test SUIF2 on Red Hat Enterprise Linux 3, though this was certainly nontrivial. SUIF has not been actively developed for five years, so we had to modify our build environment so that it would compile. In particular, we needed to downgrade to gcc 2.9.6, install several nonstandard libraries, and modify portions of SUIF and the Omega dependence analysis library. After building SUIF, we successfully compiled a nontrivial C file into SUIF code.
Finally, to prepare for the next major phase of the project, we studied the documentation for SUIF and learned how to write a pass so that we can implement our modifications.
In the future, we need to learn more about SUIF's dependence analysis framework, write simpler versions of the serial, MPI, and OpenMP versions of MG in C, and begin implementing our SUIF pass.