NCSU CSC548 Parallel Computer Project 4
Homework 4
Project Proposal
Introduction
This is a course project for NCSU CSC548 Parallel Computer. The purpose of this project is to add support for MPI I/O (the MPI_File_xxx functions) to the record framework, test it with a small application (a PI program) and a large application (the Parallel I/O benchmark), and execute the implementation on a cluster of parallel computers. The performance impact of the added support is then evaluated on the Parallel I/O benchmark applications.
Problem Description
The record framework's purpose is to capture and record a compressed trace of all MPI communication performed by an MPI application for lossless replay. The record application provides hooks into every MPI call regardless of MPI implementation.
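As an illustration of this interposition style, the following is a minimal sketch of how a hook for one MPI I/O routine might look, assuming the framework intercepts calls through the standard MPI profiling (PMPI) layer; record_event is a hypothetical placeholder for the framework's actual trace-recording logic, not part of the real record code.

    #include <stdio.h>
    #include <mpi.h>

    /* Hypothetical stand-in for the record framework's trace logger. */
    static void record_event(const char *name)
    {
        fprintf(stderr, "record: %s\n", name);
    }

    /* The wrapper intercepts MPI_File_write, records the event, and then
     * forwards the call to the MPI library through the PMPI entry point. */
    int MPI_File_write(MPI_File fh, const void *buf, int count,
                       MPI_Datatype datatype, MPI_Status *status)
    {
        record_event("MPI_File_write");
        return PMPI_File_write(fh, buf, count, datatype, status);
    }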
MPI I/O is a relatively new part of the MPI standard that defines a set of routines for transferring data to and from external storage. It offers a number of advantages over traditional language I/O:
- Flexibility - MPI I/O provides mechanisms for collective access (many processes collectively read and write a single file), asynchronous I/O, and strided access (see the sketch after this list).
- Portability - Many platforms support the MPI I/O interface, so programs should compile and run essentially unchanged.
- Interoperability - Files written with MPI I/O are portable across platforms.
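To make the routines concrete, here is a minimal usage sketch of collective MPI I/O (the file name out.dat and the buffer size are arbitrary choices for illustration, not taken from the benchmarks): every rank writes one block of a single shared file at a rank-dependent offset.

    #include <mpi.h>

    /* Minimal sketch: every rank writes its own block of a shared file
     * using collective MPI I/O. */
    int main(int argc, char **argv)
    {
        int rank, buf[1024];
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < 1024; i++)
            buf[i] = rank;

        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        /* Each rank writes at an offset determined by its rank
         * (strided, collective access to a single file). */
        MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(buf),
                              buf, 1024, MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }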
The current version of the record framework does not support MPI I/O, which is a relatively new standard. A goal of this project is to expand the record framework's capability by adding support for the MPI I/O routines.
Project Outline
This is a relatively new project topic for me, and I am unfamiliar with the benchmark applications, so I plan the following steps to prepare for the project implementation and benchmark evaluation:
- Study relevant project material online
- Understand the usage of the benchmark applications through online material and hands-on practice with small applications such as MMUL and PI
- Implement MPI I/O support in the record tool and test it on the Parallel I/O benchmarks
- Execute on a small application first to ensure implementation correctness
- Test on the large Parallel I/O benchmark applications
- Collect empirical benchmark data on the Parallel I/O benchmarks before and after the MPI I/O implementation
The evaluation will be based on the empirical data collected as described above. The experiment will be conducted on a cluster of sixteen parallel computers. Each machine is a dual-processor AMD Athlon XP 1900+ node with 64 KB split L1 instruction/data caches and a 256 KB unified L2 cache.
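One plausible way to collect this data is to time each benchmark's I/O phase with MPI_Wtime, once for the baseline run and once under the MPI I/O-capable record tool, and report the slowest rank's time; in the sketch below, do_io_phase is a hypothetical placeholder for the benchmark's actual I/O work.

    #include <stdio.h>
    #include <mpi.h>

    /* Hypothetical placeholder for a benchmark's I/O phase. */
    static void do_io_phase(void) { /* ... */ }

    int main(int argc, char **argv)
    {
        int rank;
        double start, elapsed, max_elapsed;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);     /* align ranks before timing */
        start = MPI_Wtime();
        do_io_phase();
        elapsed = MPI_Wtime() - start;

        /* The slowest rank determines the observed I/O time. */
        MPI_Reduce(&elapsed, &max_elapsed, 1, MPI_DOUBLE, MPI_MAX, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("I/O phase: %.6f s\n", max_elapsed);

        MPI_Finalize();
        return 0;
    }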
Plan of Work
Week One (10/30/06-11/04/06)
- Understand the problem; learn the architecture of the record framework and the semantics and usage of the MPI I/O routines
- Install the MPI Trace Compression source code (/home/student/secret/record.tgz) on the OSxx machine
- Download and run record on the unmodified Parallel I/O benchmarks to obtain baseline data for comparison with the data from record with MPI I/O support
Week Two (11/05/06-11/11/06)
- Submit progress report (Homework 5)
- Implement MPI I/O in the small PI application; execute the modified PI program on the cluster to verify correctness (see the sketch after this list)
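A hedged sketch of what the modified PI test might look like (not the course's actual PI code; the file name pi.out and the interval count are arbitrary): each rank integrates part of 4/(1+x^2), the partial sums are reduced to rank 0, and the result is written through MPI I/O so that the record tool has an MPI_File call to capture.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        const int n = 1000000;          /* integration intervals */
        int rank, size;
        double h, sum = 0.0, pi = 0.0;
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Midpoint rule for the integral of 4/(1+x^2) on [0,1]. */
        h = 1.0 / n;
        for (int i = rank; i < n; i += size) {
            double x = h * (i + 0.5);
            sum += 4.0 / (1.0 + x * x);
        }
        sum *= h;
        MPI_Reduce(&sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        /* Write the result with MPI I/O instead of standard C I/O. */
        MPI_File_open(MPI_COMM_WORLD, "pi.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        if (rank == 0)
            MPI_File_write(fh, &pi, 1, MPI_DOUBLE, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }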
Week Three (11/12/06-11/18/06)
- Implement MPI I/O on the Parallel I/O benchmark applications
- Debug the MPI I/O implementation
Week Four (11/19/06-11/25/06)
- Test/execute the Parallel I/O benchmarks on the cluster with the MPI I/O-capable record tool
- Collect Parallel I/O benchmark data
- Generate tables/charts/graphs from the collected data
Week Five (11/26/06-11/27/06)
- Complete the final project report
- Rerun benchmarks if necessary
References
- M. Noeth, F. Mueller, M. Schulz, B. de Supinski, Scalable Compression and Replay of Communication Traces in Massively Parallel Environments, submitted
- F. Mueller, MPI I/O Trace Compression presentation, 2006
- README, record.tgz
- Introduction to MPI I/O, http://www.nersc.gov/nusers/resources/software/libs/io/mpiio.php#concepts