************************************************************************
*                     Cluster Computing - CSC 591c                     *
*                              Homework 4                              *
*                                                                      *
*                Anwar Ali, Annika Edwards, Nhon Nguyen                *
*                            April 10, 2003                            *
************************************************************************
*                                                                      *
*     Evaluation and Extension of mpiP: Lightweight, Scalable MPI      *
*           Profiling Tool, Using the IRS Benchmark Code (IRS)         *
************************************************************************


Introduction
------------

Parallel programs are designed to achieve a speedup comparable to the
number of available processors. According to Amdahl's Law, this speedup
is limited by the fraction of the program that is sequential. Beyond
that limit, the time spent communicating data can also restrict
speedup. To increase speedup, communication time must be reduced, and
to do so a programmer needs a tool that can pinpoint where a parallel
program spends most of its time communicating.

Jeffrey S. Vetter of Lawrence Livermore National Laboratory has
produced such a tool for programs parallelized with MPI. "mpiP is a
lightweight profiling library for MPI applications." It reports how
much time is spent executing MPI calls. Data is collected for each
process and for each call site, then aggregated into a single output
file. mpiP does not add considerable execution time to a program.


Problem Statement
-----------------

In this project we seek to understand fully how mpiP is used and how
its output can help us determine where communication time should be
reduced. For example, if several processes spend a large amount of time
in an MPI_Barrier call, this may indicate a load imbalance. Once we
gain this understanding, we will determine how mpiP can be improved and
implement that improvement. We will use the IRS Benchmark Code (IRS),
which runs on both SMP and multi-node systems, to measure and compare
the performance of a large application on the cluster.
This will help us understand more about large-scale applications on SMP
machines and parallel architectures.


Outline
-------

- Install and configure mpiP.
- Deploy the IRS Benchmark Code onto the cluster.
- Run mpiP on the IRS benchmark to determine where the communication
  problems are, if any, and what their causes could be. We will then
  attempt to fix any problems we find.
- Run the code for different cases:
  * Sequential
  * Threads
  * MPI parallel
  * MPI parallel and threads parallel
- Collect and analyze data to learn how the application performs across
  all of the above scenarios.
- Study further improvements to IRS and mpiP. This will take the
  majority of our time. Once we determine what can be done, we will
  implement our solution.


References
----------

mpiP: Lightweight, Scalable MPI Profiling
http://www.llnl.gov/CASC/mpip/

The IRS Benchmark Code
http://www.llnl.gov/asci/purple/benchmarks/limited/irs/

Parallel Implicit Solvers for Radiation Transport Systems
http://research.nianet.org/~dimitri/ASCI/

Jeffrey S. Vetter and Michael O. McCracken, "Statistical Scalability
Analysis of Communication Operations in Distributed Applications,"
Proc. ACM SIGPLAN Symp. on Principles and Practice of Parallel
Programming (PPoPP), 2001.
http://llnl.gov/CASC/people/vetter/people/pubs/ppopp01_scal_analysis.pdf

Jeffrey S. Vetter and Andy Yoo, "An Empirical Performance Evaluation of
Scalable Scientific Applications," Supercomputing Conf. Technical
Paper, 2002.
http://sc-2002.org/paperpdfs/pap.pap222.pdf


Project Web Page
----------------

http://www4.ncsu.edu/~aredward/csc591c/index.html