Homework 1

Deadline: see web page
Assignments: All parts are to be solved individually and turned in electronically. Written parts must be in plain ASCII text; Word, PostScript, and other formats are not permitted unless explicitly stated.

Please use the henry2 cluster (Linux). All programs must be written in C, compiled with mpicc/gcc, and turned in with a corresponding Makefile.

  1. (0 points) Learn how to compile and execute an MPI program.

    Notice: You have very limited disk space in your home directory on henry2. More disk space is available at /gpfs_share/csc548. Use it wisely, as it is shared among all CSC 548 students.

    Hints:

    Nothing to turn in, this is just a warm-up exercise.

  2. (50 points) Write an MPI program that determines the point-to-point message latency for pairs of nodes. Exchange short point-to-point messages (less than 1KB) between any two nodes and measure the round-trip time (rtt). Also report min/max times. The output should be three matrices with node names (rows/columns) and min/max/rtt values in microseconds. Each matrix is preceded by its description (min, max, or rtt) on a single line. Report numbers for at least 16 different nodes. (You may try larger values if you can get your job through the queues.)
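    The reduction from raw timings to the reported values can be sketched in plain C. This is only an illustration under assumptions: the samples are taken to be differences of two MPI_Wtime() readings (seconds), "rtt" is read here as the mean round-trip time, and the names rtt_stats and rtt_summarize are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Summary of the round-trip samples for one node pair, in microseconds.
 * Hypothetical type, not prescribed by the assignment. */
typedef struct {
    double min_us;
    double max_us;
    double avg_us;   /* assumption: "rtt" is reported as the mean */
} rtt_stats;

/* Reduce n round-trip samples given in seconds (the unit MPI_Wtime()
 * differences would have) to min/max/mean in microseconds. */
rtt_stats rtt_summarize(const double *samples_sec, size_t n)
{
    assert(samples_sec != NULL && n > 0);

    rtt_stats s = { DBL_MAX, 0.0, 0.0 };
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        double us = samples_sec[i] * 1e6;   /* seconds -> microseconds */
        if (us < s.min_us) s.min_us = us;
        if (us > s.max_us) s.max_us = us;
        sum += us;
    }
    s.avg_us = sum / (double)n;
    return s;
}
```

    Each rank pair would call such a helper once on its own sample array; discarding the first few warm-up samples before summarizing is a common choice, not a requirement stated here.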

    In a README file, explain the differences between the values in the matrices with reference to the possible network configuration of the nodes on the cluster.

    Hints:

    Turn in the files rtt.c, Makefile.rtt, rtt.out, rtt.bsub and rtt.README.
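    One possible layout for the output described in this problem (a description line followed by a matrix with node names as row and column headers) is sketched below. The column width of 10, the two-decimal precision, and the function name print_matrix are assumptions for illustration; the assignment does not fix an exact format.

```c
#include <assert.h>
#include <stdio.h>

/* Print a description line, then an n x n matrix stored row by row in
 * values[], with node names as column and row headers.
 * Returns the number of characters written (I/O errors are not handled). */
int print_matrix(FILE *out, const char *label, const char *names[],
                 const double *values, int n)
{
    assert(out != NULL && label != NULL && names != NULL);
    assert(values != NULL && n > 0);

    /* Description line, then an empty corner cell above the row names. */
    int written = fprintf(out, "%s\n%-10s", label, "");

    for (int c = 0; c < n; c++)
        written += fprintf(out, " %10s", names[c]);
    written += fprintf(out, "\n");

    for (int r = 0; r < n; r++) {
        written += fprintf(out, "%-10s", names[r]);
        for (int c = 0; c < n; c++)
            written += fprintf(out, " %10.2f", values[r * n + c]);
        written += fprintf(out, "\n");
    }
    return written;
}
```

    In an MPI program, rank zero would typically gather the values and node names first and then call such a routine three times, once each for min, max, and rtt.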

  3. (50 points) Implement the Pi approximation algorithm in three different ways: (c) with collective communication (Broadcast/Reduce, see lecture notes), (b) with blocking point-to-point communication (Send/Receive) and (n) with nonblocking communication (Isend/Irecv/Waitall/Wait). Options (b) and (n) should each have two variants: (r) a rooted centralized approach (communicate with rank zero) and (t) a tree-based approach (manually create a binary reduction tree rooted in rank zero and communicate along the edges to simulate the broadcast and reduction).
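    As a reminder of the computation being distributed, here is a serial sketch in plain C. It assumes the lecture's algorithm is the familiar midpoint-rule integration of 4/(1+x^2) over [0,1] with intervals dealt out to ranks round-robin, which this page does not confirm. The names pi_partial and pi_simulate are hypothetical, and the loop in pi_simulate merely stands in for the reduction step.

```c
#include <assert.h>

/* Partial midpoint-rule sum for one rank's share of n intervals:
 * rank handles intervals rank, rank + size, rank + 2*size, ... */
double pi_partial(long n, int rank, int size)
{
    assert(n > 0 && size > 0 && rank >= 0 && rank < size);

    double h = 1.0 / (double)n;   /* interval width */
    double sum = 0.0;

    for (long i = rank; i < n; i += size) {
        double x = h * ((double)i + 0.5);   /* interval midpoint */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

/* Serial stand-in for the reduction: add the partial results that
 * `size` ranks would each compute and send toward rank zero. */
double pi_simulate(long n, int size)
{
    double pi = 0.0;
    for (int r = 0; r < size; r++)
        pi += pi_partial(n, r, size);
    return pi;
}
```

    In the MPI variants, the number of intervals would be distributed from rank zero (the broadcast step) and the partial sums combined at rank zero (the reduction step); only the communication pattern differs between the five programs.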

    Compare the performance (using MPI_Wtime) for long-running inputs (large number of intervals) for each approach with submitted jobs (to ensure low contention). Show your results and comment on the outcome in the README file.

    Turn in the files pic.c, pibr.c, pibt.c, pinr.c, pint.c, Makefile.pi, and pi.README.
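    For the tree-based variants, each rank needs to know its neighbors in the tree. A minimal sketch follows, assuming a heap-style numbering in which rank r has children 2r+1 and 2r+2. This is one of several valid binary trees rooted in rank zero, and the function names are hypothetical.

```c
#include <assert.h>

/* Parent of `rank` in a heap-numbered binary tree rooted at rank 0.
 * Returns -1 for the root, which has no parent. */
int tree_parent(int rank)
{
    assert(rank >= 0);
    return rank == 0 ? -1 : (rank - 1) / 2;
}

/* Store the children of `rank` that exist among `size` ranks in child[]
 * and return how many there are (0, 1, or 2). */
int tree_children(int rank, int size, int child[2])
{
    assert(rank >= 0 && rank < size);

    int count = 0;
    for (int k = 1; k <= 2; k++) {
        int c = 2 * rank + k;
        if (c < size)
            child[count++] = c;
    }
    return count;
}
```

    With these neighbors, a simulated broadcast would receive from the parent and then send to each child, while a simulated reduction would receive from each child, combine the values, and send the result to the parent.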

Hints:

What to turn in for programming assignments:

How to turn in:

Use the "Submit Homework" link on the course web page. Please upload all files individually (no zip files or tarballs).

Remember: If you submit a file for the second time, it will overwrite the original file.

Additional references: