Please use the ARC cluster for this assignment. All programs must be written in C, compiled with mpicc/gcc, and turned in with a corresponding Makefile.
If you are having trouble with scripting and qsub, take a look at this simple implementation of using qsub, bash scripting, and ssh to launch programs on several nodes: simple_mpi.tar
Hints:
Run the same setup as HW1 #2. Compare your results.
mympirun -np 4 myrtt
Turn in mympirun (a script), mympi.c/mympi.h (module containing the subset of MPI functionality required) and myrtt.c (same as in HW1 but referencing mympi.h).
We will extend the methods of the last HW into two dimensions.
Download, extract, and compile the code in lake.tar.
This program models the surface of a lake where some pebbles have been thrown onto the surface. The program works as follows. In the spatial domain, a centered finite difference is used to inform a zone of how to update itself using the information from its neighbors.
The time domain does something similar, but here using information from the previous two time steps.
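To make the description above concrete, here is a minimal sketch in C of what a single interior-point update could look like. It assumes a standard 5-point centered difference in space and a three-level scheme in time for the wave equation. The function name update_point and the coefficient c2 (assumed to bundle the wave speed, time step, and grid spacing) are hypothetical. The stencil actually used in lake.tar may use different weights and may include a pebble source term, so treat this as an illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only (hypothetical helper, not taken from lake.tar).
 * u0: field at the older of the two previous time steps
 * u1: field at the most recent time step
 * n:  number of grid points per side (row-major n x n arrays)
 * c2: assumed combined coefficient (wave speed, dt, grid spacing)
 * Returns the new value at interior point (i, j). */
double update_point(const double *u0, const double *u1,
                    int n, int i, int j, double c2)
{
    /* Only interior points have all four neighbors. */
    assert(i > 0 && i < n - 1 && j > 0 && j < n - 1);

    size_t idx = (size_t)i * (size_t)n + (size_t)j;

    /* Centered difference in space: sum of the four neighbors
     * minus four times the center value. */
    double lap = u1[idx - (size_t)n] + u1[idx + (size_t)n]
               + u1[idx - 1] + u1[idx + 1]
               - 4.0 * u1[idx];

    /* Time update from the previous two time steps. */
    return 2.0 * u1[idx] - u0[idx] + c2 * lap;
}
```

In a GPU version, each thread would typically perform this update for one grid point, with boundary points handled separately.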
The program runs two versions of the algorithm: a CPU version and a skeleton GPU version. Your task is to fill in the GPU algorithm to solve the same problem.
Instructions
V0:
./lake {npoints} {npebbles} {end_time} {nthreads}
npoints defines the grid size (npoints x npoints), npebbles is the number of pebbles generated in the program, end_time is the final time of the simulation, and nthreads will be used with the GPU implementation.
The following runs on a grid of (128 x 128), with 5 pebbles, for 1.0 seconds, using 8 GPU threads (implemented later):
./lake 128 5 1.0 8
Running ./lake with (128 x 128) grid, until 1.000000, with 8 threads
CPU took 0.294668 seconds
GPU computation: 0.001568 msec
GPU end-to-end: 0.000000 sec
Download the output files lake_i.dat and lake_f.dat, along with the gnuplot script heatmap.gnu, to a machine that has gnuplot installed. Then, run
gnuplot heatmap.gnu
This will create the files lake_i.png (the initial configuration) and lake_f.png (the final configuration) in the directory.
The program takes nthreads as an argument. This is the number of threads per block along each dimension on the GPU. For instance, with nthreads=8 and a 128 x 128 grid (npoints=128), you will create (npoints/nthreads) x (npoints/nthreads) = (16 x 16) blocks, with (8 x 8) threads in each block.
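The block arithmetic above can be sketched in plain C. The helper names blocks_per_axis and global_index are hypothetical and not part of lake.tar. The rounding-up division is an assumption added here so that a grid size that is not a multiple of nthreads would still be covered; when npoints divides evenly, as in the 128/8 example, it gives the same result as npoints/nthreads.

```c
#include <assert.h>

/* Hypothetical helper: number of blocks along one axis for npoints
 * grid points and nthreads threads per block along that axis.
 * Rounds up so a partially filled last block is still counted. */
int blocks_per_axis(int npoints, int nthreads)
{
    assert(npoints > 0 && nthreads > 0);
    return (npoints + nthreads - 1) / nthreads;
}

/* Hypothetical helper: global grid index along one axis for a given
 * block and thread. In a CUDA kernel the analogous expression is
 * blockIdx.x * blockDim.x + threadIdx.x. */
int global_index(int block, int nthreads, int thread)
{
    assert(block >= 0 && thread >= 0 && thread < nthreads);
    return block * nthreads + thread;
}
```

With npoints=128 and nthreads=8, blocks_per_axis gives 16, and the last thread of the last block maps to grid index 127. If the rounded-up count is used with a grid size that is not a multiple of nthreads, each thread needs a bounds check before touching the grid.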
lake_f_0.dat //node 0
lake_f_1.dat //node 1
//etc.
Hints:
double *u_i0; //u^0
double *u_i1; //u^1
These are passed to both the run_cpu and run_gpu routines; both routines should produce the same results.
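One way to check that the two routines agree is to compare their output fields point by point. The sketch below is a hypothetical helper (max_abs_diff is not part of lake.tar) and assumes both results are stored as row-major n x n arrays of doubles. CPU and GPU floating-point results may differ in the last few bits, so comparing against a small tolerance is usually safer than testing for exact equality.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest absolute pointwise difference between
 * two n x n fields stored as flat arrays. */
double max_abs_diff(const double *a, const double *b, int n)
{
    assert(n > 0);
    size_t total = (size_t)n * (size_t)n;
    double worst = 0.0;
    for (size_t k = 0; k < total; k++) {
        double d = a[k] - b[k];
        if (d < 0.0) {
            d = -d;          /* absolute value without math.h */
        }
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

For example, after both routines finish, a call such as max_abs_diff(u_cpu, u_gpu, npoints) could be compared against a tolerance like 1e-10. The array names and the tolerance here are illustrative assumptions.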
Turn in README, lake.cu, lakegpu.cu, Makefile
Single Author info:
username FirstName MiddleInitial LastName
Group info:
username FirstName MiddleInitial LastName
username FirstName MiddleInitial LastName
username FirstName MiddleInitial LastName