To appear at (HIPS'01), San Francisco, California, USA, April 23, 2001
Abstract
One of the major challenges facing high performance computing is the
daunting task of producing programs that will achieve acceptable
levels of performance when run on parallel architectures. Although
many organizations have been actively working in this area for 5-10
years (or longer), many programs have yet to be parallelized.
Furthermore, some programs were parallelized for systems that are now
obsolete (e.g., SIMD computers from Thinking Machines and MASPAR), and
these programs run poorly, if at all, on the current generation of
parallel computers. What is needed, therefore, is a straightforward
approach to parallelizing vectorizable codes that introduces no changes
to the algorithms or to the convergence properties of the codes. The
combination of loop-level parallelism and RISC-based shared-memory SMPs
has proven to be a successful approach to solving this problem.