Auto-Tuned Per-Loop Compilation
- funded by: LLNL
- funding level: $50,000 + $21,202 (phase 2)
- duration: 01/24/2018 - 01/31/2019; phase 2: 10/04/2018 - 08/30/2019
HPC applications require careful tuning to approach peak
performance on cutting-edge hardware platforms. This work hypothesizes
that traditional per-module optimizations fall short of
fully exploiting a compiler's capabilities, even when
interprocedural optimizations complement local and global ones.
This project proposes to investigate the viability of separately
compiling major loops in an auto-tuning effort. Such an ensemble of
loop units, when linked together, has the potential to improve not
only single-loop but also overall application performance, thereby
edging closer to peak performance for a given platform.
"BarrierFinder: Recognizing Ad Hoc Barriers"
by Tao Wang, Xiao Yu, Zhengyi Qiu, Guoliang Jin,
Frank Mueller in Empirical Software Engineering (EMSE),
No. 9862, accepted Jul 2020.
- CodeSeer: Input-dependent Code Variants Selection Via Machine Learning
by Tao Wang, Nikhil Jain, David Boehme, David Beckingsale, Frank Mueller and Todd Gamblin
in International Conference on Supercomputing (ICS), Jun 2020.
"BarrierFinder: Recognizing Ad Hoc Barriers" by Tao Wang, Xiao Yu, Zhengyi Qiu, Guoliang Jin, Frank Mueller in International Conference
on Software Maintenance and Evolution (ICSME), Sep/Oct 2019.
"FuncyTuner: Auto-tuning Scientific Applications With Per-loop
Compilation" by Tao Wang, Nikhil Jain, David
Beckingsale, David Boehme, Frank
Mueller, Todd Gamblin in International Conference
on Parallel Processing (ICPP), Aug 2019, Best Paper Candidate.