PowerCap: HPC Power Modeling and Active Control
- funded by: LLNL
- funding level: $386,279
- duration: 10/25/2016 - 09/30/2019 (no-cost extension to 09/30/2021)
The overall objective of this work is to establish systematic support
for power considerations as a first-order objective for HPC
systems. This includes all phases, i.e., planning, procurement,
provisioning, and operations. It also spans from prototypical system
software development across all layers to simulation-based modeling
when hardware is not yet available. We leverage any available power
knobs, such as DVFS, capping, and gating. We also consider trends in
manufacturing, such as processor variations that result in different power
profiles of multi-core packages, even if they originate from the same
die and have identical manufacturer specifications. Hence, processors
have a range of different power efficiencies.
Publications:
- "Systemic Assessment of Node Failures in HPC Production Platforms"
by A. Das, F. Mueller, B. Rountree, in
International Parallel and Distributed Processing Symposium (IPDPS), May 2021.
- "Aarohi: Making Real-Time Node Failure Prediction Feasible"
by A. Das, F. Mueller, B. Rountree, in
International Parallel and Distributed Processing Symposium (IPDPS), May 2020.
- "Uncore Power Scavenger: A Runtime for Uncore Power Conservation on HPC Systems"
by Neha Gholkar, Frank Mueller, Barry Rountree,
in Supercomputing (SC), Nov 2019, pages.
- "Evaluating Burst Buffer Placement in HPC Systems"
by Harsh Khetawat, Christopher Zimmer, Frank Mueller, Scott Atchley,
Sudharshan Vazhkudai, Misbah Mubarak, in Cluster, Sep 2019, Best Paper Award.
- "PShifter: Feedback-based Dynamic Power Shifting within HPC Jobs for Performance"
by Neha Gholkar, Frank Mueller, Barry Rountree, in
High-Performance Parallel and Distributed Computing (HPDC), Jun 2018, pages 106-117.
- "Evaluating Performance of Burst Buffer Models for Real-World Application Workloads in HPC Systems"
by Harsh Khetawat, Frank Mueller, Christopher Zimmer, Refereed
Work-in-Progress at Joint International Workshop on Parallel Data
Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS'17),
Nov 2017.
- "Power Tuning HPC Jobs on Power-Constrained Systems"
by Neha Gholkar, Frank Mueller, Barry Rountree,
in International Conference on Parallel Architectures and
Compilation Techniques (PACT), Sep 2016.
- "A Power-aware Cost Model for HPC Procurement"
by Neha Gholkar, Frank Mueller, in Workshop on
High-Performance, Power-Aware Computing (HPPAC), May 2016.
- "Power Tuning for HPC Jobs under Manufacturing Variations"
by Neha Gholkar, Frank Mueller, Barry Rountree,
in TR 2016-2, Dept. of Computer Science, North Carolina State
University, Feb 2016.
Theses: