A Comparative Study of Methods for Measurement of Energy of Computing

Status:: 🟩
Links:: RAPL | Measure energy consumption of software

Metadata

Authors:: Fahad, Muhammad; Shahid, Arsalan; Manumachu, Ravi Reddy; Lastovetsky, Alexey
Title:: A Comparative Study of Methods for Measurement of Energy of Computing
Publication Title:: "Energies"
Date:: 2019
URL:: https://www.mdpi.com/1996-1073/12/11/2204
DOI:: 10.3390/en12112204

Bibliography

Fahad, M., Shahid, A., Manumachu, R. R., & Lastovetsky, A. (2019). A Comparative Study of Methods for Measurement of Energy of Computing. Energies, 12(11), 2204. https://doi.org/10.3390/en12112204

Zotero

Type:: #zotero/journalArticle
Keywords:: [🔥, ⏳, Measurement, Green Software, Energy Efficiency, RAPL]

Relations

Abstract

Energy of computing is a serious environmental concern and mitigating it is an important technological challenge. Accurate measurement of energy consumption during an application execution is key to application-level energy minimization techniques. There are three popular approaches to providing it: (a) System-level physical measurements using external power meters; (b) Measurements using on-chip power sensors and (c) Energy predictive models. In this work, we present a comprehensive study comparing the accuracy of state-of-the-art on-chip power sensors and energy predictive models against system-level physical measurements using external power meters, which we consider to be the ground truth. We show that the average error of the dynamic energy profiles obtained using on-chip power sensors can be as high as 73% and the maximum reaches 300% for two scientific applications, matrix-matrix multiplication and 2D fast Fourier transform for a wide range of problem sizes. The applications are executed on three modern Intel multicore CPUs, two Nvidia GPUs and an Intel Xeon Phi accelerator. The average error of the energy predictive models employing performance monitoring counters (PMCs) as predictor variables can be as high as 32% and the maximum reaches 100% for a diverse set of seventeen benchmarks executed on two Intel multicore CPUs (one Haswell and the other Skylake). We also demonstrate that using inaccurate energy measurements provided by on-chip sensors for dynamic energy optimization can result in significant energy losses up to 84%. We show that, owing to the nature of the deviations of the energy measurements provided by on-chip sensors from the ground truth, calibration can not improve the accuracy of the on-chip sensors to an extent that can allow them to be used in optimization of applications for dynamic energy. Finally, we present the lessons learned, our recommendations for the use of on-chip sensors and energy predictive models and future directions.

Notes & Annotations

📑 Annotations (imported on 2023-07-28#12:22:50)

fahad.etal.2019.comparativestudymethods (pg. 1)

In this work, we present a comprehensive study comparing the accuracy of state-of-the-art on-chip power sensors and energy predictive models against system-level physical measurements using external power meters, which we consider to be the ground truth.

fahad.etal.2019.comparativestudymethods (pg. 2)

Accurate measurement of energy consumption during an application execution is key to energy minimization techniques at software level. There are three popular approaches to providing it: (a) System-level physical measurements using external power meters, (b) Measurements using on-chip power sensors and (c) Energy predictive models.

fahad.etal.2019.comparativestudymethods (pg. 2)

While the first approach is known to be accurate [5], it can only provide the measurement at a computer level and therefore lacks the ability to provide fine-grained device-level decomposition of the energy consumption of an application executing on several independent computing devices in a computer.

fahad.etal.2019.comparativestudymethods (pg. 2)

The second approach is based on on-chip power sensors now provided in mainstream processors such as Intel and AMD Multicore CPUs, Nvidia GPUs and Intel Xeon Phis. Intel CPUs offer Running Average Power Limit (RAPL) [6] to monitor power and control frequency (and voltage). RAPL is based on a software model using performance monitoring counters (PMCs) as predictor variables to measure energy consumption for CPUs and DRAM for processor generations preceding Haswell such as Sandybridge and Ivybridge E5 [7].

fahad.etal.2019.comparativestudymethods (pg. 2)

The third approach is based on software energy predictive models, which emerged as a popular alternative to determine the energy consumption of an application. A vast majority of such models is linear and uses performance monitoring counters (PMCs) as predictor variables. While the models provide fine-grained component-level energy consumption during the execution of the application, there are research works highlighting their poor accuracy [15–18].
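To make the idea of a linear PMC-based model concrete, here is a minimal sketch, not the authors' model. It assumes a zero-intercept form $E = \sum_i \beta_i x_i$, where $x_i$ are PMC counts, and fits the coefficients by ordinary least squares. The function names, the two unnamed counters and the sample values are all hypothetical.

```python
from typing import List, Sequence


def fit_linear_model(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares fit of y ~ X @ beta (no intercept) via the normal equations."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                for c in range(col, n + 1):
                    A[r][c] -= factor * A[col][c]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_energy(beta: Sequence[float], pmc_counts: Sequence[float]) -> float:
    """Predicted dynamic energy as a weighted sum of PMC counts."""
    return sum(b * x for b, x in zip(beta, pmc_counts))


# Hypothetical training data: each row holds counts of two PMCs for one run,
# y holds the measured dynamic energy (J) for that run.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, 3.0, 5.0, 7.0]
beta = fit_linear_model(X, y)
print(beta, predict_energy(beta, [3.0, 2.0]))
```

The synthetic data is exactly linear, so the fit recovers the generating coefficients. Real PMC data would not be this clean, which is the accuracy problem the annotation refers to.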

fahad.etal.2019.comparativestudymethods (pg. 4)

In this work, we consider only the dynamic energy consumption. We describe the rationale behind using dynamic energy consumption in Appendix B. It is calculated using the following formula:

$E_{D} = E_{T} - (P_{S} \times T_{E})$ (1)

where $E_{T}$ is the total energy consumption of the platform during the execution of an application and $T_{E}$ is the execution time of the application. $P_{S}$ is the static power consumption of the platform, which is the power consumption of the platform when it is idle.
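A short worked example of Formula (1), as a sketch. The function name and the numbers are illustrative assumptions, not values from the paper.

```python
def dynamic_energy(total_energy_j: float, static_power_w: float, exec_time_s: float) -> float:
    """Formula (1): E_D = E_T - P_S * T_E (joules)."""
    return total_energy_j - static_power_w * exec_time_s


# Illustrative values: 1500 J total, 100 W idle power, 10 s run time.
print(dynamic_energy(total_energy_j=1500.0, static_power_w=100.0, exec_time_s=10.0))  # 500.0
```

Here the platform would have consumed 1000 J over those 10 s while idle, so only 500 J is attributed to the application.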

fahad.etal.2019.comparativestudymethods (pg. 5)

State-of-the-art on-chip power sensors (RAPL for CPUs, NVML for GPUs, MPSS for Xeon Phis) provide power measurements at a high sampling frequency that can be obtained programmatically. The dynamic energy consumption during an application execution on a compute device equipped with on-chip sensors is also calculated using Formula (1). The execution time $T_{E}$ of the application can be determined accurately using the timers provided in the compute device. The base power consumption $P_{S}$ is obtained using the on-chip sensors when the component is idle. The total energy consumption $E_{T}$ is calculated from the power samples using the trapezoidal rule.
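The sensor-based procedure described above can be sketched as follows: integrate timestamped power samples with the trapezoidal rule to get $E_{T}$, then subtract the idle contribution as in Formula (1). This is a sketch only. The function names and sample values are hypothetical, and it assumes samples arrive as (time in seconds, power in watts) pairs covering the whole run.

```python
from typing import Sequence


def trapezoid_energy(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Total energy (J) from power samples using the trapezoidal rule."""
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


def dynamic_energy_from_samples(
    times_s: Sequence[float], power_w: Sequence[float], idle_power_w: float
) -> float:
    """E_D = E_T - P_S * T_E, with T_E taken as the span of the samples."""
    total = trapezoid_energy(times_s, power_w)
    exec_time = times_s[-1] - times_s[0]
    return total - idle_power_w * exec_time


# Hypothetical samples: four readings one second apart, 40 W idle power.
times = [0.0, 1.0, 2.0, 3.0]
power = [50.0, 70.0, 70.0, 50.0]
print(trapezoid_energy(times, power))                   # 190.0
print(dynamic_energy_from_samples(times, power, 40.0))  # 70.0
```

Taking $T_{E}$ from the sample timestamps is a simplification; the annotation says the paper uses the device timers for it.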

fahad.etal.2019.comparativestudymethods (pg. 6)

Basmadjian et al. [29] model the power consumption of a server as the sum of the power consumption of its components: the processor (CPU), memory (RAM), fans and disk (HDD). Bircher et al. [30] present a power predictive model based on PMCs that captures interdependence between subsystems such as CPU, disk, GPU and so forth.

fahad.etal.2019.comparativestudymethods (pg. 6)

Rotem et al. [6] present Running Average Power Limit (RAPL), a software power model for CPU-based architectures released in Intel Sandybridge. This model predicts the energy consumption of core and uncore components based on an undisclosed set of PMCs.

fahad.etal.2019.comparativestudymethods (pg. 23)

Based on our study, we cannot recommend the use of state-of-the-art on-chip sensors (RAPL for multicore CPUs, NVML for GPUs, MPSS for Xeon Phis). The fundamental issue with this measurement approach is the lack of information about how a power reading for a component is determined during the execution of an application utilizing the component.

fahad.etal.2019.comparativestudymethods (pg. 23)

At the same time, we observed that the energy measurements reported by the on-chip sensors are deterministic and reproducible and, therefore, can be used as parameters in energy predictive models.

fahad.etal.2019.comparativestudymethods (pg. 23)

Energy predictive models based on PMCs are plagued by poor accuracy [15–18]. The sources of this inaccuracy are the following: (a) Model parameters in most cases are not deterministic and reproducible and (b) Model parameters are selected chiefly based on correlation with energy and not their physical significance originating from fundamental physical laws such as conservation of energy of computing.

fahad.etal.2019.comparativestudymethods (pg. 24)

We show that the average error between the dynamic energy profiles obtained using on-chip power sensors and the ground truth ranges from 8% to 73% and the maximum reaches 300%.
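To see how average and maximum error figures of this kind can be computed, here is a sketch. It assumes the error at each problem size is the absolute difference between the sensor value and the power-meter value, expressed as a percentage of the power-meter value. The paper's exact definition may differ, and the profiles below are invented for illustration.

```python
from typing import List, Sequence


def relative_errors_pct(sensor: Sequence[float], ground_truth: Sequence[float]) -> List[float]:
    """Per-point absolute error of sensor readings, as a percentage of ground truth."""
    if len(sensor) != len(ground_truth):
        raise ValueError("profiles must have the same number of points")
    return [abs(s - g) * 100.0 / g for s, g in zip(sensor, ground_truth)]


# Hypothetical dynamic energy profiles (J) over three problem sizes.
meter_profile = [100.0, 200.0, 400.0]   # external power meter (ground truth)
sensor_profile = [110.0, 150.0, 400.0]  # on-chip sensor

errors = relative_errors_pct(sensor_profile, meter_profile)
average_error = sum(errors) / len(errors)
maximum_error = max(errors)
print(errors, average_error, maximum_error)
```

For these invented profiles the per-point errors are 10%, 25% and 0%, so the maximum is 25% and the average is a little under 12%.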