RAPL in Action: Experiences in Using RAPL for Power Measurements

Status:: 🟩
Links:: RAPL

Metadata

Authors:: Khan, Kashif Nizam; Hirki, Mikael; Niemi, Tapio; Nurminen, Jukka K.; Ou, Zhonghong
Title:: RAPL in Action: Experiences in Using RAPL for Power Measurements
Publication Title:: "ACM Transactions on Modeling and Performance Evaluation of Computing Systems"
Date:: 2018
URL:: https://dl.acm.org/doi/10.1145/3177754
DOI:: 10.1145/3177754

Bibliography

Khan, K. N., Hirki, M., Niemi, T., Nurminen, J. K., & Ou, Z. (2018). RAPL in Action: Experiences in Using RAPL for Power Measurements. ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 3(2), 1–26. https://doi.org/10.1145/3177754

Zotero

Type:: #zotero/journalArticle
Zotero::

Keywords:: [Measurement, 💎, RAPL]

Relations

Abstract

To improve energy efficiency and comply with the power budgets, it is important to be able to measure the power consumption of cloud computing servers. Intel's Running Average Power Limit (RAPL) interface is a powerful tool for this purpose. RAPL provides power limiting features and accurate energy readings for CPUs and DRAM, which are easily accessible through different interfaces on large distributed computing systems. Since its introduction, RAPL has been used extensively in power measurement and modeling. However, the advantages and disadvantages of RAPL have not been well investigated yet. To fill this gap, we conduct a series of experiments to disclose the underlying strengths and weaknesses of the RAPL interface by using both customized microbenchmarks and three well-known application level benchmarks: Stream, Stress-ng, and ParFullCMS. Moreover, to make the analysis as realistic as possible, we leverage two production-level power measurement datasets from the Taito, a supercomputing cluster of the Finnish Center of Scientific Computing and also replicate our experiments on Amazon EC2. Our results illustrate different aspects of RAPL and document the findings through comprehensive analysis. Our observations reveal that RAPL readings are highly correlated with plug power, promisingly accurate enough, and have negligible performance overhead. Experimental results suggest RAPL can be a very useful tool to measure and monitor the energy consumption of servers without deploying any complex power meters. We also show that there are still some open issues, such as driver support, non-atomicity of register updates, and unpredictable timings that might weaken the usability of RAPL in certain scenarios. For such scenarios, we pinpoint solutions and workarounds.

Notes & Annotations

🟨 Note (last modified: 2023-08-22#16:41:53)

Paper notes by David Mytton

https://davidmytton.blog/paper-notes-rapl-in-action/

Since then, other researchers have found that the newer EC2 KVM hypervisor instances no longer provide access to the RAPL metrics. This is not a bad thing because the limitations above mean the results are probably not that useful. Using bare metal cloud instances seems like the only way to get access to the RAPL data now.

Although we do now have carbon calculators for Amazon, Google, and Microsoft cloud environments, these are abstracted away from the underlying energy data. This makes it difficult to optimize cloud applications for energy efficiency.


📑 Annotations (imported on 2023-06-21#09:37:08)

khan.etal.2018.raplactionexperiences (pg. 1)

To improve energy efficiency and comply with the power budgets, it is important to be able to measure the power consumption of cloud computing servers. Intel's Running Average Power Limit (RAPL) interface is a powerful tool for this purpose. RAPL provides power limiting features and accurate energy readings for CPUs and DRAM, which are easily accessible through different interfaces on large distributed computing systems. Since its introduction, RAPL has been used extensively in power measurement and modeling.

📑 Annotations (imported on 2023-06-21#09:41:29)

khan.etal.2018.raplactionexperiences (pg. 24)

Our overall study suggests that RAPL has evolved toward a better energy measurement tool since its introduction in Sandybridge, and it has appeared to be a useful and efficient alternative for manually instrumented complex power monitors. With the Haswell architecture, RAPL has improved considerably, its power readings now closely match plug power readings and it has now introduced the new measurement domain PSys and improved the power performance in Skylake.

📑 Annotations (imported on 2023-08-22#16:42:03)

khan.etal.2018.raplactionexperiences (pg. 2)

There are also tools like the Intelligent Platform Management Interface (IPMI), which reports the power measurement readings through sensors mounted with the system. Existing studies [21] find that the accuracy of such sensors is not promising and therefore these sensors cannot be practically used as a substitute for more accurate watt meters on a per-machine basis.

khan.etal.2018.raplactionexperiences (pg. 15)

The powercap driver allows reading the energy counters and configuring the power consumption limits. The counters can be found under /sys/devices/power/events/. They can be read without root access. The powercap driver was introduced in the kernel version 3.13.
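
A minimal sketch of turning powercap energy counters into an average power figure. The sysfs location (`/sys/class/powercap/intel-rapl:0`) and the file names `energy_uj` and `max_energy_range_uj` come from the Linux kernel's powercap documentation, not from the passage quoted above, so treat them as assumptions and check them on the target machine. The helper names are hypothetical. On recent kernels, reading `energy_uj` may require root.

```python
import time
from pathlib import Path

# Assumed powercap zone for CPU package 0; verify on the target system.
DEFAULT_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(path):
    """Read a single integer value (microjoules) from a sysfs-style file."""
    return int(Path(path).read_text().strip())


def energy_delta_uj(before, after, max_range):
    """Energy used between two readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    # The counter wrapped past max_range and restarted near zero.
    return after - before + max_range


def average_power_w(delta_uj, elapsed_s):
    """Convert an energy delta in microjoules to average power in watts."""
    return delta_uj / 1_000_000 / elapsed_s


def sample_power_w(zone=DEFAULT_ZONE, interval_s=1.0):
    """Sample one zone twice and return the average power over the interval."""
    zone = Path(zone)
    max_range = read_uj(zone / "max_energy_range_uj")
    start_uj = read_uj(zone / "energy_uj")
    start_t = time.monotonic()
    time.sleep(interval_s)
    end_uj = read_uj(zone / "energy_uj")
    # Use the measured elapsed time rather than the requested interval,
    # since sleep duration and counter update timing are not exact.
    elapsed_s = time.monotonic() - start_t
    return average_power_w(energy_delta_uj(start_uj, end_uj, max_range), elapsed_s)
```

Calling `sample_power_w()` on a machine that exposes this zone returns the average package power in watts over roughly one second. The wraparound branch only covers a single overflow, so the sampling interval should stay well below the time the counter takes to wrap.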