Kepler = Kubernetes-based Efficient Power Level Exporter
Kepler (Kubernetes-based Efficient Power Level Exporter) is a Prometheus exporter. It uses eBPF to probe CPU performance counters and Linux kernel tracepoints.
These data and stats from cgroup and sysfs can then be fed into ML models to estimate energy consumption by Pods.
Measuring energy consumption is an important step for teams to reduce the carbon footprint of their software. Cloud Carbon Footprint (CCF) estimates energy based on billing and usage data retrieved from cloud APIs. Kepler (short for Kubernetes-based Efficient Power Level Exporter) goes one step further: it uses software counters via RAPL, ACPI and NVML to measure power consumption by hardware resources and employs an eBPF-based approach to attribute power consumption to processes, containers and Kubernetes pods.
Baremetal with x86 architecture and no power meter:
Node Total Power: Power Estimation
Node Component Powers: Measurement (RAPL)
Pod Power: Power Ratio
Per Process/Container Measurement
eBPF is used to extract process-related resource utilization metrics; by default, CPU instructions are used (this is configurable)
Ratio Power Model: calculates the ratio of a process’s resource utilization to the entire system’s resource utilization, then multiplies this ratio by the dynamic power consumption of the resource
Power consumption of processes is aggregated to the container and Kubernetes Pod levels
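The ratio power model and the aggregation step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Kepler's implementation: the function names, the process IDs, the instruction counts, the pod names and the 40 W dynamic power figure are all hypothetical.

```python
from collections import defaultdict


def ratio_power(process_usage, dynamic_power_watts):
    """Split dynamic power across processes by their share of total usage.

    process_usage maps a process ID to a resource utilization metric
    (for example, CPU instructions counted over one sampling interval).
    """
    total = sum(process_usage.values())
    if total == 0:
        # No activity was observed, so no dynamic power is attributed.
        return {pid: 0.0 for pid in process_usage}
    return {
        pid: dynamic_power_watts * usage / total
        for pid, usage in process_usage.items()
    }


def aggregate(process_power, owner_of):
    """Sum per-process power into the container or Pod that owns each process."""
    totals = defaultdict(float)
    for pid, watts in process_power.items():
        totals[owner_of[pid]] += watts
    return dict(totals)


# Hypothetical sample: three processes spread over two Pods.
usage = {101: 3_000_000, 102: 1_000_000, 201: 4_000_000}
pod_of = {101: "pod-a", 102: "pod-a", 201: "pod-b"}

per_process = ratio_power(usage, dynamic_power_watts=40.0)
per_pod = aggregate(per_process, pod_of)
print(per_process)
print(per_pod)
```

With these sample numbers, process 101 accounts for 3/8 of the counted instructions and is therefore assigned 3/8 of the dynamic power.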
Idle Power Matters: Kepler Metrics for Public Cloud Energy Efficiency | CNCF TAG Environmental Sustainability ⤴
The GSF’s SCI standard specifies that energy consumption should encompass all energy consumed by reserved or provisioned hardware, not solely the hardware used while an application is running. This includes idle power, which covers not only static power but also the energy consumed by components that must be running for the application to function, such as the kubelet or the control plane.
According to the SPECpower benchmark, active idle power consumption can vary widely, ranging from 20% to 60% of the power consumed at maximum utilization. In absolute terms, this ranges from 100 watts to 1,300 watts, equivalent to the power consumption of a classic light bulb or an air conditioner in a household (although there are differences in carbon emissions). The idle power can be even higher when considering GPU power consumption.
The differentiation between idle and dynamic power has been widely investigated in the research literature[1].
It is crucial to isolate idle power from dynamic power for reasons including fair attribution of power consumption to applications and to help create more accurate power models.
Dynamic energy consumption is directly correlated to resource use. Idle power, on the other hand, is the base power consumption that occurs regardless of resource utilization. A power model that uses resource utilization to estimate power consumption will only accurately reflect dynamic power. Idle power is a constant power consumption that must be attributed to all running applications.
Suppose we have a 16-core server X with a linear power model, described by the simple formula below, where Idle=200 and Coeff=8:
Power Consumption (P) = Idle Power (Idle) + Dynamic Power Consuming Factor (Coeff) * Resource Utilization (x)
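As a quick check of the formula, the linear model can be written out in Python. The constants come from the text (Idle=200, Coeff=8); treating them as watts and watts per utilized core, and the function names themselves, are assumptions made for this sketch.

```python
IDLE_WATTS = 200.0            # Idle, taken from the text
COEFF_WATTS_PER_CORE = 8.0    # Coeff, taken from the text (unit is an assumption)


def dynamic_power(utilized_cores):
    """Dynamic part of the model: Coeff * x."""
    return COEFF_WATTS_PER_CORE * utilized_cores


def node_power(utilized_cores):
    """Full linear model: P = Idle + Coeff * x."""
    return IDLE_WATTS + dynamic_power(utilized_cores)


# An idle server, a workload using 10 cores, and all 16 cores busy.
for cores in (0, 10, 16):
    print(cores, dynamic_power(cores), node_power(cores))
```

Under these assumptions, a workload using 10 cores has 80 W of dynamic power on top of the constant 200 W idle power, for 280 W in total.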
In the third scenario, isolating idle power enhances predictability and clarity. This method separates out constant power (unaffected by resource use) and defines dynamic power solely based on resource utilization. Specifically, according to the power model, Client A’s dynamic power is 80 W. When idle power is allocated based on resource ratio and absolute power, Client A receives only 40 W of idle power, leaving Client B with 160 W of idle power.
These unequal distributions of idle power highlight unfairness because idle power consumption isn’t tied to resource utilization. Applying the power consumption ratio based on resource requests doesn’t accurately reflect actual dynamic power consumption corresponding to real resource utilization.
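The effect of a ratio-based idle power split can be reproduced with a small sketch. The 200 W idle power and the resulting 40 W / 160 W shares come from the text; the 1:4 weights are hypothetical values chosen only to reproduce those shares, and the even split is included purely as a contrast, not as a method the source prescribes.

```python
def split_by_ratio(idle_watts, weights):
    """Distribute idle power proportionally to per-tenant weights."""
    total = sum(weights.values())
    return {name: idle_watts * w / total for name, w in weights.items()}


def split_evenly(idle_watts, tenants):
    """Distribute idle power in equal shares, ignoring any usage ratio."""
    share = idle_watts / len(tenants)
    return {name: share for name in tenants}


# Hypothetical 1:4 weights that reproduce the 40 W / 160 W split from the text.
weights = {"client-a": 1, "client-b": 4}

by_ratio = split_by_ratio(200.0, weights)
even = split_evenly(200.0, list(weights))
print(by_ratio)
print(even)
```

The same 200 W of idle power ends up distributed very differently depending on the attribution rule, which is why the choice of rule matters for fairness.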
Estimating idle power in virtual private clouds is challenging due to unknown physical specs and the number of tenants (VMs). To address this, we’ve developed a training framework focused on extracting and training the dynamic power model alone. This approach reduces dependency on unknown factors and enhances energy-efficient optimization for virtual machines[4].
Complete waste accounting at the user level remains a challenge, necessitating further research and collaboration within the industry. Transparent, fine-grained energy metrics for public cloud efficiency remain important, and challenges such as idle power attribution and estimation still need to be addressed. Tools like Kepler pave the way for sustainable cloud computing practices.
If Kepler doesn't have access to interfaces like RAPL, it uses a model-based estimation approach.
In the most minimal deployment, a Local Linear Regression Estimator estimates power consumption from usage metrics. No special configuration or extra deployment is required.
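To illustrate the general idea of estimating power from usage metrics with a linear regression, here is a minimal ordinary least-squares sketch on a single feature. It is not Kepler's estimator: the function name, the choice of CPU seconds as the feature, and the synthetic training data are all assumptions made for the example.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic training data (hypothetical): usage metric vs. measured power in watts.
cpu_seconds = [1.0, 2.0, 4.0, 8.0]
watts = [18.0, 26.0, 42.0, 74.0]

slope, intercept = fit_linear(cpu_seconds, watts)

# Estimate power for a usage value that was not in the training data.
estimate = intercept + slope * 5.0
print(slope, intercept, estimate)
```

Once fitted, such a model needs only the usage metric at runtime, which is why it can serve as a fallback when no hardware power interface is available.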
If you want to get a model with better estimation accuracy, Kepler may connect to a remote Kepler Model Server.
Amaral, M., Chen, H., Chiba, T., Nakazawa, R., Choochotkaew, S., Lee, E. K., & Eilam, T. (2023). Kepler: A Framework to Calculate the Energy Consumption of Containerized Applications. 2023 IEEE 16th International Conference on Cloud Computing (CLOUD), 69–71. https://doi.org/10.1109/CLOUD60044.2023.00017
Choochotkaew, S., Wang, C., Chen, H., Chiba, T., Amaral, M., Lee, E. K., & Eilam, T. (2023). Advancing Cloud Sustainability: A Versatile Framework for Container Power Model Training. 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 1–4. https://doi.org/10.1109/MASCOTS59514.2023.10387542