Kepler

Purpose:: Analyse Kubernetes power consumption and visualize it
Type:: #tool/dev

Website:: https://sustainable-computing.io/
Docs:: https://sustainable-computing.io/installation/kepler/
Source Code:: https://github.com/sustainable-computing-io/kepler
Community:: https://github.com/sustainable-computing-io/kepler/discussions

Description

Kepler = Kubernetes-based Efficient Power Level Exporter

Kepler (Kubernetes-based Efficient Power Level Exporter) is a Prometheus exporter. It uses eBPF to probe CPU performance counters and Linux kernel tracepoints.

This data, together with statistics from cgroups and sysfs, can then be fed into ML models to estimate the energy consumption of Pods.
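
Since Kepler is a Prometheus exporter, its readings are typically consumed as time series. The sketch below shows how an average power figure could be derived from two scrapes, assuming the energy is exposed as a cumulative counter in joules, which is the usual Prometheus convention. The `Sample` type, its field names and the sample values are hypothetical and not taken from Kepler itself.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One scrape of a cumulative energy counter (hypothetical structure)."""
    timestamp_s: float   # scrape time in seconds
    joules_total: float  # cumulative energy in joules


def average_watts(earlier: Sample, later: Sample) -> float:
    """Average power between two scrapes: delta energy / delta time."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.joules_total - earlier.joules_total
    if delta < 0:
        # Assumption: a decrease means the counter was reset (for example
        # after an exporter restart), so the later value is the increase.
        delta = later.joules_total
    return delta / elapsed


if __name__ == "__main__":
    first = Sample(timestamp_s=0.0, joules_total=1200.0)
    second = Sample(timestamp_s=30.0, joules_total=1650.0)
    print(average_watts(first, second))  # 450 J over 30 s -> 15.0
```

In a real setup the same calculation is normally left to the monitoring system's rate function rather than done by hand.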

Technology Radar (ThoughtWorks)

Assess as of April 2023

Measuring energy consumption is an important step for teams to reduce the carbon footprint of their software. Cloud Carbon Footprint (CCF) estimates energy based on billing and usage data retrieved from cloud APIs. Kepler — short for Kubernetes-based Efficient Power Level Exporter — goes one step further: it uses software counters via RAPL, ACPI and nvml to measure power consumption by hardware resources and employs an eBPF-based approach to attribute power consumption to processes, containers and Kubernetes pods.

Kepler | Technology Radar | Thoughtworks

Methodology

Collecting System Power Consumption

Kepler-1703088961600.jpeg

Source: https://www.cncf.io/blog/2023/10/11/exploring-keplers-potentials-unveiling-cloud-application-power-consumption/

Data sources on bare-metal (Kepler Energy Sources):

There are 3 model approaches:

Baremetal with x86 architecture and no power meter:

Per Process/Container Measurement

More information:

See also Measure energy consumption of software per process.

Isolating Idle Power

Sunyanan Choochotkaew, Marcelo Amaral, Huamin Chen - Idle Power Matters: Kepler Metrics for Pub...

Idle Power Matters: Kepler Metrics for Public Cloud Energy Efficiency | CNCF TAG Environmental Sustainability

Meta

@CNCF TAG Environmental Sustainability
@Omnivore

Highlights

Idle Power Matters: Kepler Metrics for Public Cloud Energy Efficiency | CNCF TAG Environmental Sustainability

The GSF’s SCI standard specifies that energy consumption should encompass all energy consumed by reserved or provisioned hardware, not solely the hardware used while an application is running. This includes idle power, encompassing not only static power but also the energy consumed by items necessary to be functional for the application to run such as the kubelet or the control plane.

According to the SPECpower benchmark, active idle power consumption can vary widely, ranging from 20% to 60% of the power consumed at maximum utilization. In absolute terms, this ranges from 100 W to 1300 W, equivalent to the power consumption of a classic light bulb or an air conditioner in a household (although there are differences in carbon emissions). The idle power can be even higher when considering GPU power consumption.

The differentiation between idle and dynamic power has been widely investigated in the research literature[1].

It is crucial to isolate idle power from dynamic power for reasons including fair attribution of power consumption to applications and to help create more accurate power models.

Dynamic energy consumption is directly correlated to resource use. On the other hand, idle power is the base power consumption that occurs regardless of resource utilization. A power model that uses resource utilization to estimate power consumption will only accurately reflect dynamic power. Idle power is a constant power consumption that must be attributed to all running applications.

Example of Power Attribution in Multi-tenant Environments

Suppose we have a 16-core server X with a linear power model, described by the simple formula below, where Idle=200 and Coeff=8:

Power Consumption (P) = Idle Power (Idle) + Dynamic Power Consuming Factor (Coeff) * Resource Utilization (x)

In the third scenario, isolating idle power enhances predictability and clarity. This method separates out constant power (unaffected by resource use) and defines dynamic power solely based on resource utilization. Specifically, according to the power model, Client A’s dynamic power is 80W. By allocating idle power based on resource ratio and absolute power, Client A receives only 40W of idle power, leaving Client B with 160W of idle power.

These unequal distributions of idle power highlight unfairness because idle power consumption isn’t tied to resource utilization. Applying the power consumption ratio based on resource requests doesn’t accurately reflect actual dynamic power consumption corresponding to real resource utilization.
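
The arithmetic of this example can be retraced in a few lines. The sketch below uses Idle=200 and Coeff=8 from the formula above. Client A's utilization of 10 and the 1:4 resource ratio are assumptions, back-derived from the quoted 80W dynamic power and the 40W/160W idle split, because the highlight does not state them. The equal split at the end is only an illustrative contrast, not a method proposed by the article.

```python
from typing import Dict

IDLE_W = 200.0  # idle power of server X (from the example)
COEFF = 8.0     # dynamic watts per unit of resource utilization (from the example)


def dynamic_power(utilization: float) -> float:
    """Dynamic part of the linear model: Coeff * x."""
    return COEFF * utilization


def total_power(utilization: float) -> float:
    """Full linear model: P = Idle + Coeff * x."""
    return IDLE_W + dynamic_power(utilization)


def split_idle_by_ratio(shares: Dict[str, float]) -> Dict[str, float]:
    """Attribute idle power proportionally to each client's share."""
    total = sum(shares.values())
    return {name: IDLE_W * share / total for name, share in shares.items()}


def split_idle_equally(clients: list) -> Dict[str, float]:
    """Attribute the same amount of idle power to every client."""
    return {name: IDLE_W / len(clients) for name in clients}


if __name__ == "__main__":
    # Assumed utilization for Client A, derived from 80 W / 8.
    print(dynamic_power(10))   # 80.0
    print(total_power(10))     # 280.0
    # Assumed 1:4 resource ratio, derived from the 40 W / 160 W split.
    print(split_idle_by_ratio({"A": 1, "B": 4}))  # {'A': 40.0, 'B': 160.0}
    print(split_idle_equally(["A", "B"]))         # {'A': 100.0, 'B': 100.0}
```

Keeping the idle term in its own function makes the attribution policy easy to swap without touching the dynamic part of the model.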

Estimating idle power in virtual private clouds is challenging due to unknown physical specs and the number of tenants (VMs). To address this, we’ve developed a training framework focused on extracting and training the dynamic power model alone. This approach reduces dependency on unknown factors and enhances energy-efficient optimization for virtual machines[4].

Complete waste accounting at the user level remains a challenge, necessitating further research and collaboration within the industry. Transparent, fine-grained energy metrics for public cloud efficiency remain important, and challenges such as idle power attribution and estimation still need to be addressed. Tools like Kepler pave the way for sustainable cloud computing practices.

Installation

Using Helm

GitHub - sustainable-computing-io/kepler-helm-chart

Add helm repo:

helm repo add kepler https://sustainable-computing-io.github.io/kepler-helm-chart

Install:

helm install kepler kepler/kepler --namespace kepler --create-namespace

Uninstall:

helm delete kepler --namespace kepler

Estimation

If Kepler doesn't have access to interfaces like RAPL, it falls back to a model-based estimation approach.

In the most minimal deployment, Kepler uses a Local Linear Regression Estimator, which estimates power consumption from usage metrics. No special configuration or extra deployment is required.
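
To make the idea of a linear regression estimator concrete, here is a minimal single-feature sketch: it fits a line to (utilization, power) samples by ordinary least squares and then predicts power for a new utilization value. This is not Kepler's implementation, which works with its own features and trained weights. The function names and the training samples are made up for illustration, and the samples deliberately lie on the line P = 200 + 8x used in the example further up.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("utilization values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept: float, slope: float, x: float) -> float:
    """Estimated power for a given utilization."""
    return intercept + slope * x


if __name__ == "__main__":
    # Synthetic samples (hypothetical): utilization vs. measured watts.
    utilization = [0.0, 4.0, 8.0, 16.0]
    watts = [200.0, 232.0, 264.0, 328.0]
    idle, coeff = fit_line(utilization, watts)
    print(idle, coeff)               # 200.0 8.0
    print(predict(idle, coeff, 10))  # 280.0
```

In this simplified picture the fitted intercept plays the role of idle power and the slope the role of the dynamic coefficient.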

For better estimation accuracy, Kepler can connect to a remote Kepler Model Server.

More Information: Kepler Power Estimation Deployment

Models

Model Version 0.7: https://github.com/sustainable-computing-io/kepler-model-db/tree/main/models/v0.7

Model Server

GitHub - sustainable-computing-io/kepler-model-server: Model Server for Kepler

🔗 References

Articles:

Podcasts:

Research Papers: