Kubernetes Auto-Scaling

Info

This note is about autoscaling workloads and clusters with Kubernetes.

Autoscalers are categorized into pod and cluster autoscalers.

Overview:

https://www.eksworkshop.com/docs/autoscaling/

Pod Auto-Scaling

Overview:

Kubernetes autoscaling patterns: HPA, VPA and KEDA - Spectro Cloud

Vertical Pod Autoscaler

Vertical Pod Autoscaler (VPA) frees users from having to keep the resource limits and requests for the containers in their pods up to date. When configured, it sets the requests automatically based on usage, which allows proper scheduling onto nodes so that an appropriate amount of resources is available for each pod. It also maintains the ratios between limits and requests that were specified in the containers' initial configuration.
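The ratio-keeping behaviour described above can be illustrated with a small Python sketch. This is not VPA code; the function name and the numbers are made up for illustration, and it only shows the proportional arithmetic for a single resource.

```python
def scale_limit(original_request: float, original_limit: float,
                recommended_request: float) -> float:
    """Return the limit that keeps the original limit/request ratio.

    Hypothetical helper for illustration; units are arbitrary
    (e.g. CPU millicores or MiB of memory).
    """
    if original_request <= 0:
        raise ValueError("original_request must be positive")
    ratio = original_limit / original_request
    return recommended_request * ratio


# A container configured with a 100m CPU request and a 200m CPU limit (ratio 2.0).
# If the recommended request were 250m, the proportional limit would be 500m.
new_limit = scale_limit(original_request=100, original_limit=200,
                        recommended_request=250)
print(new_limit)  # 500.0
```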

Source code: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler

Goldilocks

https://goldilocks.docs.fairwinds.com/

Goldilocks provides a dashboard that displays the recommendations of the VPA.

Horizontal Pod Autoscaler

Horizontal Pod Autoscaler (HPA) is the built-in pod autoscaling approach in Kubernetes. By default it uses CPU or memory usage as the metric to decide whether a workload (e.g. a Deployment) should be scaled out or in by changing its number of pod replicas.

Docs: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Workshop: https://www.eksworkshop.com/docs/autoscaling/workloads/horizontal-pod-autoscaler/

The default sync interval is 15 seconds (configurable via the kube-controller-manager flag --horizontal-pod-autoscaler-sync-period).
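On each sync, HPA derives the desired replica count from the ratio of the current metric value to the target value, as described in the Kubernetes docs linked above. A minimal Python sketch of that calculation follows. The function name is hypothetical, and the sketch leaves out min/max replica clamping, missing metrics, pod readiness and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric).

    No scaling happens while the ratio is within the tolerance around 1.0
    (0.1 is the documented default tolerance).
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 replicas at 80% average CPU utilization with a 50% target: scale out.
print(desired_replicas(4, 80, 50))  # 7
# 4 replicas at 20% with a 50% target: scale in.
print(desired_replicas(4, 20, 50))  # 2
# 52% vs. a 50% target is within the tolerance: no change.
print(desired_replicas(4, 52, 50))  # 4
```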

HPA needs the Kubernetes Metrics Server for CPU and memory metrics. If you need other metrics or are already using Prometheus, you can use an adapter such as the Prometheus Adapter for Kubernetes Metrics APIs to connect HPA to Prometheus (other adapters exist as well).
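As a sketch of what scaling on a non-CPU metric could look like, the following Python snippet builds an autoscaling/v2 HorizontalPodAutoscaler manifest that targets a per-pod custom metric. It assumes that some adapter already exposes the metric through the custom metrics API. The names web, web-hpa and http_requests_per_second, as well as the target value, are made up for illustration.

```python
import json


def build_hpa(name: str, deployment: str, metric_name: str,
              target_average: str, min_replicas: int = 1,
              max_replicas: int = 10) -> dict:
    """Build an autoscaling/v2 HPA manifest using a 'Pods' metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Pods",
                    "pods": {
                        # Metric name is an assumption; it must match what
                        # the adapter actually exposes.
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": target_average,
                        },
                    },
                }
            ],
        },
    }


hpa = build_hpa("web-hpa", "web", "http_requests_per_second", "100")
print(json.dumps(hpa, indent=2))
```

The JSON output can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON as well as YAML.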

Using the Prometheus-Adapter:

KEDA

Website: https://keda.sh/
GitHub: https://github.com/kedacore/keda

https://github.com/aws-samples/amazon-eks-scaling-with-keda-and-karpenter
https://dev.to/cdennig/horizontal-autoscaling-in-kubernetes-3-keda-24l6
https://sysdig.com/blog/kubernetes-hpa-prometheus/

Cluster Auto-Scaling

Autoscaling - Amazon EKS

Cluster Autoscaler

Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:

there are pods that failed to run in the cluster due to insufficient resources.
there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
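The two conditions above can be modelled as a small decision function. This is a heavily simplified Python sketch, not the Cluster Autoscaler's actual logic. The Node fields and the function name are hypothetical, and the threshold (0.5) and waiting time (10 minutes) are assumed values that should be checked against the Cluster Autoscaler configuration in use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    utilization: float          # requested / allocatable resources, 0.0 to 1.0
    underutilized_minutes: int  # how long the node has been below the threshold
    pods_fit_elsewhere: bool    # whether its pods could move to other existing nodes


def decide(unschedulable_pods: int, nodes: List[Node],
           utilization_threshold: float = 0.5,
           unneeded_minutes: int = 10) -> Tuple[str, List[str]]:
    """Return ("scale_up" | "scale_down" | "no_change", removable node names)."""
    # Condition 1: pods failed to run due to insufficient resources.
    if unschedulable_pods > 0:
        return ("scale_up", [])
    # Condition 2: nodes underutilized for an extended period whose pods
    # can be placed on other existing nodes.
    removable = [
        n.name for n in nodes
        if n.utilization < utilization_threshold
        and n.underutilized_minutes >= unneeded_minutes
        and n.pods_fit_elsewhere
    ]
    if removable:
        return ("scale_down", removable)
    return ("no_change", [])


nodes = [
    Node("node-a", utilization=0.8, underutilized_minutes=0, pods_fit_elsewhere=False),
    Node("node-b", utilization=0.2, underutilized_minutes=15, pods_fit_elsewhere=True),
]
print(decide(unschedulable_pods=2, nodes=nodes))  # ('scale_up', [])
print(decide(unschedulable_pods=0, nodes=nodes))  # ('scale_down', ['node-b'])
```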

Source code: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Docs: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md

Best Practices: https://aws.github.io/aws-eks-best-practices/cluster-autoscaling/

Terraform:

Karpenter

An alternative to the Cluster Autoscaler that provisions nodes directly instead of resizing pre-defined node groups.
Karpenter observes the aggregate resource requests of unscheduled pods and makes decisions to launch and terminate nodes to minimize scheduling latencies and infrastructure cost.
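To make the idea of "aggregate resource requests of unscheduled pods" concrete, here is a toy Python sketch that sums the requests of pending pods and picks the cheapest node size that fits them. It is not Karpenter's algorithm, which also handles bin-packing across several nodes, scheduling constraints and consolidation. The instance names, sizes and prices are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PodRequest:
    cpu_millicores: int
    memory_mib: int


@dataclass
class InstanceType:
    name: str
    cpu_millicores: int
    memory_mib: int
    hourly_price: float


def aggregate(pods: List[PodRequest]) -> Tuple[int, int]:
    """Sum CPU and memory requests over all pending pods."""
    return (sum(p.cpu_millicores for p in pods),
            sum(p.memory_mib for p in pods))


def cheapest_fit(pods: List[PodRequest],
                 catalog: List[InstanceType]) -> Optional[InstanceType]:
    """Return the cheapest single instance type that fits all pending pods."""
    cpu, mem = aggregate(pods)
    candidates = [t for t in catalog
                  if t.cpu_millicores >= cpu and t.memory_mib >= mem]
    return min(candidates, key=lambda t: t.hourly_price, default=None)


# Hypothetical catalog; names and prices are not real instance types.
catalog = [
    InstanceType("small", 2000, 4096, 0.05),
    InstanceType("medium", 4000, 8192, 0.10),
    InstanceType("large", 8000, 16384, 0.20),
]
pending = [PodRequest(500, 512), PodRequest(1500, 2048), PodRequest(1000, 1024)]

print(aggregate(pending))                   # (3000, 3584)
print(cheapest_fit(pending, catalog).name)  # medium
```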

Website: https://karpenter.sh/
Source code: https://github.com/aws/karpenter-provider-aws
Best Practices: https://aws.github.io/aws-eks-best-practices/karpenter/
Workshop: https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/

Enhancing Vertical Autoscaling

VASIM

Enhanced autoscaling with VASIM: Vertical Autoscaling Simulator Toolkit - Microsoft Research

Found via: LinkedIn post by Wilco Burggraaf

🔗 References

Zilch, M. (2022). Evaluation of explainability in autoscaling frameworks [Master thesis]. https://doi.org/10.18419/opus-12574