Kubernetes Auto-Scaling
This note covers autoscaling of workloads and clusters in Kubernetes.
Autoscalers fall into two categories: pod autoscalers and cluster autoscalers.
Pod Auto-Scaling
Overview:
Vertical Pod Autoscaler
Vertical Pod Autoscaler (VPA) frees users from having to keep resource limits and requests for the containers in their pods up to date. When configured, it sets the requests automatically based on observed usage, allowing proper scheduling onto nodes so that an appropriate amount of resources is available for each pod. It also maintains the ratios between limits and requests that were specified in the initial container configuration.
Source code: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
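A minimal VPA manifest could look like the following sketch (the target Deployment name my-app and the resource bounds are placeholders):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # placeholder: workload to manage
  updatePolicy:
    updateMode: "Auto"      # use "Off" to only get recommendations
  resourcePolicy:
    containerPolicies:
      - containerName: "*"  # apply to all containers in the pod
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```

With updateMode "Off", the VPA only publishes recommendations (which is what Goldilocks visualizes) without evicting pods.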
Goldilocks
https://goldilocks.docs.fairwinds.com/
Goldilocks provides a dashboard that displays the recommendations of the VPA.
Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) is the default pod-autoscaling approach in Kubernetes. By default it uses CPU or memory usage as the metric to decide whether a workload should be scaled out or in (i.e., whether replicas should be added or removed).
Docs: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
Workshop: https://www.eksworkshop.com/docs/autoscaling/workloads/horizontal-pod-autoscaler/
The default sync interval is 15 seconds (configurable via the kube-controller-manager flag --horizontal-pod-autoscaler-sync-period).
Requires the Kubernetes Metrics Server. If you need other metrics or are already using Prometheus, you can use e.g. the Prometheus Adapter for Kubernetes Metrics APIs to connect the HPA to Prometheus (other adapters exist as well).
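A basic CPU-based HPA using the autoscaling/v2 API looks like this (the Deployment name my-app and the thresholds are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # placeholder: workload to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale out above ~70% average CPU
```

The utilization percentage is relative to the CPU *requests* of the pods, which is why accurate requests (see VPA above) matter for the HPA.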
Using the Prometheus-Adapter:
- Helm Chart
- Using the Kubernetes Horizontal Pod Autoscaler with Prometheus' Custom Metrics - webscale.com
- Scaling Kubernetes Pods using Prometheus Metrics | Dustin Specker
- How to Implement Kubernetes Autoscaling Using Prometheus
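Once the Prometheus Adapter exposes a metric through the custom metrics API, the HPA can target it with a Pods-type metric. A sketch, assuming the adapter has been configured with a rule that exposes a (hypothetical) http_requests_per_second metric:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                        # placeholder: workload to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second  # assumed: exposed by a Prometheus Adapter rule
        target:
          type: AverageValue
          averageValue: "100"             # target ~100 req/s per pod
```

The metric name must match what the adapter's rules expose under /apis/custom.metrics.k8s.io, not the raw Prometheus query.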
KEDA
Website: https://keda.sh/
GitHub: https://github.com/kedacore/keda
https://github.com/aws-samples/amazon-eks-scaling-with-keda-and-karpenter
https://dev.to/cdennig/horizontal-autoscaling-in-kubernetes-3-keda-24l6
https://sysdig.com/blog/kubernetes-hpa-prometheus/
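KEDA (Kubernetes Event-Driven Autoscaling) drives the HPA from external event sources (queues, Prometheus queries, etc.) via ScaledObject resources. A sketch using KEDA's Prometheus scaler (the Deployment name, Prometheus address, and query are placeholders):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app                # placeholder: Deployment to scale
  minReplicaCount: 1            # KEDA can also scale to zero
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # placeholder
        query: sum(rate(http_requests_total[2m]))             # placeholder query
        threshold: "100"        # scale out when the query exceeds this per replica
```

Under the hood KEDA creates and manages an HPA for the target, so KEDA and a hand-written HPA should not both target the same workload.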
Cluster Auto-Scaling
Cluster Autoscaler
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
- there are pods that failed to run in the cluster due to insufficient resources.
- there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
Source code: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Docs: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
Best Practices: https://aws.github.io/aws-eks-best-practices/cluster-autoscaling/
Terraform:
- https://registry.terraform.io/modules/lablabs/eks-cluster-autoscaler/aws/latest
- https://hands-on.cloud/eks-terraform-cluster-deployment-guide/
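On AWS, Cluster Autoscaler is typically deployed with node-group auto-discovery based on ASG tags. A sketch of the relevant container command flags, assuming the ASGs are tagged for discovery (the cluster name my-cluster is a placeholder):

```yaml
# excerpt of the cluster-autoscaler container spec
command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --expander=least-waste              # pick the node group that wastes the least resources
  - --balance-similar-node-groups       # keep similar node groups at similar sizes
  - --skip-nodes-with-system-pods=false
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
```

The ASGs themselves must carry the two tags named in --node-group-auto-discovery for the autoscaler to manage them.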
Karpenter
A more flexible alternative to Cluster Autoscaler: instead of scaling pre-defined node groups, it provisions right-sized nodes directly.
Karpenter observes the aggregate resource requests of unscheduled pods and makes decisions to launch and terminate nodes to minimize scheduling latencies and infrastructure cost.
Website: https://karpenter.sh/
Source code: https://github.com/aws/karpenter-provider-aws
Best Practices: https://aws.github.io/aws-eks-best-practices/karpenter/
Workshop: https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/
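Karpenter is configured through NodePool resources that constrain which nodes it may launch. A sketch, assuming a recent Karpenter release (v1 API) and an existing EC2NodeClass named default:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # allow Spot to reduce cost
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                    # assumed: pre-existing EC2NodeClass
  limits:
    cpu: "1000"                          # cap total CPU provisioned by this pool
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m                 # consolidate underutilized nodes after 1 minute
```

Karpenter picks concrete instance types at launch time from the requirement constraints, which is how it minimizes both scheduling latency and cost.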
Enhancing Vertical Autoscaling
VASIM
Enhanced autoscaling with VASIM: Vertical Autoscaling Simulator Toolkit - Microsoft Research
Found via: LinkedIn post by Wilco Burggraaf
📚 References
Zilch, M. (2022). Evaluation of explainability in autoscaling frameworks [Master's thesis]. https://doi.org/10.18419/opus-12574