Kubernetes Cluster Autoscaling (CA) using AWS EKS

Harshal Kathar
5 min read · Jun 7, 2020

Kubernetes has three types of autoscaling:

  1. Cluster Autoscaling (CA)
  2. Horizontal Pod Autoscaling (HPA)
  3. Vertical Pod Autoscaling (VPA)

HPA and VPA work at the pod or application level, whereas CA works at the infrastructure level.

Cluster Autoscaling

  1. Cluster Autoscaling is a Kubernetes tool that increases or decreases the size of a Kubernetes cluster (by adding or removing nodes), based on the presence of pending pods and node utilization metrics.

2. The Cluster Autoscaler works with the EC2 Auto Scaling group that backs your worker nodes.

3. The Cluster Autoscaler modifies your worker node groups so that they scale out when you need more resources and scale in when you have underutilized resources.
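For example, you can quickly check whether any pods are stuck in the Pending state, which is the signal the Cluster Autoscaler reacts to, with a standard kubectl query:

kubectl get pods --all-namespaces --field-selector=status.phase=Pending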

Follow the steps below to enable cluster autoscaling on your EKS cluster.

Create an AWS EKS Cluster

*See my earlier post, where I explain in depth how to launch an AWS EKS cluster.
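If you prefer the command line, a minimal sketch with eksctl looks like the following; the cluster name, region, instance type, and node counts are placeholders, so adjust them for your setup.

eksctl create cluster \
  --name my-eks-cluster \
  --region us-east-1 \
  --nodegroup-name workers \
  --node-type t3.medium \
  --nodes 2 \
  --nodes-min 2 \
  --nodes-max 5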

Create an IAM Policy For Worker Node

1. After the EKS cluster is created, the Cluster Autoscaler requires the following IAM permissions to make calls to AWS APIs on your behalf.

2. Go to the IAM Console -> Select Roles -> Select the worker node role.

3. Click on Add inline policy and create a custom policy with the following JSON.

IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*"
    }
  ]
}

4. Paste this JSON, review it, and save it as a custom inline policy.

[Images: EKSCustompolicy and Workernoderole - the custom inline policy attached to the worker node role]
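If you prefer the AWS CLI over the console, a rough equivalent is to save the JSON above to a file and attach it as an inline policy; the role name and file name below are placeholders.

aws iam put-role-policy \
  --role-name <your-worker-node-role> \
  --policy-name EKSCustompolicy \
  --policy-document file://cluster-autoscaler-policy.json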

Worker node ASG

1. Go to the EC2 Auto Scaling groups console; you will see the Auto Scaling group created for your EKS worker nodes.

2. Check the min and max configuration of your worker nodes.

3. If your max capacity is 2 and you have already launched 2 worker nodes, the Cluster Autoscaler cannot spin up a new node when the load increases.

4. If so, change the Min and Max configuration.

e.g. Min = 2, Max = 5.
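You can change these limits in the console, or with the AWS CLI; the Auto Scaling group name below is a placeholder.

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <your-eks-worker-asg> \
  --min-size 2 \
  --max-size 5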

Deploy the Cluster Autoscaler

1. Deploy the Cluster Autoscaler to your cluster with the following command.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

Output:

[Image: CAdeployment - the Cluster Autoscaler resources created by the deployment]
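Optionally, confirm that the Cluster Autoscaler deployment has rolled out before moving on:

kubectl -n kube-system rollout status deployment/cluster-autoscaler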

2. Add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation to the deployment with the following command.

kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"
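This annotation is meant to keep the Cluster Autoscaler from evicting its own pod during scale-down. To double-check that the annotation landed on the deployment:

kubectl -n kube-system get deployment.apps/cluster-autoscaler -o jsonpath='{.metadata.annotations}'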


3. Edit the Cluster Autoscaler deployment with the following command.

kubectl -n kube-system edit deployment.apps/cluster-autoscaler

Edit the cluster-autoscaler container command to replace <YOUR CLUSTER NAME> with your cluster's name, and add the following options.

  • --balance-similar-node-groups
  • --skip-nodes-with-system-pods=false

Save and close the file to apply the changes.

spec:
  containers:
  - command:
    - ./cluster-autoscaler
    - --v=4
    - --stderrthreshold=info
    - --cloud-provider=aws
    - --skip-nodes-with-local-storage=false
    - --expander=least-waste
    - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
    - --balance-similar-node-groups
    - --skip-nodes-with-system-pods=false

*The snippet above shows how the edited cluster-autoscaler container command should look.
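After saving, a quick way to verify that the new flags made it into the container command:

kubectl -n kube-system get deployment.apps/cluster-autoscaler -o jsonpath='{.spec.template.spec.containers[0].command}'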

4. Set the Cluster Autoscaler image tag to the version that matches your cluster's Kubernetes version with the following command. Replace 1.15.n with your own value. You can replace us with asia or eu.

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.n

  • To find the latest Cluster Autoscaler version that matches your cluster's Kubernetes major and minor version, check the releases page of the kubernetes/autoscaler repository on GitHub.
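Once the image is updated, you can confirm which tag is actually running:

kubectl -n kube-system describe deployment cluster-autoscaler | grep Image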

View your Cluster Autoscaler logs

View your Cluster Autoscaler logs with the following command.

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Output:

[Image: LOGS - Cluster Autoscaler log output]

In the logs you should see messages like "calculating unneeded nodes"; if something is not right (for example, missing IAM permissions), you will instead see messages along the lines of "doesn't have the access".
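If the log output is noisy, you can filter it down to scaling-related lines with a simple grep (adjust the pattern as you like):

kubectl -n kube-system logs deployment/cluster-autoscaler | grep -i scale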

Checking That the Cluster Autoscaler Works

  1. Deploy a sample application.

vim deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  labels:
    app: php-apache
spec:
  replicas: 20
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 500m
            memory: 256Mi
          limits:
            cpu: 1000m
            memory: 512Mi

2. Deploy the application using the following command.

kubectl create -f <filename.yaml>
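Right after creating the deployment, you can list the php-apache pods that are still Pending; these unschedulable pods are what trigger the scale-out:

kubectl get pods -l app=php-apache --field-selector=status.phase=Pending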

3. The Cluster Autoscaler looks only at resource requests: if the total requested CPU of pending pods exceeds what the existing nodes can provide, it scales out the node group.

  • In this example, we request 20 replicas x 500m = 10 vCPU.

4. After the deployment, the Cluster Autoscaler checks the cluster roughly every 10 seconds, and if it finds that the existing nodes are insufficient to handle the requests, it starts adding new nodes through the EC2 Auto Scaling group.

  • For your reference I have attached the images; you can see that when I create a deployment that requests more CPU, the EC2 Auto Scaling group starts adding new nodes to handle the request.
  • To check whether the new nodes have joined the cluster, run the following command.
kubectl get nodes

Here you can see in the image that 3 new nodes were created.

Thank you! If you have any doubts, reach out to me.
LinkedIn : linkedin.com/in/harshal-kathar-b2a19b118

Reference: kubernetes.io


Harshal Kathar

DevOps Consultant | CKA | AWS | GCP | Terraform Certified