AWS
EKS
Cost optimization
Cloud cost optimization
Terraform

Optimizing AWS EKS Costs with Terraform: A Comprehensive Guide

by: Ashish Sharma

March 01, 2024

In today's cloud-driven landscape, optimizing costs while maintaining operational efficiency is paramount. Amazon Elastic Kubernetes Service (EKS) offers a powerful platform for container orchestration, but without proper management, costs can quickly escalate. In this guide, we'll explore how to leverage Terraform to optimize costs in AWS EKS by implementing various strategies, including the automation of Horizontal Pod Autoscaling (HPA) and Cluster Autoscaler.

Introduction to Cost Optimization in AWS EKS

Amazon EKS provides a managed Kubernetes service that allows you to run Kubernetes on AWS without needing to install, operate, and maintain your Kubernetes control plane. However, managing resources efficiently in EKS is crucial to avoid unnecessary costs. One effective way to optimize costs is by automating resource scaling based on demand, ensuring that you're only paying for the resources you need at any given time.

Automating Cluster Autoscaler with Terraform

Cluster Autoscaler automatically adjusts the number of worker nodes in your Kubernetes cluster: it adds nodes when pods cannot be scheduled because of insufficient capacity and removes nodes that remain underutilized. By scaling the cluster up or down automatically, you ensure that you have just enough compute capacity to handle your workload efficiently, thereby minimizing unnecessary costs.
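
The snippet in this section provisions supporting infrastructure (an ECR repository for the scaling script, an IRSA role, and RBAC); the Cluster Autoscaler itself is typically installed separately, for example from the official Helm chart. A minimal sketch of that installation using Terraform's Helm provider, assuming a configured helm provider and that var.eks_id holds the cluster name, could look like this:

# Sketch: install Cluster Autoscaler from the official Helm chart.
# Assumes a configured "helm" provider and that var.eks_id holds the EKS cluster name.
resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  namespace  = "kube-system"

  set {
    name  = "autoDiscovery.clusterName"
    value = var.eks_id
  }

  set {
    name  = "awsRegion"
    value = "ap-south-1"
  }
}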

main.tf

# Terraform code for the autoscaling support infrastructure (ECR, IAM/IRSA, RBAC)


resource "aws_ecr_repository" "my_repository" {
  name = "scaling-pods-script-${var.env}"
  tags = {
    Environment = var.env
  }
}

resource "aws_ecr_lifecycle_policy" "purge_policy" {
  repository = aws_ecr_repository.my_repository.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 30 images"
        selection = {
          tagStatus   = "any"
          countType   = "imageCountMoreThan"
          countNumber = 30
        }
        action = {
          type = "expire"
        }
      }
    ]
  })
}

resource "aws_iam_role" "eks_role" {
  name = "autoscale-pod-role-${var.env}"
  assume_role_policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Principal" : {
          "Federated" : "arn:aws:iam::${var.account_number}:oidc-provider/oidc.eks.ap-south-1.amazonaws.com/id/${var.oidc_provider_arn}"
        },
        "Action" : "sts:AssumeRoleWithWebIdentity",
        "Condition" : {
          "StringEquals" : {
            "oidc.eks.ap-south-1.amazonaws.com/id/${var.oidc_provider_arn}:sub" : "system:serviceaccount:kube-system:cronjob-service-account",
            "oidc.eks.ap-south-1.amazonaws.com/id/${var.oidc_provider_arn}:aud" : "sts.amazonaws.com"
          }
        }
      }
    ]
  })
}

resource "aws_iam_policy" "eks_policy" {
  name = "autoscale-pod-policy-${var.env}"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "VisualEditor0"
        Effect = "Allow"
        Action = [
          "s3:PutObject",
          "s3:GetObject",
          "autoscaling:Describe*",
          "autoscaling:Update*",
          "autoscaling:Get*",
          "eks:*",
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "eks_policy_attachment" {
  role       = aws_iam_role.eks_role.name
  policy_arn = aws_iam_policy.eks_policy.arn
}

resource "kubernetes_service_account" "cronjob_service_account" {
  metadata {
    name      = "cronjob-service-account"
    namespace = "kube-system"
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.eks_role.arn
    }
  }
}

resource "kubernetes_cluster_role" "hpa_manager_role" {
  metadata {
    name = "hpa-manager-role"
  }

  rule {
    api_groups = ["autoscaling"]
    resources  = ["horizontalpodautoscalers"]
    verbs      = ["get", "list", "watch", "update", "patch"]
  }
}

resource "kubernetes_cluster_role_binding" "hpa_manager_role_binding" {
  metadata {
    name = "hpa-manager-role-binding"
  }

  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.cronjob_service_account.metadata.0.name
    namespace = kubernetes_service_account.cronjob_service_account.metadata.0.namespace
  }

  role_ref {
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.hpa_manager_role.metadata.0.name
    api_group = "rbac.authorization.k8s.io"
  }
}

variables.tf

# Variables for the autoscaling and scheduled-scaling configuration

variable "env" {
  default = "dev"
}

variable "account_number" {
  default = ""
}

variable "eks_id" {
  default = ""
}

# ID of the cluster's OIDC provider (the trailing hash of the issuer URL)
variable "oidc_provider_arn" {
  default = ""
}

# Namespace whose HPAs the scaling script patches
variable "namespace" {
  default = ""
}

In the Terraform code above, we define the ECR repository that stores the scaling-script image (with a lifecycle policy that keeps only the last 30 images), an IAM role that a Kubernetes service account can assume through the cluster's OIDC provider (IRSA), an IAM policy granting access to the Auto Scaling, EKS, and S3 APIs, and the service account, ClusterRole, and ClusterRoleBinding that allow the scaling script to read and patch HorizontalPodAutoscalers. Node-level scaling itself is handled by Cluster Autoscaler, which adjusts the Auto Scaling groups behind your node groups.
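
Note that the kubernetes_* resources above assume the Terraform Kubernetes provider is already wired up against the EKS cluster. A minimal sketch of that wiring, assuming var.eks_id holds the cluster name, might look like the following:

# Sketch: configure the Kubernetes provider against the EKS cluster.
# Assumes var.eks_id holds the EKS cluster name.
data "aws_eks_cluster" "this" {
  name = var.eks_id
}

data "aws_eks_cluster_auth" "this" {
  name = var.eks_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}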

Automating Horizontal Pod Autoscaling (HPA) with Terraform

Horizontal Pod Autoscaling (HPA) automatically adjusts the number of pod replicas in a Kubernetes deployment based on observed CPU or memory utilization. By implementing HPA, you can ensure that your application scales seamlessly in response to changing workloads, thereby optimizing resource utilization and reducing costs.

# Helm template for HPA configuration

{{- if .Values.autoscaling.enabled -}}
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Values.app}}
  namespace: {{.Values.namespace}}
spec:
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Values.app}}
  targetCPUUtilizationPercentage: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
{{- end }}

values.yaml

autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

In the Helm template and values above, we define the HorizontalPodAutoscaler for the deployment, specifying the target CPU utilization and the minimum and maximum number of pod replicas. The template is rendered only when autoscaling.enabled is set to true in values.yaml.
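
If you prefer to manage the HPA directly from Terraform rather than through a Helm chart, the same object can be expressed with the Kubernetes provider. A rough equivalent sketch, where "my-app" is a placeholder for your Deployment name, might look like this:

# Sketch: the same HPA expressed as a Terraform resource.
# "my-app" is a placeholder; var.namespace is the target namespace.
resource "kubernetes_horizontal_pod_autoscaler_v2" "my_app" {
  metadata {
    name      = "my-app"
    namespace = var.namespace
  }

  spec {
    min_replicas = 1
    max_replicas = 10

    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "my-app"
    }

    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 80
        }
      }
    }
  }
}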

Automating Scheduled Scaling with Cron Jobs

In addition to autoscaling based on resource utilization, you can further optimize costs by implementing scheduled scaling using Cron Jobs. By scaling down the minimum number of pod replicas during off-peak hours when demand is low, you can significantly reduce costs without impacting performance.

# Python script for scheduled scaling using Cron Jobs
import subprocess
import json
import sys
import boto3
import io
import os

def get_hpa_info(namespace):
    # Fetch HPA information using kubectl command
    hpa_output = subprocess.check_output(['kubectl', 'get', 'hpa', '-n', namespace, '-o', 'json'])
    hpa_data = json.loads(hpa_output)
    return hpa_data['items']


def save_pod_config_to_s3(namespace, min_pods, max_pods, env):
    # Save pod configuration to S3
    s3 = boto3.client('s3')
    bucket_name = f"terraform-{env}-state"
    filename = f"{namespace}_pod_config.json"
    
    # Convert JSON string to file-like object
    json_data = json.dumps({'min_pods': min_pods, 'max_pods': max_pods})
    json_bytes = json_data.encode('utf-8')
    json_fileobj = io.BytesIO(json_bytes)
    
    # Upload file-like object to S3
    s3.upload_fileobj(json_fileobj, bucket_name, f'{env}-pod-config/{filename}')


def scale_down_to_one_pod(namespace, env):
    # Temporarily set minReplicas in HPA to 1 and save pod configuration to S3
    hpa_info = get_hpa_info(namespace)
    min_pods = {}
    max_pods = {}
    for item in hpa_info:
        deployment_name = item['metadata']['name']
        min_pods[deployment_name] = item['spec'].get('minReplicas', 1)
        max_pods[deployment_name] = item['spec']['maxReplicas']
    
    save_pod_config_to_s3(namespace, min_pods, max_pods, env)
    
    for item in hpa_info:
        deployment_name = item['metadata']['name']
        subprocess.run(['kubectl', 'patch', 'hpa', deployment_name, '-n', namespace, '--type=merge', '-p', '{"spec":{"minReplicas":1}}'])
    print("Temporarily set minReplicas in HPA to 1.")

def scale_down_to_zero_pod(namespace, env):
    # Temporarily set minReplicas in HPA to 0
    hpa_info = get_hpa_info(namespace)
    
    for item in hpa_info:
        deployment_name = item['metadata']['name']
        subprocess.run(['kubectl', 'patch', 'hpa', deployment_name, '-n', namespace, '--type=merge', '-p', '{"spec":{"minReplicas":0}}'])
    print("Temporarily set minReplicas in HPA to 0.")


def scale_to_default(namespace, env):
    try:
        # Restore min replicas in HPA to their original values
        s3 = boto3.client('s3')
        bucket_name = f"terraform-{env}-state"
        filename = f"{namespace}_pod_config.json"
        s3.download_file(bucket_name, f'{env}-pod-config/{filename}', filename)
        
        with open(filename, 'r') as file:
            pod_config = json.load(file)
            min_pods = pod_config['min_pods']
            
            for deployment_name, min_replicas in min_pods.items():
                subprocess.run(['kubectl', 'patch', 'hpa', deployment_name, '-n', namespace, '--type=merge', '-p', f'{{"spec":{{"minReplicas":{min_replicas}}}}}'])
                print(f"Reverted minReplicas for HPA {deployment_name} to {min_replicas}")
        
        print("Restored min replicas in HPA to their original values.")
    except Exception as e:
        print(f"Error scaling to default: {e}")


if __name__ == "__main__":

    if len(sys.argv) < 4:
        print("Usage: python script.py [action] [namespace] [env]")
        sys.exit(1)

    action = sys.argv[1]
    namespace = sys.argv[2]
    env = sys.argv[3]

    if action == "scale_down_to_one_pod":
        scale_down_to_one_pod(namespace, env)
    elif action == "scale_to_default":
        scale_to_default(namespace, env)
    else:
        print("Invalid action. Please choose 'scale_down_to_one_pod' or 'scale_to_default'.")
        sys.exit(1)

The Python script above scales down the minimum number of pod replicas during off-peak hours and restores them during peak hours. Before scaling down, it saves the current minReplicas value of each HPA to S3 so that scale_to_default can later restore it. The script is packaged into the ECR image referenced earlier and deployed as Kubernetes CronJobs that run at specific times defined by cron expressions.

Terraform Configuration for the Cron Jobs

# Terraform code for cron job

# Scale down min pods to 1 cron job
resource "kubernetes_cron_job_v1" "scale_down_cronjob" {
  metadata {
    name      = "scale-down-cronjob"
    namespace = "kube-system"
  }
  spec {
    concurrency_policy            = "Replace"
    failed_jobs_history_limit     = 5
    schedule                      = "30 12 * * *" # 6 PM IST
    starting_deadline_seconds     = 10
    successful_jobs_history_limit = 10
    job_template {
      metadata {}
      spec {
        backoff_limit              = 2
        ttl_seconds_after_finished = 10
        template {
          metadata {
            labels = {
              app = "scale-down-cronjob"
            }
          }
          spec {
            service_account_name = kubernetes_service_account.cronjob_service_account.metadata.0.name
            container {
              name    = "scale-down-pod"
              image   = "${var.account_number}.dkr.ecr.ap-south-1.amazonaws.com/scaling-pods-script-${var.env}:latest"
              command = ["python", "script.py", "scale_down_to_one_pod", "${var.namespace}", "${var.env}"]
            }
          }
        }
      }
    }
  }
}



# Scale pods back to default cron job
resource "kubernetes_cron_job_v1" "scale_to_default" {
  metadata {
    name      = "scale-to-default-cronjob"
    namespace = "kube-system"
  }
  spec {
    concurrency_policy            = "Replace"
    failed_jobs_history_limit     = 5
    schedule                      = "30 2 * * *" # 8 AM IST
    starting_deadline_seconds     = 10
    successful_jobs_history_limit = 10
    job_template {
      metadata {}
      spec {
        backoff_limit              = 2
        ttl_seconds_after_finished = 10
        template {
          metadata {
            labels = {
              app = "scale-to-default-cronjob"
            }
          }
          spec {
            service_account_name = kubernetes_service_account.cronjob_service_account.metadata.0.name
            container {
              name    = "scale-to-default-pod"
              image   = "${var.account_number}.dkr.ecr.ap-south-1.amazonaws.com/scaling-pods-script-${var.env}:latest"
              command = ["python", "script.py", "scale_to_default", "${var.namespace}", "${var.env}"]
            }
          }
        }
      }
    }
  }
}

With Terraform, you can automate the deployment of CronJobs in Kubernetes and schedule them to run at specific times using cron expressions. The code above provisions the two CronJobs that scale the minimum number of pod replicas down in the evening and restore them to their defaults in the morning.

Conclusion

Cost optimization in AWS EKS is a continuous process that requires careful planning and automation. By leveraging Terraform for infrastructure-as-code and implementing strategies such as Horizontal Pod Autoscaling, Cluster Autoscaler, and scheduled scaling with Cron Jobs, you can effectively optimize costs while ensuring optimal performance and resource utilization in your Kubernetes clusters. Automating these processes not only saves time and effort but also enables you to adapt dynamically to changing workloads, ultimately leading to significant cost savings in the long run.

Optimizing costs in AWS EKS is not a one-time task but an ongoing effort that requires monitoring, analysis, and adjustments based on evolving requirements and usage patterns. By following the best practices outlined in this guide and continuously refining your cost optimization strategies, you can maximize the value of your AWS EKS investment while minimizing unnecessary expenses.
