OUTDATED!! This feature is now incorporated into ECS.
See the following links for setting up the
This article describes a way to replace ECS instances without terminating them while they still have running tasks.
My solution is based on Amazon's ECS container draining code.
The advantages of this solution over AWS’s implementation:
- Efficiency: Event driven and does not use sleep(5) 🤯
- Maintainable: Deployed with Terraform, not CloudFormation 🤮
Overview of Resources
Logic Flow
- The auto scaling group triggers a lifecycle hook when it plans to remove an instance, and the instance is put in the Terminating:Wait state
- The lifecycle hook publishes a notification to an SNS topic
- A Lambda function is subscribed to the SNS topic
- The function sets the ECS instance to drain
- Once the instance is fully drained (no running or pending tasks), the EventBridge rule matches
- The rule triggers the second Lambda function
- The function tells the auto scaling group to continue the termination of the instance (Terminating:Proceed)
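The "fully drained" condition in the flow above can be sketched as a small predicate over the ECS Container Instance State Change event that EventBridge delivers. This is only a minimal sketch in Python, not the module's actual rule or code: the function name is_fully_drained is hypothetical, and it assumes the event detail carries status, runningTasksCount and pendingTasksCount.

```python
def is_fully_drained(event: dict) -> bool:
    """Return True when a container instance is draining and has no tasks left.

    Assumption: `event` is an "ECS Container Instance State Change" event
    as delivered by EventBridge. The function name is hypothetical.
    """
    detail = event.get("detail", {})
    return (
        event.get("detail-type") == "ECS Container Instance State Change"
        and detail.get("status") == "DRAINING"
        and detail.get("runningTasksCount") == 0
        and detail.get("pendingTasksCount") == 0
    )
```

In practice the same condition can live in the EventBridge rule's event pattern, so the second function is only invoked when there is really nothing left to wait for.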
Code
To preface this part: I am not a developer, so my code could probably be improved.
If you have any improvements, please contribute them. The code can be found in this repository.
Drain ECS Instance
The function can be found here on GitHub.
- First, the function takes an SNS message as input. From it, the function checks whether the instance is terminating and reads the ID of the EC2 server
- Then it gets the ECS instance ID from the EC2 server ID
- Finally, it drains the instance using the ECS instance ID
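The three steps above could look roughly like the following. This is a hedged sketch in Python, not the function from the repository: the name drain_instance, the cluster argument and the injected ecs_client (expected to behave like boto3.client("ecs")) are assumptions made for illustration.

```python
import json
from typing import Optional


def drain_instance(event: dict, ecs_client, cluster: str) -> Optional[str]:
    """Drain the ECS container instance behind a terminating EC2 instance.

    Hypothetical helper: `ecs_client` is assumed to be a boto3 ECS client.
    Returns the drained container instance ARN, or None if nothing was done.
    """
    # The lifecycle hook notification is a JSON string inside the SNS record.
    message = json.loads(event["Records"][0]["Sns"]["Message"])

    # Ignore test notifications and anything that is not a termination.
    if message.get("LifecycleTransition") != "autoscaling:EC2_INSTANCE_TERMINATING":
        return None
    ec2_instance_id = message["EC2InstanceId"]

    # Map the EC2 instance ID to its ECS container instance ARN.
    response = ecs_client.list_container_instances(
        cluster=cluster,
        filter=f"ec2InstanceId in ['{ec2_instance_id}']",
    )
    arns = response["containerInstanceArns"]
    if not arns:
        return None  # the instance is not registered in this cluster

    # Stop new task placement and let the scheduler move existing tasks away.
    ecs_client.update_container_instances_state(
        cluster=cluster, containerInstances=arns[:1], status="DRAINING"
    )
    return arns[0]
```

Passing the client in as an argument keeps the logic testable without AWS access. In a Lambda handler it would be called with a real boto3 ECS client and the cluster name, for example read from an environment variable.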
Complete ECS Lifecycle
The function can be found here on GitHub.
- When the function is triggered by the EventBridge rule, it gets the auto scaling group's name from the tag on the EC2 server
- Next, it gets the lifecycle hook's name
- Finally, it tells the auto scaling group to proceed with the termination of the EC2 server
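As a rough illustration of those steps, here is a minimal Python sketch. It is not the repository's code: complete_termination is a hypothetical name, the clients are assumed to behave like boto3.client("ec2") and boto3.client("autoscaling"), and it assumes the tag in question is aws:autoscaling:groupName, which Auto Scaling adds to the instances it launches.

```python
def complete_termination(ec2_instance_id: str, ec2_client, asg_client) -> bool:
    """Tell the auto scaling group to proceed with terminating an instance.

    Hypothetical helper: the clients are assumed to be boto3 EC2 and
    Auto Scaling clients. Returns True if the lifecycle action was completed.
    """
    # Assumption: the group name is read from the aws:autoscaling:groupName tag.
    tags = ec2_client.describe_tags(
        Filters=[
            {"Name": "resource-id", "Values": [ec2_instance_id]},
            {"Name": "key", "Values": ["aws:autoscaling:groupName"]},
        ]
    )["Tags"]
    if not tags:
        return False  # not managed by an auto scaling group
    asg_name = tags[0]["Value"]

    # Find the termination lifecycle hook attached to that group.
    hooks = asg_client.describe_lifecycle_hooks(
        AutoScalingGroupName=asg_name
    )["LifecycleHooks"]
    terminating = [
        hook for hook in hooks
        if hook["LifecycleTransition"] == "autoscaling:EC2_INSTANCE_TERMINATING"
    ]
    if not terminating:
        return False

    # CONTINUE lets the group move on and terminate the instance.
    asg_client.complete_lifecycle_action(
        LifecycleHookName=terminating[0]["LifecycleHookName"],
        AutoScalingGroupName=asg_name,
        LifecycleActionResult="CONTINUE",
        InstanceId=ec2_instance_id,
    )
    return True
```

The EC2 instance ID would come from the EventBridge event that triggered the function. Completing the action by instance ID avoids having to carry the lifecycle action token from the first function to the second.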
Spinning it Up
Using Terraform, we can deploy all the resources. The only two resources not created by the module are the ECS cluster and the Auto Scaling Group.
The full Terraform module can be found on GitHub.
module "ecs_asg_lifecycle" {
  source = "git::https://github.com/andrew-aiken/website-ref.git//ecs_asg_lifecycle"

  autoscaling_group_name      = "ecs-asg-name"
  ecs_cluster_name            = "ecs-cluster-name"
  drain_ecs_instance_name     = "drain-ecs-instance"
  complete_ecs_lifecycle_name = "complete-ecs-lifecycle"

  tags = {
    key = "value"
  }
}

terraform {
  required_version = "1.5.5"

  backend "local" {
    path = "terraform.tfstate"
  }

  required_providers {
    archive = {
      source  = "hashicorp/archive"
      version = "2.4.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "5.12.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}
Put the code above in a file (main.tf).
To deploy the code, you need the AWS CLI set up and permissions to deploy the resources.
terraform init
terraform apply
The module will deploy 16 resources. Once you have reviewed all the changes, type yes and Terraform will deploy them.