Amazon Auto Scaling Essentials

Auto-Scaling essentials, Launch Templates, ASG, Load Balancers, and best practices

@geomena · Thu Jul 10 2025 · 1,015 views

Auto-Scaling is a set of AWS services that automatically adjusts compute capacity (number of EC2 instances) based on current demand, ensuring availability and optimizing costs. It solves the problem of having to manually manage infrastructure during traffic spikes or low-demand periods.

When to use it: Applications with variable traffic, systems requiring high availability, and architectures that need automatic horizontal scaling.

Alternatives: Manual vertical scaling (changing instance type), containers with ECS/EKS auto-scaling, serverless (Lambda).

Key Concepts

| Concept | Description |
| --- | --- |
| AMI (Amazon Machine Image) | Complete snapshot of an EC2 instance including OS, installed software, and code. Serves as a template to create identical instances. |
| Launch Template | Reusable configuration that defines how to launch instances. Includes: instance type, AMI, security groups, user data. Supports versioning. |
| Auto-Scaling Group (ASG) | Collection of EC2 instances treated as a logical unit. Automatically maintains the desired number of instances based on policies. |
| Desired Capacity | Number of instances the ASG tries to keep active at all times. |
| Min/Max Size | Lower and upper limits of instances the ASG can maintain. |
| Scaling Policy | Rule that defines when and how much to scale (add/remove instances). |
| Target Tracking Policy | Policy that maintains a specific metric (e.g., CPU) at a target value. Scales automatically as needed. |
| Step Scaling Policy | Policy with multiple "steps" that responds proportionally based on metric severity. |
| Health Check | Periodic verification that determines if an instance is working correctly. |
| Grace Period | Time the ASG waits before verifying health checks on new instances. Allows them to fully start up. |
| Target Group | Logical collection of instances receiving traffic from a Load Balancer. Includes health check configuration. |
| Elastic Load Balancer (ELB) | Distributes incoming traffic among multiple EC2 instances across different Availability Zones. |
| Connection Draining | Period where a terminating instance stops receiving new requests but completes existing ones. |
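
To make the relationship between Desired Capacity, Min/Max Size, and a Target Tracking Policy more concrete, here is a rough sketch in plain shell arithmetic. It assumes a simple proportional model (capacity grows with metric/target, rounded up), which only approximates what the service actually does. The helper name `estimate_desired_capacity` is made up for this example.

```shell
# Rough estimate only: assumes capacity should scale proportionally with
# the metric (current * metric / target, rounded up), then clamps the
# result to the ASG's min/max size. Hypothetical helper, integer inputs.
estimate_desired_capacity() (
    current=$1; metric=$2; target=$3; min=$4; max=$5
    # Integer ceiling of current * metric / target
    wanted=$(( (current * metric + target - 1) / target ))
    [ "$wanted" -lt "$min" ] && wanted=$min
    [ "$wanted" -gt "$max" ] && wanted=$max
    echo "$wanted"
)

# 4 instances at 90% average CPU, target 70%, limits 2..10
estimate_desired_capacity 4 90 70 2 10   # prints 6
```

Whatever a policy asks for, the result always stays between Min and Max Size, which is why those limits act as the safety net for both cost and availability.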

Essential AWS CLI Commands

Creating Resources

# Create AMI from existing instance
aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "myapp-v1.0.0" \
    --description "Production app with dependencies" \
    --no-reboot  # Avoids downtime, but file system integrity of the image isn't guaranteed

# Create Launch Template
aws ec2 create-launch-template \
    --launch-template-name myapp-template \
    --version-description "Initial version" \
    --launch-template-data '{
        "ImageId": "ami-0abc123",
        "InstanceType": "t3.micro",
        "SecurityGroupIds": ["sg-12345"],
        "UserData": "IyEvYmluL2Jhc2g...",
        "IamInstanceProfile": {"Name": "ec2-role"},
        "KeyName": "my-keypair"
    }'

# Create new version of Launch Template
aws ec2 create-launch-template-version \
    --launch-template-id lt-0abc123 \
    --version-description "Updated to t3.small" \
    --launch-template-data '{"InstanceType": "t3.small"}'

# Create Target Group
aws elbv2 create-target-group \
    --name myapp-targets \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-12345 \
    --health-check-path /health \
    --health-check-interval-seconds 30 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 3

# Create Application Load Balancer
aws elbv2 create-load-balancer \
    --name myapp-lb \
    --subnets subnet-abc123 subnet-def456 \
    --security-groups sg-12345 \
    --scheme internet-facing \
    --type application

# Create Listener (connects LB with Target Group)
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:... \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:...

# Create Auto-Scaling Group
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest' \
    --min-size 2 \
    --max-size 10 \
    --desired-capacity 2 \
    --target-group-arns arn:aws:elasticloadbalancing:... \
    --vpc-zone-identifier "subnet-abc,subnet-def" \
    --health-check-type ELB \
    --health-check-grace-period 300

# Create Target Tracking Scaling Policy
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name myapp-asg \
    --policy-name target-tracking-cpu \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0
    }'
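
For comparison, a Step Scaling Policy for the same ASG could look roughly like the sketch below. The step boundaries and adjustment sizes are illustrative assumptions, and the wrapper function `create_step_policy` is a hypothetical name used only to keep the example reusable.

```shell
# Sketch: a Step Scaling policy. Bounds are offsets from the threshold of
# the CloudWatch alarm that triggers the policy (for example CPU >= 60%).
# The step sizes here are assumptions chosen for illustration.
create_step_policy() {
    aws autoscaling put-scaling-policy \
        --auto-scaling-group-name "$1" \
        --policy-name step-scaling-cpu \
        --policy-type StepScaling \
        --adjustment-type ChangeInCapacity \
        --metric-aggregation-type Average \
        --step-adjustments \
            MetricIntervalLowerBound=0,MetricIntervalUpperBound=20,ScalingAdjustment=1 \
            MetricIntervalLowerBound=20,ScalingAdjustment=3
}

# Usage: create_step_policy myapp-asg
```

Unlike target tracking, a step policy does nothing on its own: it still needs a CloudWatch alarm whose action points at the policy ARN returned by this command.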

Querying Resources

# View your own AMIs
aws ec2 describe-images --owners self

# View Launch Templates
aws ec2 describe-launch-templates
aws ec2 describe-launch-template-versions --launch-template-id lt-0abc123

# View Auto-Scaling Groups
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg

# View instances in the ASG
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg \
    --query 'AutoScalingGroups[0].Instances[*].[InstanceId,LifecycleState,HealthStatus,AvailabilityZone]' \
    --output table

# View ASG activities (troubleshooting)
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name myapp-asg \
    --max-records 10

# View target health
aws elbv2 describe-target-health \
    --target-group-arn arn:aws:elasticloadbalancing:...

# View Load Balancers
aws elbv2 describe-load-balancers

Modifying Resources

# Update ASG (change capacity)
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --min-size 3 \
    --max-size 15 \
    --desired-capacity 5

# Update ASG (change Launch Template)
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest'

# Manually change desired capacity
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name myapp-asg \
    --desired-capacity 8
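
Changing the Launch Template on an ASG only affects instances launched afterwards; instances that are already running keep their old configuration. One way to roll them over is an instance refresh, sketched below. The preference values are assumptions to adjust per application, and `start_refresh` is a hypothetical wrapper name.

```shell
# Sketch: replace running instances gradually so they pick up the new
# Launch Template version. 90% healthy and 300s warmup are example values.
start_refresh() {
    aws autoscaling start-instance-refresh \
        --auto-scaling-group-name "$1" \
        --preferences '{"MinHealthyPercentage": 90, "InstanceWarmup": 300}'
}

# Usage: start_refresh myapp-asg
# Progress: aws autoscaling describe-instance-refreshes --auto-scaling-group-name myapp-asg
```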

Deleting Resources

# Delete Auto-Scaling Group (terminates instances)
aws autoscaling delete-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --force-delete

# Delete Load Balancer
aws elbv2 delete-load-balancer \
    --load-balancer-arn arn:aws:elasticloadbalancing:...

# Delete Target Group
aws elbv2 delete-target-group \
    --target-group-arn arn:aws:elasticloadbalancing:...

# Delete Launch Template
aws ec2 delete-launch-template \
    --launch-template-id lt-0abc123

# Deregister AMI
aws ec2 deregister-image --image-id ami-0abc123
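
Deregistering an AMI leaves its EBS snapshots behind, and they continue to incur storage cost. A minimal cleanup sketch follows; the function name `delete_ami_with_snapshots` is hypothetical, and the approach assumes the snapshots are not shared with any other AMI.

```shell
# Sketch: look up the AMI's snapshot IDs BEFORE deregistering (afterwards
# the mapping is no longer available), then delete each snapshot.
delete_ami_with_snapshots() (
    ami_id=$1
    snapshots=$(aws ec2 describe-images --image-ids "$ami_id" \
        --query 'Images[0].BlockDeviceMappings[*].Ebs.SnapshotId' \
        --output text)
    aws ec2 deregister-image --image-id "$ami_id" || exit 1
    for snap in $snapshots; do
        # Non-EBS mappings show up as "None" in text output; skip them
        [ "$snap" = "None" ] && continue
        aws ec2 delete-snapshot --snapshot-id "$snap"
    done
)

# Usage: delete_ami_with_snapshots ami-0abc123
```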

Architecture and Flows

Complete Architecture Diagram

Sequence Diagram: Automatic Scaling

Health Check Flow

Best Practices Checklist

Security

  • Instances behind the LB: Configure security groups so instances only accept traffic from the Load Balancer, not directly from the Internet
  • IAM Instance Profiles: Assign IAM roles to instances instead of hardcoding credentials
  • Secrets Management: Use AWS Secrets Manager or Parameter Store for sensitive variables, not user data
  • Layered Security Groups: Separate SGs for LB (accepts Internet) and instances (accepts only LB)
  • Secure Keypairs: Rotate keypairs regularly and use AWS Systems Manager Session Manager instead of direct SSH

Cost Optimization

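  • Target Tracking over Step Scaling: Target tracking keeps a metric near its target without hand-tuned thresholds, so capacity follows demand more closely and idle instances are removed sooner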
  • Appropriate min size: Don't set min-size higher than necessary
  • Scheduled Scaling: For predictable loads, scale up/down at known times
  • Appropriate instance types: Use free tier (t3.micro in sa-east-1) when applicable
  • Spot Instances: Consider mixing On-Demand with Spot for interruption-tolerant workloads
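
As an illustration of the Spot bullet above, the sketch below switches an existing ASG to a mixed On-Demand/Spot configuration. The instance types, the On-Demand base of 2, and the wrapper name `enable_spot_mix` are assumptions for this example, not recommendations from the article.

```shell
# Sketch: keep a base of 2 On-Demand instances and launch everything
# above that as Spot. $1 = ASG name, $2 = Launch Template ID.
enable_spot_mix() {
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name "$1" \
        --mixed-instances-policy '{
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateId": "'"$2"'",
                    "Version": "$Latest"
                },
                "Overrides": [
                    {"InstanceType": "t3.micro"},
                    {"InstanceType": "t3a.micro"}
                ]
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": 2,
                "OnDemandPercentageAboveBaseCapacity": 0,
                "SpotAllocationStrategy": "price-capacity-optimized"
            }
        }'
}

# Usage: enable_spot_mix myapp-asg lt-0abc123
```

Listing more than one instance type gives the ASG more Spot pools to draw from, which tends to reduce interruptions.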

Performance

  • Multi-AZ deployment: Distribute instances across at least 2 Availability Zones
  • Health check interval: Not too frequent (consumes resources), typically 30s is adequate
  • Sufficient grace period: Minimum 5 minutes for applications with slow startup
  • Connection draining: Configure so in-progress requests complete before terminating instances
  • Optimized AMIs: Pre-install everything possible in the AMI to reduce startup time
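
On an Application Load Balancer, connection draining is configured as the target group's deregistration delay. A minimal sketch, assuming a 60-second window is enough for in-flight requests (the wrapper name `set_draining_timeout` is hypothetical):

```shell
# Sketch: set how long a deregistering target may finish in-flight
# requests. $1 = target group ARN, $2 = timeout in seconds.
set_draining_timeout() {
    aws elbv2 modify-target-group-attributes \
        --target-group-arn "$1" \
        --attributes "Key=deregistration_delay.timeout_seconds,Value=$2"
}

# Usage: set_draining_timeout <target-group-arn> 60
```

The default is 300 seconds, so lowering it can noticeably speed up scale-in and deployments when requests are short.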

Reliability

  • Robust health checks: The /health endpoint should verify critical dependencies (DB, Redis, etc.)
  • Desired capacity > 1: Always have at least 2 instances for fault tolerance
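  • ELB health check type: EC2 health checks only detect instance-level failures; ELB health checks also catch an application that has stopped responding, so they are more reliable for web applications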
  • Unhealthy threshold: Configure to tolerate temporary failures (typically 3 consecutive failures)
  • Metrics monitoring: CloudWatch alarms for unusual scaling or unhealthy instances
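
The interval and threshold settings together determine how quickly a failure is noticed. As a rough approximation (ignoring the health check timeout), the time to change state is the interval multiplied by the threshold. The helper below is a hypothetical name for that arithmetic, using the 30s interval and the thresholds of 3 and 2 from the target group created earlier.

```shell
# Approximate seconds of consecutive failed (or passed) checks before the
# target group flips an instance's state: interval * threshold.
state_change_seconds() {
    echo $(( $1 * $2 ))
}

state_change_seconds 30 3   # unhealthy after about 90 seconds
state_change_seconds 30 2   # healthy after about 60 seconds
```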

Operational Excellence

  • Versioned Launch Templates: Use versions instead of modifying the existing template
  • Consistent tagging: Tags to identify ASG instances, environment, project
  • Centralized logs: Send application logs to CloudWatch Logs for troubleshooting
  • Document user data: Comment user data scripts for future maintenance
  • AMI testing: Validate AMIs before using them in production

Common Mistakes to Avoid

Using instance types not eligible for Free Tier in your region

Why it happens: Free tier eligibility varies by region (t2.micro in us-east-1, t3.micro in sa-east-1)

How to avoid it:

# Verify eligible types in your region
aws ec2 describe-instance-types \
    --filters "Name=free-tier-eligible,Values=true" \
    --region your-region

User data with incompatible dependencies

Why it happens: Installing software that requires library versions not available in the OS (e.g., Node.js 18 on Amazon Linux 2 with glibc 2.26)

How to avoid it:

  • Use official OS repos when possible
  • Test user data on a standalone instance before adding it to the Launch Template
  • Check logs with aws ec2 get-console-output --instance-id i-xxx

Grace period too short

Why it happens: The ASG starts evaluating health checks before the application has finished starting, marks the instance unhealthy, and terminates it

How to avoid it: Configure a grace period of at least 300 seconds (5 minutes) for typical applications. Increase if your app takes longer to start.

Not separating security groups between LB and instances

Why it happens: Opening port 80 to the Internet (0.0.0.0/0) directly on instances

How to avoid it:

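# Instances: only accept HTTP from the Load Balancer's security group
aws ec2 authorize-security-group-ingress \
    --group-id sg-of-instances \
    --protocol tcp --port 80 \
    --source-group sg-of-load-balancer

# DON'T do this on instances:
# --cidr 0.0.0.0/0  # Insecure: opens the port to the whole Internet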

Forgetting to configure health check endpoint in the application

Why it happens: The LB does health checks to /health but the application doesn't have that endpoint

How to avoid it: Implement a /health endpoint that:

  • Returns 200 OK when everything is fine
  • Verifies critical dependencies (DB, cache)
  • Is lightweight (no expensive operations)
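
Before relying on the Load Balancer, the endpoint can be probed from a shell in the same way the health check would. This is a minimal sketch; the function name `check_health` and the 5-second timeout are assumptions.

```shell
# Sketch: succeed only if the URL answers with HTTP 200 within 5 seconds.
check_health() (
    url=$1
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    [ "$code" = "200" ]
)

# Usage (on the instance itself):
# check_health http://localhost/health && echo healthy || echo unhealthy
```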

Creating infinite replacement cycles

Why it happens: Instances fail health checks → ASG terminates them → new instances also fail → infinite cycle

How to avoid it: Monitor ASG activities:

aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name myapp-asg \
    --max-records 5

If you see continuous replacements, investigate user data logs immediately.
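
While investigating, it can help to stop the ASG from replacing instances so that a failing one stays around long enough to inspect. One possible approach, sketched here with hypothetical wrapper names, is to suspend the ReplaceUnhealthy process temporarily.

```shell
# Sketch: pause and resume automatic replacement of unhealthy instances.
# While paused, unhealthy instances remain in the group, so this is only
# meant for short debugging sessions.
pause_replacements() {
    aws autoscaling suspend-processes \
        --auto-scaling-group-name "$1" \
        --scaling-processes ReplaceUnhealthy
}

resume_replacements() {
    aws autoscaling resume-processes \
        --auto-scaling-group-name "$1" \
        --scaling-processes ReplaceUnhealthy
}

# Usage: pause_replacements myapp-asg   (debug)   resume_replacements myapp-asg
```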

Hardcoding configuration in AMIs

Why it happens: Including production environment variables in the AMI, making it impossible to reuse for staging

How to avoid it:

  • AMI: only software and code
  • User data: dynamic configuration per environment
  • Secrets Manager/Parameter Store: credentials and secrets
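
A user data script following this split might fetch its secrets at boot rather than carrying them in the AMI. The sketch below assumes the value lives in Parameter Store under a hypothetical path, `/myapp/prod/db_password`, and that the instance profile allows `ssm:GetParameter` on it; `fetch_param` is a made-up helper name.

```shell
# Sketch for a user data script: read one (optionally encrypted)
# parameter and print its value.
fetch_param() {
    aws ssm get-parameter --name "$1" --with-decryption \
        --query 'Parameter.Value' --output text
}

# Usage inside user data (parameter path is an assumption):
# DB_PASSWORD=$(fetch_param /myapp/prod/db_password)
```

Switching the path prefix (for example from `/myapp/prod` to `/myapp/staging`) is then enough to reuse the same AMI in another environment.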

Not testing the Load Balancer before adding ASG

Why it happens: Networking/security group issues aren't detected until everything is configured

How to avoid it: Create a manual instance, register it in the Target Group, verify health checks, THEN create the ASG.
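
That manual check can be scripted roughly as follows. The function name `verify_target` is hypothetical; the ARN and instance ID are passed in as arguments.

```shell
# Sketch: register a manually launched instance, wait until the target
# group reports it in service, then print its health state.
# $1 = target group ARN, $2 = instance ID
verify_target() {
    aws elbv2 register-targets --target-group-arn "$1" --targets "Id=$2" &&
    aws elbv2 wait target-in-service --target-group-arn "$1" --targets "Id=$2" &&
    aws elbv2 describe-target-health --target-group-arn "$1" --targets "Id=$2" \
        --query 'TargetHealthDescriptions[0].TargetHealth.State' --output text
}

# Usage: verify_target <target-group-arn> i-1234567890abcdef0
```

If the wait step times out, the problem is usually in the security groups, the health check path, or the application itself, and it is far easier to debug with one instance than with an ASG cycling replacements.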

Cost Considerations

What Generates Costs

| Resource | Cost | Notes |
| --- | --- | --- |
| Running EC2 instances | ~$0.0104/hour (t3.micro On-Demand at the us-east-1 rate; sa-east-1 is higher) | Multiplied by number of instances |
| Application Load Balancer | ~$0.0225/hour fixed + ~$0.008/LCU-hour | LCU = highest of: new connections, active connections, processed bytes, rule evaluations |
| Data Transfer OUT | ~$0.15/GB | First 100 GB/month free |
| Data Transfer between AZs | ~$0.01/GB | Within the same region |
| Stored AMIs | ~$0.05/GB-month | Associated EBS snapshots |
| Auto-Scaling service | FREE | You only pay for the resources it creates |
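
To get a feel for how these items add up, here is a rough monthly estimate in shell. It covers only EC2 hours plus the ALB's fixed hourly charge and ignores LCU, data transfer, and storage. The rates are passed in as arguments; the numbers used below are the approximate figures from the table, not live prices, and `monthly_estimate` is a hypothetical helper.

```shell
# Rough monthly estimate: instances * EC2 rate * hours + ALB rate * hours.
# $1 = instance count, $2 = EC2 $/hour, $3 = ALB $/hour, $4 = hours
monthly_estimate() {
    awk -v n="$1" -v ec2="$2" -v alb="$3" -v h="$4" \
        'BEGIN { printf "%.2f\n", n * ec2 * h + alb * h }'
}

# 2 instances for a 30-day month (720 hours)
monthly_estimate 2 0.0104 0.0225 720   # prints 31.18
```

In this example the Load Balancer's fixed charge (about $16.20) is slightly larger than the two instances combined (about $14.98), which is worth keeping in mind for small labs.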

How to Optimize Costs

Scale efficiently:

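# Target Tracking instead of keeping idle instances
# Scales only when needed
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name myapp-asg \
    --policy-name target-tracking-cpu --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"}, "TargetValue": 70.0}'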

Scheduled Scaling for predictable loads:

# Reduce to minimum outside business hours
# (the cron expression is evaluated in UTC unless --time-zone is set)
aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name myapp-asg \
    --scheduled-action-name scale-down-night \
    --recurrence "0 22 * * *" \
    --min-size 1 --desired-capacity 1
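
A nightly scale-down normally needs a matching action that restores capacity before traffic returns. A sketch of that counterpart, assuming an 08:00 weekday start and a baseline of 2 instances (both values and the wrapper name `schedule_morning_scale_up` are illustrative):

```shell
# Sketch: restore the daytime baseline on weekdays at 08:00.
# Like the scale-down action, the schedule is in UTC by default.
schedule_morning_scale_up() {
    aws autoscaling put-scheduled-update-group-action \
        --auto-scaling-group-name "$1" \
        --scheduled-action-name scale-up-morning \
        --recurrence "0 8 * * 1-5" \
        --min-size 2 --desired-capacity 2
}

# Usage: schedule_morning_scale_up myapp-asg
```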

Delete unused resources:

  • Old AMIs and their snapshots
  • Testing Load Balancers
  • Development ASGs that aren't actively used

Reserved Instances or Savings Plans:

  • If you maintain predictable base load (e.g., min-size 2 always), consider Reserved Instances for up to 72% savings

Free Tier Limits

EC2 Free Tier (first 12 months):

  • 750 hours/month of t2.micro (us-east-1) or t3.micro (other regions)
  • 2 t3.micro instances running 24/7 = 1440 hours/month → exceeds free tier
  • 1 instance 24/7 = within free tier
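
The same arithmetic can be checked quickly for any lab setup. The helper below is a hypothetical sketch that compares instance-hours against the 750-hour monthly allowance.

```shell
# Instance-hours for N instances running 24/7 over D days, compared with
# the 750-hour monthly allowance. $1 = instances, $2 = days
free_tier_check() (
    hours=$(( $1 * 24 * $2 ))
    if [ "$hours" -le 750 ]; then
        echo "$hours hours: within free tier"
    else
        echo "$hours hours: exceeds free tier by $(( hours - 750 ))"
    fi
)

free_tier_check 2 30   # 1440 hours: exceeds free tier by 690
free_tier_check 1 31   # 744 hours: within free tier
```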

ELB Free Tier:

  • Application Load Balancer: NOT included in free tier (always generates cost)
  • Classic Load Balancer: 750 hours/month + 15GB processed (first 12 months)

Recommendation for learning:

  • Use 1-2 t3.micro instances for short sessions (two instances stay within the 750 free hours only if they don't run 24/7)
  • Delete resources immediately after practicing
  • Consider alternatives without LB for labs (use direct IPs)

Integration with Other Services

| Service | How it integrates | Typical use case |
| --- | --- | --- |
| Route 53 | DNS pointing to the Load Balancer | www.myapp.com → ALB DNS name with automatic failover |
| CloudWatch | Sends instance and ASG metrics | Monitor CPU/RAM (RAM requires the CloudWatch agent), create alarms, centralized logs |
| IAM | Instance Profiles assigned via Launch Template | Give instances permissions to access S3, DynamoDB, Secrets Manager |
| VPC | ASG launches instances in specific subnets | Multi-AZ deployment for high availability |
| RDS | Instances connect to shared DB | Stateless application that reads/writes to centralized database |
| ElastiCache | Instances use shared cache | User sessions, frequently queried data |
| S3 | Instances upload/download files | User uploads, logs, static assets |
| Secrets Manager | User data retrieves secrets at startup | Database passwords, API keys without hardcoding |
| CloudFormation | Defines entire infrastructure as code | Deploy complete stack (VPC + ASG + LB + RDS) with one template |
| CodeDeploy | Automatic deploys to ASG instances | CI/CD: push code → CodeDeploy updates instances without downtime |
| SNS | ASG sends event notifications | Alerts when instance is created/terminated, when it scales |
| Lambda | Triggers on ASG events | Execute custom logic when instance starts (e.g., register in external service) |
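
As one example of these integrations, the SNS row can be wired up with a single command. The sketch below assumes an SNS topic already exists; the wrapper name `enable_asg_notifications` and the choice of three event types are illustrative.

```shell
# Sketch: have the ASG publish launch/terminate events to an SNS topic.
# $1 = ASG name, $2 = SNS topic ARN
enable_asg_notifications() {
    aws autoscaling put-notification-configuration \
        --auto-scaling-group-name "$1" \
        --topic-arn "$2" \
        --notification-types \
            autoscaling:EC2_INSTANCE_LAUNCH \
            autoscaling:EC2_INSTANCE_TERMINATE \
            autoscaling:EC2_INSTANCE_LAUNCH_ERROR
}

# Usage: enable_asg_notifications myapp-asg <sns-topic-arn>
```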
