Amazon Auto Scaling Essentials

Auto-Scaling essentials, Launch Templates, ASG, Load Balancers, and best practices


Auto-Scaling is a set of AWS services that automatically adjusts compute capacity (number of EC2 instances) based on current demand, ensuring availability and optimizing costs. It solves the problem of having to manually manage infrastructure during traffic spikes or low-demand periods.

When to use it: Applications with variable traffic, systems requiring high availability, and architectures that need automatic horizontal scaling.

Alternatives: Manual vertical scaling (changing instance type), containers with ECS/EKS auto-scaling, serverless (Lambda).

Key Concepts

Concept	Description
AMI (Amazon Machine Image)	Complete snapshot of an EC2 instance including OS, installed software, and code. Serves as a template to create identical instances
Launch Template	Reusable configuration that defines how to launch instances. Includes instance type, AMI, security groups, and user data. Supports versioning
Auto-Scaling Group (ASG)	Collection of EC2 instances treated as a logical unit. Automatically maintains the desired number of instances based on policies
Desired Capacity	Number of instances the ASG tries to keep active at all times
Min/Max Size	Lower and upper limits of instances the ASG can maintain
Scaling Policy	Rule that defines when and how much to scale (add/remove instances)
Target Tracking Policy	Policy that maintains a specific metric (e.g., CPU) at a target value. Scales automatically as needed
Step Scaling Policy	Policy with multiple "steps" that responds proportionally based on metric severity
Health Check	Periodic verification that determines if an instance is working correctly
Grace Period	Time the ASG waits before verifying health checks on new instances. Allows them to fully start up
Target Group	Logical collection of instances receiving traffic from a Load Balancer. Includes health check configuration
Elastic Load Balancer (ELB)	Distributes incoming traffic among multiple EC2 instances across different Availability Zones
Connection Draining	Period during which a terminating instance stops receiving new requests but completes in-flight ones (called deregistration delay on ALB/NLB target groups)
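
To make the Target Tracking row more concrete, the sketch below approximates the capacity such a policy aims for: current capacity scaled by the ratio of the observed metric to the target, rounded up and clamped to Min/Max. The function name `estimate_desired_capacity` and the formula are illustrative assumptions, not the service's actual algorithm, which also accounts for cooldowns, instance warm-up, and its own CloudWatch alarms.

```shell
# Rough estimate of the capacity a target tracking policy would aim for.
# Assumption: desired ~= ceil(current * metric / target), clamped to [min, max].
# Inputs are whole numbers (metric and target in percent).
estimate_desired_capacity() (
    current=$1; metric=$2; target=$3; min=$4; max=$5
    # Integer ceiling division: ceil(a / b) = (a + b - 1) / b
    desired=$(( (current * metric + target - 1) / target ))
    [ "$desired" -lt "$min" ] && desired=$min
    [ "$desired" -gt "$max" ] && desired=$max
    echo "$desired"
)

# 4 instances averaging 90% CPU with a 70% target, limits 2..10
estimate_desired_capacity 4 90 70 2 10   # prints 6
```

The same call with 20% CPU returns 2: the estimate would be lower, but Min Size stops the scale-in.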

Essential AWS CLI Commands

Creating Resources

# Create AMI from existing instance
aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "myapp-v1.0.0" \
    --description "Production app with dependencies" \
    --no-reboot  # Avoids downtime, but file system integrity of the image is not guaranteed

# Create Launch Template
aws ec2 create-launch-template \
    --launch-template-name myapp-template \
    --version-description "Initial version" \
    --launch-template-data '{
        "ImageId": "ami-0abc123",
        "InstanceType": "t3.micro",
        "SecurityGroupIds": ["sg-12345"],
        "UserData": "IyEvYmluL2Jhc2g...",
        "IamInstanceProfile": {"Name": "ec2-role"},
        "KeyName": "my-keypair"
    }'

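# Create new version of Launch Template
# --source-version copies the remaining settings from version 1; without it,
# the new version contains only the fields passed in --launch-template-data
aws ec2 create-launch-template-version \
    --launch-template-id lt-0abc123 \
    --source-version 1 \
    --version-description "Updated to t3.small" \
    --launch-template-data '{"InstanceType": "t3.small"}'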

# Create Target Group
aws elbv2 create-target-group \
    --name myapp-targets \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-12345 \
    --health-check-path /health \
    --health-check-interval-seconds 30 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 3

# Create Application Load Balancer
aws elbv2 create-load-balancer \
    --name myapp-lb \
    --subnets subnet-abc123 subnet-def456 \
    --security-groups sg-12345 \
    --scheme internet-facing \
    --type application

# Create Listener (connects LB with Target Group)
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:... \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:...

# Create Auto-Scaling Group
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest' \
    --min-size 2 \
    --max-size 10 \
    --desired-capacity 2 \
    --target-group-arns arn:aws:elasticloadbalancing:... \
    --vpc-zone-identifier "subnet-abc,subnet-def" \
    --health-check-type ELB \
    --health-check-grace-period 300

# Create Target Tracking Scaling Policy
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name myapp-asg \
    --policy-name target-tracking-cpu \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0
    }'

Querying Resources

# View your own AMIs
aws ec2 describe-images --owners self

# View Launch Templates
aws ec2 describe-launch-templates
aws ec2 describe-launch-template-versions --launch-template-id lt-0abc123

# View Auto-Scaling Groups
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg

# View instances in the ASG
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg \
    --query 'AutoScalingGroups[0].Instances[*].[InstanceId,LifecycleState,HealthStatus,AvailabilityZone]' \
    --output table

# View ASG activities (troubleshooting)
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name myapp-asg \
    --max-records 10

# View target health
aws elbv2 describe-target-health \
    --target-group-arn arn:aws:elasticloadbalancing:...

# View Load Balancers
aws elbv2 describe-load-balancers

Modifying Resources

# Update ASG (change capacity)
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --min-size 3 \
    --max-size 15 \
    --desired-capacity 5

# Update ASG (change Launch Template)
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest'

# Manually change desired capacity
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name myapp-asg \
    --desired-capacity 8

Deleting Resources

# Delete Auto-Scaling Group (terminates instances)
aws autoscaling delete-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --force-delete

# Delete Load Balancer
aws elbv2 delete-load-balancer \
    --load-balancer-arn arn:aws:elasticloadbalancing:...

# Delete Target Group
aws elbv2 delete-target-group \
    --target-group-arn arn:aws:elasticloadbalancing:...

# Delete Launch Template
aws ec2 delete-launch-template \
    --launch-template-id lt-0abc123

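# Deregister AMI (its EBS snapshots are not deleted automatically)
aws ec2 deregister-image --image-id ami-0abc123
aws ec2 delete-snapshot --snapshot-id snap-0abc123  # placeholder snapshot ID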

Architecture and Flows

Complete Architecture Diagram

Sequence Diagram: Automatic Scaling

Health Check Flow

Best Practices Checklist

Security

Instances behind the LB: Configure security groups so instances only accept traffic from the Load Balancer, not directly from the Internet
IAM Instance Profiles: Assign IAM roles to instances instead of hardcoding credentials
Secrets Management: Use AWS Secrets Manager or Parameter Store for sensitive variables, not user data
Layered Security Groups: Separate SGs for LB (accepts Internet) and instances (accepts only LB)
Secure Keypairs: Rotate keypairs regularly and use AWS Systems Manager Session Manager instead of direct SSH

Cost Optimization

Target Tracking over Step Scaling: Simpler to configure; it adds or removes only the capacity needed to keep the metric near the target
Appropriate min size: Don't set min-size higher than necessary
Scheduled Scaling: For predictable loads, scale up/down at known times
Appropriate instance types: Use free tier (t3.micro in sa-east-1) when applicable
Spot Instances: Consider mixing On-Demand with Spot for interruption-tolerant workloads

Performance

Multi-AZ deployment: Distribute instances across at least 2 Availability Zones
Health check interval: Not too frequent (consumes resources), typically 30s is adequate
Sufficient grace period: Minimum 5 minutes for applications with slow startup
Connection draining: Configure so in-progress requests complete before terminating instances
Optimized AMIs: Pre-install everything possible in the AMI to reduce startup time

Reliability

Robust health checks: The /health endpoint should verify critical dependencies (DB, Redis, etc.)
Desired capacity > 1: Always have at least 2 instances for fault tolerance
ELB health check type: More reliable than EC2 health checks for web applications
Unhealthy threshold: Configure to tolerate temporary failures (typically 3 consecutive failures)
Metrics monitoring: CloudWatch alarms for unusual scaling or unhealthy instances

Operational Excellence

Versioned Launch Templates: Use versions instead of modifying the existing template
Consistent tagging: Tags to identify ASG instances, environment, project
Centralized logs: Send application logs to CloudWatch Logs for troubleshooting
Document user data: Comment user data scripts for future maintenance
AMI testing: Validate AMIs before using them in production
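
A new Launch Template version only affects instances launched afterwards. One way to roll it out to running instances is an instance refresh, sketched below. The function name `roll_out_latest_template` and the preference values (90% healthy, 300s warm-up) are illustrative assumptions; the sketch also assumes the ASG references the template with Version='$Latest'.

```shell
# Roll a new Launch Template version out to running instances.
# Assumptions: the ASG uses Version='$Latest'; the 90% / 300s preferences
# are example values, not recommendations from the article.
roll_out_latest_template() (
    asg=$1
    aws autoscaling start-instance-refresh \
        --auto-scaling-group-name "$asg" \
        --preferences '{"MinHealthyPercentage": 90, "InstanceWarmup": 300}' \
        --query 'InstanceRefreshId' --output text
)

# Example:
# roll_out_latest_template myapp-asg
# Follow progress afterwards with:
# aws autoscaling describe-instance-refreshes --auto-scaling-group-name myapp-asg
```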

Common Mistakes to Avoid

Using instance types not eligible for Free Tier in your region

Why it happens: Free tier eligibility varies by region (t2.micro in us-east-1, t3.micro in sa-east-1)

How to avoid it:

# Verify eligible types in your region
aws ec2 describe-instance-types \
    --filters "Name=free-tier-eligible,Values=true" \
    --region your-region

User data with incompatible dependencies

Why it happens: Installing software that requires library versions not available in the OS (e.g., Node.js 18 on Amazon Linux 2 with glibc 2.26)

How to avoid it:

Use official OS repos when possible
Test user data on a standalone instance before adding it to the Launch Template
Check logs with aws ec2 get-console-output --instance-id i-xxx, or read /var/log/cloud-init-output.log on the instance

Grace period too short

Why it happens: The grace period expires before the application is ready, so the first health checks fail and the ASG terminates the instance while it is still starting

How to avoid it: Configure a grace period of at least 300 seconds (5 minutes) for typical applications. Increase if your app takes longer to start.
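
Rather than guessing, the grace period can be sized from a measurement. The sketch below polls a health URL until it answers 200 and prints the elapsed seconds. The function name `measure_startup_seconds`, the 5-second polling interval, and the example address are assumptions; it is meant to be run right after launching a standalone test instance.

```shell
# Measure how long an instance takes to answer its health check.
# Assumptions: started right after the instance launches; the URL is
# whatever address the test instance exposes (hypothetical in the example).
measure_startup_seconds() {
    url=$1
    max_wait=${2:-600}          # give up after this many seconds
    start=$(date +%s)
    while :; do
        # -s silent, -o discard body, -w print only the HTTP status code
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
        elapsed=$(( $(date +%s) - start ))
        if [ "$code" = "200" ]; then
            echo "$elapsed"
            return 0
        fi
        if [ "$elapsed" -ge "$max_wait" ]; then
            return 1
        fi
        sleep 5
    done
}

# Example:
# measure_startup_seconds "http://10.0.1.25/health" 900
```

Adding a safety margin to the measured value before using it as --health-check-grace-period leaves room for slower boots.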

Not separating security groups between LB and instances

Why it happens: Opening port 80 to the Internet (0.0.0.0/0) directly on instances

How to avoid it:

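# Instances: only accept HTTP from the LB's security group
# (sg-of-instances is a placeholder for the instances' security group)
aws ec2 authorize-security-group-ingress \
    --group-id sg-of-instances \
    --protocol tcp --port 80 \
    --source-group sg-of-load-balancer

# DON'T do this on instances:
# --cidr 0.0.0.0/0  # Insecure: opens the port to the whole Internet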

Forgetting to configure health check endpoint in the application

Why it happens: The LB does health checks to /health but the application doesn't have that endpoint

How to avoid it: Implement a /health endpoint that:

Returns 200 OK when everything is fine
Verifies critical dependencies (DB, cache)
Is lightweight (no expensive operations)
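
Once the endpoint is deployed, one way to confirm the load balancer actually sees it is to count the targets that are not in the healthy state. This is a minimal sketch; the function names are illustrative and the target group ARN is supplied by the caller.

```shell
# Count targets the load balancer does not currently report as healthy.
# Assumption: the caller passes the full target group ARN.
count_unhealthy_targets() (
    tg_arn=$1
    aws elbv2 describe-target-health \
        --target-group-arn "$tg_arn" \
        --query "length(TargetHealthDescriptions[?TargetHealth.State != 'healthy'])" \
        --output text
)

# Succeeds only when no registered target is failing its health check.
# Note: an empty target group also counts as "no unhealthy targets".
all_targets_healthy() (
    [ "$(count_unhealthy_targets "$1")" -eq 0 ]
)

# Example:
# all_targets_healthy "$TARGET_GROUP_ARN" && echo "all targets healthy"
```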

Creating infinite replacement cycles

Why it happens: Instances fail health checks → ASG terminates them → new instances also fail → infinite cycle

How to avoid it: Monitor ASG activities:

aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name myapp-asg \
    --max-records 5

If you see continuous replacements, investigate user data logs immediately.
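
The check can be scripted. The sketch below counts termination activities among the most recent records and flags a possible loop above a threshold. It assumes termination activities carry a Description starting with "Terminating", which you should verify against your own activity output; the function names and the threshold are illustrative.

```shell
# Count recent terminations in the ASG activity history.
# Assumption: termination activities have a Description that starts with
# "Terminating"; check your own describe-scaling-activities output.
count_recent_terminations() (
    asg=$1
    aws autoscaling describe-scaling-activities \
        --auto-scaling-group-name "$asg" \
        --max-records 20 \
        --query "length(Activities[?starts_with(Description, 'Terminating')])" \
        --output text
)

# Succeeds when the number of recent terminations reaches the threshold.
is_replacement_loop() (
    count=$(count_recent_terminations "$1")
    [ "$count" -ge "$2" ]
)

# Example:
# is_replacement_loop myapp-asg 5 && echo "Possible replacement loop"
```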

Hardcoding configuration in AMIs

Why it happens: Including production environment variables in the AMI, making it impossible to reuse for staging

How to avoid it:

AMI: only software and code
User data: dynamic configuration per environment
Secrets Manager/Parameter Store: credentials and secrets
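
As a sketch of that split, the user data script below keeps the AMI generic and resolves configuration at boot. The parameter and secret names (/myapp/..., myapp/...), the APP_ENV variable, and the output path are hypothetical; the instance profile is assumed to allow reading them.

```shell
#!/bin/bash
# User data sketch: the AMI stays generic, configuration is resolved at boot.
# Assumptions: APP_ENV differs per Launch Template; the parameter and secret
# names below are hypothetical and must exist in your account.
APP_ENV="${APP_ENV:-staging}"

fetch_parameter() {
    aws ssm get-parameter --name "$1" --with-decryption \
        --query 'Parameter.Value' --output text
}

fetch_secret() {
    aws secretsmanager get-secret-value --secret-id "$1" \
        --query 'SecretString' --output text
}

write_app_env() {
    # $1 = output file; values come from Parameter Store and Secrets Manager
    {
        echo "APP_ENV=$APP_ENV"
        echo "DB_HOST=$(fetch_parameter "/myapp/$APP_ENV/db-host")"
        echo "DB_PASSWORD=$(fetch_secret "myapp/$APP_ENV/db-password")"
    } > "$1"
    chmod 600 "$1"
}

# In real user data you would finish with something like:
# write_app_env /etc/myapp.env && systemctl restart myapp
```

The same AMI can then serve staging and production, because only APP_ENV changes between Launch Templates.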

Cost Considerations

What Generates Costs

Resource	Cost	Notes
Running EC2 instances	~$0.0104/hour (t3.micro in sa-east-1)	Multiplied by number of instances
Application Load Balancer	~$0.0225/hour fixed + ~$0.008/LCU-hour	LCU = highest of: new connections, active connections, processed bytes, rule evaluations
Data Transfer OUT	~$0.15/GB	First 100 GB/month free (aggregated across services)
Data Transfer between AZs	~$0.01/GB	Within the same region
Stored AMIs	~$0.05/GB-month	Associated EBS snapshots
Auto-Scaling service	FREE	You only pay for the resources it creates
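
For a quick feel of how these rates add up, the sketch below estimates a monthly bill from the approximate figures in the table. It assumes 730 hours per month and one ALB, and it ignores LCU, data transfer, and storage charges; the function name is illustrative.

```shell
# Rough monthly estimate from the approximate rates in the table above.
# Assumptions: 730 hours per month, one ALB, LCU/data transfer/storage ignored.
estimate_monthly_cost() (
    instances=$1
    awk -v n="$instances" 'BEGIN {
        hours = 730
        ec2 = n * hours * 0.0104     # t3.micro hourly rate from the table
        alb = hours * 0.0225         # ALB fixed hourly charge from the table
        printf "%.2f\n", ec2 + alb
    }'
)

estimate_monthly_cost 2   # prints 31.61
```

With two instances the fixed ALB charge is already about half of the total, which is why removing idle test Load Balancers matters.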

How to Optimize Costs

Scale efficiently:

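# Target Tracking instead of keeping idle instances: scales only when needed
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name myapp-asg \
    --policy-name target-tracking-cpu \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"}, "TargetValue": 70.0}'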

Scheduled Scaling for predictable loads:

# Reduce to minimum outside business hours (recurrence is cron syntax, UTC by default)
aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name myapp-asg \
    --scheduled-action-name scale-down-night \
    --recurrence "0 22 * * *" \
    --min-size 1 --desired-capacity 1
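
A scale-down action usually needs a counterpart that restores capacity before traffic returns. The sketch below adds a morning action; the action name `scale-up-morning`, the 08:00 schedule, and the function name are assumptions to adapt, and the time is interpreted as UTC unless --time-zone is passed.

```shell
# Restore capacity in the morning to complement a nightly scale-down.
# Assumptions: action name and 08:00 schedule are illustrative; the cron
# expression is evaluated in UTC unless --time-zone is added.
schedule_morning_scale_up() (
    asg=$1; min=$2; desired=$3
    aws autoscaling put-scheduled-update-group-action \
        --auto-scaling-group-name "$asg" \
        --scheduled-action-name scale-up-morning \
        --recurrence "0 8 * * *" \
        --min-size "$min" --desired-capacity "$desired"
)

# Example:
# schedule_morning_scale_up myapp-asg 2 2
```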

Delete unused resources:

Old AMIs and their snapshots
Testing Load Balancers
Development ASGs that aren't actively used

Reserved Instances or Savings Plans:

If you maintain predictable base load (e.g., min-size 2 always), consider Reserved Instances for up to 72% savings

Free Tier Limits

EC2 Free Tier (first 12 months):

750 hours/month of t2.micro (us-east-1) or t3.micro (other regions)
2 t3.micro instances running 24/7 = 1440 hours/month → exceeds free tier
1 instance 24/7 = within free tier
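
The arithmetic behind these lines, as a small sketch. It assumes a 30-day month with every instance running 24/7; the function names are illustrative.

```shell
# Instance-hours consumed per month versus the 750-hour allowance.
# Assumption: a 30-day month with every instance running 24/7.
FREE_TIER_HOURS=750

monthly_instance_hours() {
    echo $(( $1 * 24 * 30 ))
}

# Hours beyond the allowance (never negative).
billable_hours() (
    used=$(monthly_instance_hours "$1")
    over=$(( used - FREE_TIER_HOURS ))
    [ "$over" -lt 0 ] && over=0
    echo "$over"
)

monthly_instance_hours 2   # prints 1440
billable_hours 2           # prints 690
billable_hours 1           # prints 0
```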

ELB Free Tier:

Application Load Balancer: shares the 750 hours/month with Classic Load Balancer, plus 15 LCUs (first 12 months); usage beyond that is billed
Classic Load Balancer: 750 hours/month (shared with ALB) + 15GB processed (first 12 months)

Recommendation for learning:

Use 1-2 t3.micro instances (two stay within free tier only if their combined running time is under 750 hours/month)
Delete resources immediately after practicing
Consider alternatives without LB for labs (use direct IPs)

Integration with Other Services

Service	How it integrates	Typical use case
Route 53	DNS pointing to the Load Balancer	`www.myapp.com` → ALB DNS name with automatic failover
CloudWatch	Sends instance and ASG metrics	Monitor CPU and (with the CloudWatch agent) memory, create alarms, centralize logs
IAM	Instance Profiles assigned via Launch Template	Give instances permissions to access S3, DynamoDB, Secrets Manager
VPC	ASG launches instances in specific subnets	Multi-AZ deployment for high availability
RDS	Instances connect to shared DB	Stateless application that reads/writes to centralized database
ElastiCache	Instances use shared cache	User sessions, frequently queried data
S3	Instances upload/download files	User uploads, logs, static assets
Secrets Manager	User data retrieves secrets at startup	Database passwords, API keys without hardcoding
CloudFormation	Defines entire infrastructure as code	Deploy complete stack (VPC + ASG + LB + RDS) with one template
CodeDeploy	Automatic deploys to ASG instances	CI/CD: push code → CodeDeploy updates instances without downtime
SNS	ASG sends event notifications	Alerts when instance is created/terminated, when it scales
Lambda	Triggers on ASG events	Execute custom logic when instance starts (e.g., register in external service)