Amazon Auto Scaling Essentials
Auto-Scaling essentials, Launch Templates, ASG, Load Balancers, and best practices
Auto-Scaling is a set of AWS services that automatically adjusts compute capacity (number of EC2 instances) based on current demand, ensuring availability and optimizing costs. It solves the problem of having to manually manage infrastructure during traffic spikes or low-demand periods.
When to use it: Applications with variable traffic, systems requiring high availability, and architectures that need automatic horizontal scaling.
Alternatives: Manual vertical scaling (changing instance type), containers with ECS/EKS auto-scaling, serverless (Lambda).
Key Concepts
| Concept | Description |
|---|---|
| AMI (Amazon Machine Image) | Complete snapshot of an EC2 instance including OS, installed software, and code. Serves as a template to create identical instances |
| Launch Template | Reusable configuration that defines how to launch instances. Includes instance type, AMI, security groups, and user data. Supports versioning |
| Auto-Scaling Group (ASG) | Collection of EC2 instances treated as a logical unit. Automatically maintains the desired number of instances based on policies |
| Desired Capacity | Number of instances the ASG tries to keep active at all times |
| Min/Max Size | Lower and upper limits of instances the ASG can maintain |
| Scaling Policy | Rule that defines when and how much to scale (add/remove instances) |
| Target Tracking Policy | Policy that maintains a specific metric (e.g., CPU) at a target value. Scales automatically as needed |
| Step Scaling Policy | Policy with multiple "steps" that adds or removes more capacity the further the metric breaches the alarm threshold |
| Health Check | Periodic verification that determines if an instance is working correctly |
| Grace Period | Time the ASG waits before verifying health checks on new instances. Allows them to fully start up |
| Target Group | Logical collection of instances receiving traffic from a Load Balancer. Includes health check configuration |
| Elastic Load Balancer (ELB) | Distributes incoming traffic among multiple EC2 instances across different Availability Zones |
| Connection Draining | Period in which a terminating instance stops receiving new requests but completes existing ones. On ALB target groups this setting is called deregistration delay |
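
To make the Step Scaling Policy row more concrete, here is a minimal sketch of such a policy. The function name, the policy name, and the step bounds are illustrative assumptions, not values from this guide. The bounds are offsets from the threshold of a CloudWatch alarm that you would create separately and point at the policy ARN this call returns.

```shell
# Sketch: step scaling policy that adds more instances the further the
# metric rises above the alarm threshold.
# Assumptions: policy name and step bounds are illustrative only.
create_step_policy() {
  asg_name="$1"
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name "$asg_name" \
    --policy-name step-scaling-cpu \
    --policy-type StepScaling \
    --adjustment-type ChangeInCapacity \
    --metric-aggregation-type Average \
    --step-adjustments \
      MetricIntervalLowerBound=0,MetricIntervalUpperBound=15,ScalingAdjustment=1 \
      MetricIntervalLowerBound=15,ScalingAdjustment=3
}

# Usage: create_step_policy myapp-asg
```

With an alarm threshold of 70% CPU, these steps would add 1 instance between 70% and 85%, and 3 instances at 85% or above.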
Essential AWS CLI Commands
Creating Resources
# Create AMI from existing instance
aws ec2 create-image \
--instance-id i-1234567890abcdef0 \
--name "myapp-v1.0.0" \
--description "Production app with dependencies" \
--no-reboot # Avoids downtime, but file system integrity of the image is not guaranteed
# Create Launch Template
aws ec2 create-launch-template \
--launch-template-name myapp-template \
--version-description "Initial version" \
--launch-template-data '{
"ImageId": "ami-0abc123",
"InstanceType": "t3.micro",
"SecurityGroupIds": ["sg-12345"],
"UserData": "IyEvYmluL2Jhc2g...",
"IamInstanceProfile": {"Name": "ec2-role"},
"KeyName": "my-keypair"
}'
# Create new version of Launch Template
aws ec2 create-launch-template-version \
--launch-template-id lt-0abc123 \
--version-description "Updated to t3.small" \
--launch-template-data '{"InstanceType": "t3.small"}'
# Create Target Group
aws elbv2 create-target-group \
--name myapp-targets \
--protocol HTTP \
--port 80 \
--vpc-id vpc-12345 \
--health-check-path /health \
--health-check-interval-seconds 30 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3
# Create Application Load Balancer
aws elbv2 create-load-balancer \
--name myapp-lb \
--subnets subnet-abc123 subnet-def456 \
--security-groups sg-12345 \
--scheme internet-facing \
--type application
# Create Listener (connects LB with Target Group)
aws elbv2 create-listener \
--load-balancer-arn arn:aws:elasticloadbalancing:... \
--protocol HTTP \
--port 80 \
--default-actions Type=forward,TargetGroupArn=arn:aws:...
# Create Auto-Scaling Group
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name myapp-asg \
--launch-template LaunchTemplateId=lt-0abc123,Version='$Latest' \
--min-size 2 \
--max-size 10 \
--desired-capacity 2 \
--target-group-arns arn:aws:elasticloadbalancing:... \
--vpc-zone-identifier "subnet-abc,subnet-def" \
--health-check-type ELB \
--health-check-grace-period 300
# Create Target Tracking Scaling Policy
aws autoscaling put-scaling-policy \
--auto-scaling-group-name myapp-asg \
--policy-name target-tracking-cpu \
--policy-type TargetTrackingScaling \
--target-tracking-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ASGAverageCPUUtilization"
},
"TargetValue": 70.0
}'
Querying Resources
# View your own AMIs
aws ec2 describe-images --owners self
# View Launch Templates
aws ec2 describe-launch-templates
aws ec2 describe-launch-template-versions --launch-template-id lt-0abc123
# View Auto-Scaling Groups
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names myapp-asg
# View instances in the ASG
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names myapp-asg \
--query 'AutoScalingGroups[0].Instances[*].[InstanceId,LifecycleState,HealthStatus,AvailabilityZone]' \
--output table
# View ASG activities (troubleshooting)
aws autoscaling describe-scaling-activities \
--auto-scaling-group-name myapp-asg \
--max-records 10
# View target health
aws elbv2 describe-target-health \
--target-group-arn arn:aws:elasticloadbalancing:...
# View Load Balancers
aws elbv2 describe-load-balancers
Modifying Resources
# Update ASG (change capacity)
aws autoscaling update-auto-scaling-group \
--auto-scaling-group-name myapp-asg \
--min-size 3 \
--max-size 15 \
--desired-capacity 5
# Update ASG (change Launch Template)
aws autoscaling update-auto-scaling-group \
--auto-scaling-group-name myapp-asg \
--launch-template LaunchTemplateId=lt-0abc123,Version='$Latest'
# Manually change desired capacity
aws autoscaling set-desired-capacity \
--auto-scaling-group-name myapp-asg \
--desired-capacity 8
Deleting Resources
# Delete Auto-Scaling Group (terminates instances)
aws autoscaling delete-auto-scaling-group \
--auto-scaling-group-name myapp-asg \
--force-delete
# Delete Load Balancer
aws elbv2 delete-load-balancer \
--load-balancer-arn arn:aws:elasticloadbalancing:...
# Delete Target Group
aws elbv2 delete-target-group \
--target-group-arn arn:aws:elasticloadbalancing:...
# Delete Launch Template
aws ec2 delete-launch-template \
--launch-template-id lt-0abc123
# Deregister AMI
aws ec2 deregister-image --image-id ami-0abc123
Architecture and Flows
Complete Architecture Diagram
Sequence Diagram: Automatic Scaling
Health Check Flow
Best Practices Checklist
Security
- Instances behind the LB: Configure security groups so instances only accept traffic from the Load Balancer, not directly from the Internet
- IAM Instance Profiles: Assign IAM roles to instances instead of hardcoding credentials
- Secrets Management: Use AWS Secrets Manager or Parameter Store for sensitive variables, not user data
- Layered Security Groups: Separate SGs for LB (accepts Internet) and instances (accepts only LB)
- Secure Keypairs: Rotate keypairs regularly and use AWS Systems Manager Session Manager instead of direct SSH
Cost Optimization
- Target Tracking over Step Scaling: Adjusts capacity continuously to hold the metric at its target, so you tune fewer thresholds and carry less idle capacity
- Appropriate min size: Don't set min-size higher than necessary
- Scheduled Scaling: For predictable loads, scale up/down at known times
- Appropriate instance types: Use free tier (t3.micro in sa-east-1) when applicable
- Spot Instances: Consider mixing On-Demand with Spot for interruption-tolerant workloads
Performance
- Multi-AZ deployment: Distribute instances across at least 2 Availability Zones
- Health check interval: Not too frequent (consumes resources), typically 30s is adequate
- Sufficient grace period: Minimum 5 minutes for applications with slow startup
- Connection draining: Configure so in-progress requests complete before terminating instances
- Optimized AMIs: Pre-install everything possible in the AMI to reduce startup time
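
The connection draining item above could be configured as in the following sketch. On an Application Load Balancer the setting lives on the target group as the `deregistration_delay.timeout_seconds` attribute. The function name and the 60-second value in the usage line are assumptions for illustration.

```shell
# Sketch: set how long a deregistering target may finish in-flight requests.
# Assumption: the helper name and the example timeout are illustrative.
set_draining_timeout() {
  target_group_arn="$1"
  timeout_seconds="$2"
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$target_group_arn" \
    --attributes "Key=deregistration_delay.timeout_seconds,Value=$timeout_seconds"
}

# Usage: set_draining_timeout arn:aws:elasticloadbalancing:... 60
```

A shorter timeout speeds up scale-in, while a longer one gives slow requests more time to finish.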
Reliability
- Robust health checks: The /health endpoint should verify critical dependencies (DB, Redis, etc.)
- Desired capacity > 1: Always have at least 2 instances for fault tolerance
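- ELB health check type: EC2 status checks only detect instance-level failures, while ELB health checks also confirm the application is responding, which makes them the better choice for web applications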
- Unhealthy threshold: Configure to tolerate temporary failures (typically 3 consecutive failures)
- Metrics monitoring: CloudWatch alarms for unusual scaling or unhealthy instances
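
As one possible way to implement the metrics monitoring item, the sketch below creates a CloudWatch alarm on the target group's `UnHealthyHostCount` metric. The function name, alarm name, period, and evaluation count are assumptions. The two dimension values are the trailing parts of the target group and load balancer ARNs, which are elided in this guide and must be supplied by you.

```shell
# Sketch: alarm when any target stays unhealthy for 3 consecutive minutes.
# Assumptions: alarm name, period, and evaluation periods are illustrative.
create_unhealthy_host_alarm() {
  tg_dimension="$1"    # form: targetgroup/<name>/<id>
  lb_dimension="$2"    # form: app/<name>/<id>
  sns_topic_arn="$3"   # topic that receives the alert
  aws cloudwatch put-metric-alarm \
    --alarm-name myapp-unhealthy-hosts \
    --namespace AWS/ApplicationELB \
    --metric-name UnHealthyHostCount \
    --dimensions "Name=TargetGroup,Value=$tg_dimension" "Name=LoadBalancer,Value=$lb_dimension" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 3 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$sns_topic_arn"
}
```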
Operational Excellence
- Versioned Launch Templates: Use versions instead of modifying the existing template
- Consistent tagging: Tags to identify ASG instances, environment, project
- Centralized logs: Send application logs to CloudWatch Logs for troubleshooting
- Document user data: Comment user data scripts for future maintenance
- AMI testing: Validate AMIs before using them in production
Common Mistakes to Avoid
Using instance types not eligible for Free Tier in your region
Why it happens: Free tier eligibility varies by region (t2.micro in us-east-1, t3.micro in sa-east-1)
How to avoid it:
# Verify eligible types in your region
aws ec2 describe-instance-types \
--filters "Name=free-tier-eligible,Values=true" \
--region your-region
User data with incompatible dependencies
Why it happens: Installing software that requires library versions not available in the OS (e.g., Node.js 18 on Amazon Linux 2 with glibc 2.26)
How to avoid it:
- Use official OS repos when possible
- Test user data on a standalone instance before adding it to the Launch Template
- Check logs with aws ec2 get-console-output --instance-id i-xxx
Grace period too short
Why it happens: The grace period ends before the application has finished starting, so the instance fails its health check and the ASG terminates it
How to avoid it: Configure a grace period of at least 300 seconds (5 minutes) for typical applications. Increase if your app takes longer to start.
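
A minimal sketch of adjusting the grace period on an existing group follows. The function name is an assumption, and the 300-second value in the usage line mirrors the recommendation above.

```shell
# Sketch: change the health check grace period of an existing ASG.
# Assumption: the helper name is illustrative.
set_grace_period() {
  asg_name="$1"
  grace_seconds="$2"
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$asg_name" \
    --health-check-grace-period "$grace_seconds"
}

# Usage: set_grace_period myapp-asg 300
```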
Not separating security groups between LB and instances
Why it happens: Opening port 80 to the Internet (0.0.0.0/0) directly on instances
How to avoid it:
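# Instances: only accept HTTP from the LB's security group (sg-of-instances is a placeholder)
aws ec2 authorize-security-group-ingress \
--group-id sg-of-instances \
--protocol tcp --port 80 \
--source-group sg-of-load-balancer
# DON'T do this on instances:
# --cidr 0.0.0.0/0 # Insecure
Forgetting to configure health check endpoint in the application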
Why it happens: The LB does health checks to /health but the application doesn't have that endpoint
How to avoid it: Implement a /health endpoint that:
- Returns 200 OK when everything is fine
- Verifies critical dependencies (DB, cache)
- Is lightweight (no expensive operations)
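
Before relying on the load balancer, the endpoint can be probed by hand. The sketch below assumes `curl` is installed and uses a hypothetical function name. It treats anything other than a 200 response within 2 seconds as unhealthy, which roughly mirrors how a health check with a short timeout behaves.

```shell
# Sketch: probe a health endpoint the way a load balancer would.
# Returns 0 when the endpoint answers 200 within 2 seconds, 1 otherwise.
# Assumption: the helper name and the 2-second timeout are illustrative.
check_health() {
  url="$1"
  status=$(curl -s -o /dev/null -m 2 -w '%{http_code}' "$url") || status=000
  [ "$status" = "200" ]
}

# Usage: check_health http://localhost/health && echo healthy
```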
Creating infinite replacement cycles
Why it happens: Instances fail health checks → ASG terminates them → new instances also fail → infinite cycle
How to avoid it: Monitor ASG activities:
aws autoscaling describe-scaling-activities --auto-scaling-group-name myapp-asg --max-records 5
If you see continuous replacements, investigate user data logs immediately.
Hardcoding configuration in AMIs
Why it happens: Including production environment variables in the AMI, making it impossible to reuse for staging
How to avoid it:
- AMI: only software and code
- User data: dynamic configuration per environment
- Secrets Manager/Parameter Store: credentials and secrets
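
The split above might look like this in a user data script. It is a sketch under assumptions: the parameter path, the `APP_ENV` variable, and the function name are hypothetical, and the instance profile must allow `ssm:GetParameter` on the parameter.

```shell
#!/bin/bash
# User data sketch: read a secret from Parameter Store at boot instead of
# baking it into the AMI.
# Assumptions: parameter names and APP_ENV are hypothetical placeholders.
fetch_parameter() {
  aws ssm get-parameter \
    --name "$1" \
    --with-decryption \
    --query 'Parameter.Value' \
    --output text
}

# In a real script you might then run, for example:
#   export DB_PASSWORD="$(fetch_parameter "/myapp/${APP_ENV}/db_password")"
```

Because the environment name is only part of the parameter path, the same AMI can serve staging and production.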
Not testing the Load Balancer before adding ASG
Why it happens: Networking/security group issues aren't detected until everything is configured
How to avoid it: Create a manual instance, register it in the Target Group, verify health checks, THEN create the ASG.
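
That manual check could be scripted roughly as follows. The function name is an assumption. The target group ARN and instance ID are whatever you created by hand.

```shell
# Sketch: register a hand-launched instance, block until the target group
# reports it healthy, then print its state.
# Assumption: the helper name is illustrative.
verify_target_group() {
  target_group_arn="$1"
  instance_id="$2"
  aws elbv2 register-targets \
    --target-group-arn "$target_group_arn" \
    --targets "Id=$instance_id" &&
  aws elbv2 wait target-in-service \
    --target-group-arn "$target_group_arn" \
    --targets "Id=$instance_id" &&
  aws elbv2 describe-target-health \
    --target-group-arn "$target_group_arn" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State]' \
    --output text
}

# Usage: verify_target_group arn:aws:elasticloadbalancing:... i-1234567890abcdef0
```

If the wait step times out, the usual suspects are the instance security group, the health check path, or the application port.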
Cost Considerations
What Generates Costs
| Resource | Cost | Notes |
|---|---|---|
| Running EC2 instances | ~$0.0104/hour (t3.micro On-Demand in us-east-1; sa-east-1 is higher, ~$0.0168/hour) | Multiplied by number of instances |
| Application Load Balancer | ~$0.008/LCU-hour, plus a fixed hourly charge | LCU = highest of: new connections, active connections, processed bytes, rule evaluations |
| Data Transfer OUT | ~$0.15/GB | First 100 GB free/month, aggregated across regions |
| Data Transfer between AZs | ~$0.01/GB | Within the same region |
| Stored AMIs | ~$0.05/GB-month | Associated EBS snapshots |
| Auto-Scaling service | FREE | You only pay for the resources it creates |
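
For a rough feel of the EC2 line in the table, the sketch below multiplies instance count, hourly rate, and hours. It assumes a 30-day month (720 hours) and `awk` for the decimal arithmetic. The function name is hypothetical, and load balancer and data transfer charges are deliberately left out.

```shell
# Sketch: estimate the monthly EC2 cost of an ASG running at a fixed size.
# Usage: monthly_ec2_cost INSTANCES HOURLY_RATE [HOURS_PER_MONTH]
# Assumption: 720 hours per month unless a third argument is given.
monthly_ec2_cost() {
  awk -v n="$1" -v rate="$2" -v hours="${3:-720}" \
    'BEGIN { printf "%.2f\n", n * rate * hours }'
}

monthly_ec2_cost 2 0.0104   # two instances at the t3.micro rate: prints 14.98
```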
How to Optimize Costs
Scale efficiently:
# Target Tracking instead of keeping idle instances
# Scales only when needed
--target-tracking-configuration TargetValue=70.0
Scheduled Scaling for predictable loads:
# Reduce to minimum outside business hours (the cron expression is evaluated in UTC unless --time-zone is set)
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name myapp-asg \
--scheduled-action-name scale-down-night \
--recurrence "0 22 * * *" \
--min-size 1 --desired-capacity 1
Delete unused resources:
- Old AMIs and their snapshots
- Testing Load Balancers
- Development ASGs that aren't actively used
Reserved Instances or Savings Plans:
- If you maintain a predictable base load (e.g., min-size 2 always), consider Reserved Instances or Savings Plans for up to 72% savings
Free Tier Limits
EC2 Free Tier (first 12 months):
- 750 hours/month of t2.micro (us-east-1) or t3.micro (other regions)
- 2 t3.micro instances running 24/7 = 1440 hours/month → exceeds free tier
- 1 instance 24/7 = within free tier
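
The arithmetic behind these numbers can be checked with a small sketch. The function names are hypothetical. A 30-day month is assumed by default, which matches the 1440-hour figure above.

```shell
# Sketch: instance-hours consumed per month versus the 750-hour allowance.
# Usage: monthly_instance_hours INSTANCES [DAYS_IN_MONTH]
monthly_instance_hours() {
  echo $(( $1 * 24 * ${2:-30} ))
}

# Succeeds (exit 0) when the given number of 24/7 instances fits in 750 hours.
within_free_tier() {
  [ "$(monthly_instance_hours "$1" "${2:-30}")" -le 750 ]
}

monthly_instance_hours 2   # prints 1440
monthly_instance_hours 1   # prints 720
```

Even in a 31-day month a single instance uses 744 hours, so it still fits within the allowance.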
ELB Free Tier:
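- Application Load Balancer: 750 hours/month (shared with Classic Load Balancer) + 15 LCUs (first 12 months); usage beyond that generates cost
- Classic Load Balancer: 750 hours/month (shared with Application Load Balancer) + 15GB processed (first 12 months)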
Recommendation for learning:
- Use 1-2 t3.micro instances and keep their combined runtime under 750 hours/month (two instances running 24/7 exceed it)
- Delete resources immediately after practicing
- Consider alternatives without LB for labs (use direct IPs)
Integration with Other Services
| Service | How it integrates | Typical use case |
|---|---|---|
| Route53 | DNS pointing to the Load Balancer | www.myapp.com → ALB DNS name with automatic failover |
| CloudWatch | Sends instance and ASG metrics | Monitor CPU (memory metrics require the CloudWatch agent), create alarms, centralized logs |
| IAM | Instance Profiles assigned via Launch Template | Give instances permissions to access S3, DynamoDB, Secrets Manager |
| VPC | ASG launches instances in specific subnets | Multi-AZ deployment for high availability |
| RDS | Instances connect to shared DB | Stateless application that reads/writes to centralized database |
| ElastiCache | Instances use shared cache | User sessions, frequently queried cache |
| S3 | Instances upload/download files | User uploads, logs, static assets |
| Secrets Manager | User data retrieves secrets at startup | Database passwords, API keys without hardcoding |
| CloudFormation | Defines entire infrastructure as code | Deploy complete stack (VPC + ASG + LB + RDS) with one template |
| CodeDeploy | Automatic deploys to ASG instances | CI/CD: push code → CodeDeploy updates instances without downtime |
| SNS | ASG sends event notifications | Alerts when instance is created/terminated, when it scales |
| Lambda | Triggers on ASG events | Execute custom logic when instance starts (e.g., register in external service) |
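
As an example of the SNS row, the sketch below subscribes a topic to launch and termination events of a group. The function name is an assumption, and the topic ARN must point to a topic you have already created.

```shell
# Sketch: send ASG launch/terminate events to an SNS topic.
# Assumption: the helper name is illustrative.
enable_asg_notifications() {
  asg_name="$1"
  topic_arn="$2"
  aws autoscaling put-notification-configuration \
    --auto-scaling-group-name "$asg_name" \
    --topic-arn "$topic_arn" \
    --notification-types \
      "autoscaling:EC2_INSTANCE_LAUNCH" \
      "autoscaling:EC2_INSTANCE_TERMINATE" \
      "autoscaling:EC2_INSTANCE_LAUNCH_ERROR"
}

# Usage: enable_asg_notifications myapp-asg arn:aws:sns:...
```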
Additional Resources
Relevant Whitepapers
- AWS Well-Architected Framework - Reliability Pillar
- Overview of Amazon Web Services - Auto Scaling
- Best Practices for Amazon EC2