Amazon Auto Scaling: Architecting Elastic Infrastructure on AWS

Auto Scaling with Launch Templates, ASGs, Elastic Load Balancers, scaling policies, and best practices

@geomena · Thu Jul 10 2025 · #aws-roadmap #auto-scaling #infrastructure

Modern production workloads rarely sustain a uniform traffic profile. Demand fluctuates throughout the day, surges unpredictably during promotional events, and contracts to near zero during off-peak hours. Amazon EC2 Auto Scaling addresses this operational challenge by adjusting compute capacity (the number of running EC2 instances) in response to demand signals such as CloudWatch metrics. Rather than provisioning for peak capacity and absorbing waste, or under-provisioning and risking a degraded user experience, Auto Scaling keeps capacity close to what your application requires at any given moment, within the minimum and maximum bounds you set.

When to deploy Auto Scaling: Applications exhibiting variable traffic patterns, systems requiring high availability guarantees, and architectures demanding automatic horizontal scaling all benefit from this service.

Alternatives worth evaluating: Manual vertical scaling via instance type changes, container-based auto-scaling through ECS or EKS, and serverless compute via AWS Lambda each address overlapping but distinct use cases.

Key Concepts

| Concept | Description |
| --- | --- |
| AMI | A complete snapshot of an EC2 instance (operating system, installed software, and application code) that serves as an immutable template for launching identical instances |
| Launch Template | A reusable, versioned configuration that defines how to launch instances, including instance type, AMI, security groups, user data, and IAM profiles |
| Auto Scaling Group | A logical collection of EC2 instances managed as a single unit, automatically maintaining the desired instance count based on defined policies |
| Desired Capacity | The number of instances the ASG tries to maintain; scaling policies adjust it between the minimum and maximum size |
| Min/Max Size | The lower and upper bounds constraining the number of instances the ASG can provision |
| Scaling Policy | A rule defining when and by how much the ASG should scale, adding or removing instances |
| Target Tracking Policy | A policy that maintains a specific metric at a designated target value, scaling automatically as conditions shift |
| Step Scaling Policy | A policy with multiple thresholds that responds proportionally based on the severity of metric deviation |
| Health Check | A periodic verification determining whether an instance is functioning correctly |
| Grace Period | The interval the ASG waits before evaluating health checks on newly launched instances, allowing sufficient time for full initialization |
| Target Group | A logical collection of instances receiving traffic from a Load Balancer, with its own health check configuration |
| Elastic Load Balancer | A managed service distributing incoming traffic across multiple EC2 instances spanning different Availability Zones |
| Connection Draining | The interval during which a terminating instance stops receiving new requests while completing in-flight connections (called deregistration delay on Application and Network Load Balancers) |
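
To make the Step Scaling Policy entry more concrete, here is a minimal sketch in plain shell of how a stepped response might be chosen from the size of a metric breach. The 70% alarm threshold, the step boundaries, and the function name `step_adjustment` are illustrative assumptions, not values from AWS or from this article.

```shell
# Illustrative only: maps how far a metric exceeds an alarm threshold
# to a number of instances to add. All thresholds are assumed values.
ALARM_THRESHOLD=70   # hypothetical CPU alarm threshold (percent)

step_adjustment() {
    metric=$1
    breach=$(( metric - ALARM_THRESHOLD ))
    if [ "$breach" -lt 0 ]; then
        echo 0   # below the threshold: no change
    elif [ "$breach" -lt 10 ]; then
        echo 1   # small breach: add 1 instance
    elif [ "$breach" -lt 20 ]; then
        echo 2   # medium breach: add 2 instances
    else
        echo 4   # large breach: add 4 instances
    fi
}

for cpu in 65 75 85 95; do
    echo "CPU ${cpu}% -> add $(step_adjustment "$cpu") instance(s)"
done
```

The larger the deviation, the larger the adjustment, which is the behavior the table describes; a real policy expresses the same idea through step adjustments attached to a CloudWatch alarm.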

Essential AWS CLI Commands

# Create AMI from existing instance
aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "myapp-v1.0.0" \
    --description "Production app with dependencies" \
    --no-reboot
# Create Launch Template
aws ec2 create-launch-template \
    --launch-template-name myapp-template \
    --version-description "Initial version" \
    --launch-template-data '{
        "ImageId": "ami-0abc123",
        "InstanceType": "t3.micro",
        "SecurityGroupIds": ["sg-12345"],
        "UserData": "IyEvYmluL2Jhc2g...",
        "IamInstanceProfile": {"Name": "ec2-role"},
        "KeyName": "my-keypair"
    }'
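
The `UserData` value in the launch template data is base64-encoded text; the visible prefix `IyEvYmluL2Jhc2g` decodes to `#!/bin/bash`. The sketch below shows one way such a value could be produced. The script contents and the helper name `encode_user_data` are assumptions for illustration, since the article's real script is elided.

```shell
# Hypothetical bootstrap script; the real script in the article is elided.
user_data_script='#!/bin/bash
echo "bootstrapped" > /tmp/bootstrap.log'

# Launch template data expects UserData as base64 text on a single line,
# so strip the line wraps that some base64 implementations insert.
encode_user_data() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

encoded=$(encode_user_data "$user_data_script")
echo "$encoded"
```

The printed string can then be pasted into the `"UserData"` field of `--launch-template-data`.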
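# Create new version of Launch Template
# --source-version makes the new version inherit all other settings from version 1;
# without it, the new version would contain only the instance type
aws ec2 create-launch-template-version \
    --launch-template-id lt-0abc123 \
    --source-version 1 \
    --version-description "Updated to t3.small" \
    --launch-template-data '{"InstanceType": "t3.small"}'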
# Create Target Group
aws elbv2 create-target-group \
    --name myapp-targets \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-12345 \
    --health-check-path /health \
    --health-check-interval-seconds 30 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 3
# Create Application Load Balancer
aws elbv2 create-load-balancer \
    --name myapp-lb \
    --subnets subnet-abc123 subnet-def456 \
    --security-groups sg-12345 \
    --scheme internet-facing \
    --type application
# Create Listener -- connects LB with Target Group
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:... \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:...
# Create Auto Scaling Group
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest' \
    --min-size 2 \
    --max-size 10 \
    --desired-capacity 2 \
    --target-group-arns arn:aws:elasticloadbalancing:... \
    --vpc-zone-identifier "subnet-abc,subnet-def" \
    --health-check-type ELB \
    --health-check-grace-period 300
# Create Target Tracking Scaling Policy
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name myapp-asg \
    --policy-name target-tracking-cpu \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0
    }'
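
For intuition about the 70% CPU target, target tracking changes capacity roughly in proportion to how far the metric sits from its target. The sketch below is a simplified model under that assumption, not the service's actual algorithm; the function name `estimate_capacity` and the sample values are hypothetical.

```shell
# Simplified model (an assumption, not the service's real algorithm):
# desired = ceil(current * metric / target), clamped to [min, max].
estimate_capacity() {
    current=$1; metric=$2; target=$3; min=$4; max=$5
    # Integer ceiling division
    desired=$(( (current * metric + target - 1) / target ))
    if [ "$desired" -lt "$min" ]; then desired=$min; fi
    if [ "$desired" -gt "$max" ]; then desired=$max; fi
    echo "$desired"
}

# 4 instances averaging 90% CPU against the 70% target, bounds 2..10
estimate_capacity 4 90 70 2 10
# 4 instances averaging 20% CPU against the same target and bounds
estimate_capacity 4 20 70 2 10
```

The clamp mirrors the role of `--min-size` and `--max-size`: no policy can push the group outside those bounds.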
# View your own AMIs
aws ec2 describe-images --owners self
# View Launch Templates
aws ec2 describe-launch-templates
# View Launch Template versions
aws ec2 describe-launch-template-versions --launch-template-id lt-0abc123
# View Auto Scaling Groups
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg
# View instances in the ASG
aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names myapp-asg \
    --query 'AutoScalingGroups[0].Instances[*].[InstanceId,LifecycleState,HealthStatus,AvailabilityZone]' \
    --output table
# View ASG activities for troubleshooting
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name myapp-asg \
    --max-records 10
# View target health
aws elbv2 describe-target-health \
    --target-group-arn arn:aws:elasticloadbalancing:...
# View Load Balancers
aws elbv2 describe-load-balancers
# Update ASG capacity limits
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --min-size 3 \
    --max-size 15 \
    --desired-capacity 5
# Update ASG Launch Template
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --launch-template LaunchTemplateId=lt-0abc123,Version='$Latest'
# Manually set desired capacity
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name myapp-asg \
    --desired-capacity 8
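
A desired capacity outside the group's min/max bounds is rejected, so a quick local check before calling `update-auto-scaling-group` or `set-desired-capacity` can save a failed request. The helper name `validate_capacity` is hypothetical; this is a sketch, not part of the AWS CLI.

```shell
# Hypothetical pre-flight check: min <= desired <= max must hold.
validate_capacity() {
    min=$1; max=$2; desired=$3
    if [ "$min" -gt "$max" ]; then
        echo "invalid: min-size $min exceeds max-size $max"
        return 1
    fi
    if [ "$desired" -lt "$min" ] || [ "$desired" -gt "$max" ]; then
        echo "invalid: desired-capacity $desired is outside $min..$max"
        return 1
    fi
    echo "ok: $min <= $desired <= $max"
}

validate_capacity 3 15 5            # the capacity update shown above
validate_capacity 3 15 20 || true   # desired above max: reported as invalid
```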

Deleting an Auto Scaling Group with the --force-delete flag will terminate all managed instances immediately. Ensure you have verified that no critical traffic is being served before proceeding.

# Delete Auto Scaling Group -- terminates all instances
aws autoscaling delete-auto-scaling-group \
    --auto-scaling-group-name myapp-asg \
    --force-delete
# Delete Load Balancer
aws elbv2 delete-load-balancer \
    --load-balancer-arn arn:aws:elasticloadbalancing:...
# Delete Target Group
aws elbv2 delete-target-group \
    --target-group-arn arn:aws:elasticloadbalancing:...
# Delete Launch Template
aws ec2 delete-launch-template \
    --launch-template-id lt-0abc123
# Deregister AMI (its EBS snapshots must be deleted separately)
aws ec2 deregister-image --image-id ami-0abc123

Architecture

Complete Infrastructure Overview

Automatic Scaling Lifecycle

Health Check Verification Flow

Best Practices

Security

Never expose EC2 instances directly to the internet. All inbound traffic must route through the Load Balancer, with instance security groups configured to accept connections exclusively from the Load Balancer's security group.

  • Instances behind the Load Balancer: Configure security groups so instances only accept traffic from the Load Balancer, never directly from the internet
  • IAM Instance Profiles: Assign IAM roles to instances through instance profiles; never hardcode credentials
  • Secrets Management: Use AWS Secrets Manager or Parameter Store for sensitive values, and never embed them in user data
  • Layered Security Groups: Maintain separate security groups for the Load Balancer, which accepts internet traffic, and for instances, which accept only Load Balancer traffic
  • Secure Key Pairs: Rotate key pairs on a regular cadence and prefer AWS Systems Manager Session Manager over direct SSH access
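
The layered security group rule described above can be expressed with `aws ec2 authorize-security-group-ingress`, using the Load Balancer's security group as the traffic source. The group IDs and port below are placeholders (assumptions), and the sketch only prints the command for review rather than executing it.

```shell
# Placeholder IDs (assumptions): replace with your own security groups.
LB_SG="sg-0lbexample"
APP_SG="sg-0appexample"
APP_PORT=80

# Build the command that lets instances accept traffic only from the
# Load Balancer's security group. Printed for review, not executed.
build_ingress_cmd() {
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --source-group %s\n' \
        "$APP_SG" "$APP_PORT" "$LB_SG"
}

build_ingress_cmd
```

Because the rule names a source security group instead of a CIDR range, instances stay unreachable from the internet even if they have public IP addresses.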

Cost Optimization

  • Target Tracking over Step Scaling: Target Tracking policies adjust capacity continuously toward a metric target and need less hand-tuning than step thresholds, which helps avoid over-provisioning
  • Appropriate minimum size: Avoid setting min-size higher than operational necessity demands
  • Scheduled Scaling: For workloads with predictable traffic patterns, define schedules to scale capacity up and down at known intervals
  • Appropriate instance types: Select free-tier-eligible instance types when applicable — t3.micro in sa-east-1, for example
  • Spot Instances: Consider blending On-Demand with Spot Instances for workloads tolerant of interruption

Performance

  • Multi-AZ deployment: Distribute instances across a minimum of two Availability Zones to ensure resilience
  • Health check interval: A 30-second interval strikes an effective balance between responsiveness and resource consumption
  • Sufficient grace period: Allow at least 5 minutes for applications with non-trivial startup sequences
  • Connection draining: Ensure in-progress requests complete gracefully before the ASG terminates an instance
  • Optimized AMIs: Pre-install all dependencies and application code in the AMI to minimize instance startup duration

Reliability

The /health endpoint should verify all critical downstream dependencies — database connectivity, cache availability, and any essential external services. A superficial health check that merely returns 200 OK without validation provides a dangerously false sense of confidence.
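
A dependency-aware health check can be sketched as a function that runs each dependency check in turn and reports unhealthy as soon as one fails. The check functions below are stubs (assumptions); real versions would probe the database, cache, and external services, and the names `check_database`, `check_cache`, and `health_status` are hypothetical.

```shell
# Stub checks (assumptions): real versions would probe each dependency
# and return non-zero on failure.
check_database() { return 0; }
check_cache()    { return 0; }

# Run each named check in order; report 503 at the first failure,
# or 200 when every check passes.
health_status() {
    for check in "$@"; do
        if ! "$check"; then
            echo "503 $check failed"
            return 1
        fi
    done
    echo "200 OK"
}

health_status check_database check_cache
```

Keeping each check fast matters here, because the endpoint is called on every health check interval for every instance.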

  • Desired capacity above one: Always maintain at least two instances to guarantee fault tolerance
  • ELB health check type: EC2 status checks only detect instance-level failures, whereas ELB health checks also probe the application endpoint, which makes them the better choice for web-facing applications
  • Unhealthy threshold: Configure a threshold of at least three consecutive failures to tolerate transient issues without premature instance termination
  • Metrics monitoring: Establish CloudWatch alarms for anomalous scaling activity and unhealthy instance counts
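
With the values used in the target group command earlier in this article (30-second interval, unhealthy threshold of 3, healthy threshold of 2), the time for a target to change state is simple arithmetic. This sketch ignores check timeouts and the ASG grace period, so treat the results as rough lower bounds.

```shell
INTERVAL=30             # --health-check-interval-seconds
UNHEALTHY_THRESHOLD=3   # --unhealthy-threshold-count
HEALTHY_THRESHOLD=2     # --healthy-threshold-count

time_to_unhealthy=$(( INTERVAL * UNHEALTHY_THRESHOLD ))
time_to_healthy=$(( INTERVAL * HEALTHY_THRESHOLD ))

echo "Marked unhealthy after about ${time_to_unhealthy}s of consecutive failures"
echo "Marked healthy after about ${time_to_healthy}s of consecutive successes"
```

Raising the unhealthy threshold tolerates more transient failures but lengthens the window in which a broken instance keeps receiving traffic.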

Operational Excellence

  • Versioned Launch Templates: Create new versions rather than modifying existing templates, preserving a complete audit trail
  • Consistent tagging: Apply tags to identify ASG instances by environment, project, and ownership
  • Centralized logging: Route application logs to CloudWatch Logs for streamlined troubleshooting across the fleet
  • Documented user data: Comment user data scripts thoroughly to support future maintenance
  • AMI validation: Test AMIs in isolation before deploying them to production Auto Scaling Groups

Common Mistakes

Cost Considerations

Resource Pricing Overview

| Resource | Approximate Cost | Notes |
| --- | --- | --- |
| Running EC2 instances | ~$0.0104/hour for t3.micro (this matches the us-east-1 On-Demand Linux rate; sa-east-1 is priced higher, so check the pricing page for your Region) | Multiplied by the number of active instances |
| Application Load Balancer | ~$0.0225/hour fixed + ~$0.008/LCU-hour | LCU is determined by the highest of: new connections, active connections, processed bytes, rule evaluations |
| Data Transfer OUT | ~$0.15/GB | First 100 GB per month free, aggregated across services |
| Data Transfer between AZs | ~$0.01/GB | Within the same region only |
| Stored AMIs | ~$0.05/GB-month | Billed through the associated EBS snapshots |
| Auto Scaling service | Free | You pay only for the resources it provisions |
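
As a rough worked example, the fixed hourly rates from the table can be turned into a monthly estimate. The 730-hour month and the function name `monthly_cost` are assumptions, and the figure excludes LCU, data transfer, and storage charges, so real bills will differ.

```shell
# Approximate rates taken from the table above (USD).
EC2_RATE=0.0104       # per instance-hour, t3.micro
ALB_RATE=0.0225       # per hour, fixed ALB charge
HOURS_PER_MONTH=730   # assumption: average hours in a month

# Monthly cost of N instances plus one ALB, fixed hourly charges only.
monthly_cost() {
    awk -v n="$1" -v ec2="$EC2_RATE" -v alb="$ALB_RATE" -v h="$HOURS_PER_MONTH" \
        'BEGIN { printf "%.2f\n", (n * ec2 + alb) * h }'
}

echo "2 instances + ALB: \$$(monthly_cost 2) per month"
echo "4 instances + ALB: \$$(monthly_cost 4) per month"
```

At small instance counts the Load Balancer's fixed charge is comparable to the compute cost itself, which is why tearing down test Load Balancers matters.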

Cost Optimization Strategies

Scale with precision using Target Tracking:

# Target Tracking -- scales only when necessary
# (fragment of the put-scaling-policy command; a metric specification is also required)
--target-tracking-configuration TargetValue=70.0

Implement Scheduled Scaling for predictable workloads:

# Reduce to minimum outside business hours
# (the cron expression is evaluated in UTC unless --time-zone is set)
aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name myapp-asg \
    --scheduled-action-name scale-down-night \
    --recurrence "0 22 * * *" \
    --min-size 1 --desired-capacity 1

Eliminate unused resources aggressively: Deregister old AMIs and delete their associated snapshots. Tear down Load Balancers created for testing. Terminate development ASGs that are no longer actively used.

Leverage Reserved Instances or Savings Plans: If you maintain a predictable base load (for instance, a minimum of two instances running continuously), Reserved Instances or Savings Plans can yield savings of up to 72% compared with On-Demand pricing, depending on term and payment option.

Free Tier Limits

| Tier | Allowance | Key Constraint |
| --- | --- | --- |
| EC2 Free Tier | 750 hours/month of t2.micro or t3.micro, depending on the Region | Two t3.micro instances running 24/7 consume 1,440 hours/month, exceeding the free tier. One instance running 24/7 remains within the allowance. |
| ELB Free Tier | 750 hours/month shared between Classic and Application Load Balancers, plus 15 GB of data processing for Classic Load Balancers and 15 LCUs for Application Load Balancers | An Application Load Balancer generates cost once the allowance is used up or the 12-month free tier period ends |
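
The EC2 arithmetic in the table can be checked with a few lines of shell. This sketch assumes a 30-day month and instances running around the clock; the function name `free_tier_check` is hypothetical.

```shell
FREE_TIER_HOURS=750
HOURS_PER_MONTH=$(( 24 * 30 ))   # 720, using a 30-day month

# Compare the instance-hours consumed by N always-on instances
# with the monthly free tier allowance.
free_tier_check() {
    used=$(( $1 * HOURS_PER_MONTH ))
    if [ "$used" -le "$FREE_TIER_HOURS" ]; then
        echo "$1 instance(s): $used h, within the free tier"
    else
        echo "$1 instance(s): $used h, exceeds the free tier by $(( used - FREE_TIER_HOURS )) h"
    fi
}

free_tier_check 1
free_tier_check 2
```

The allowance is counted in instance-hours, so an ASG with a minimum size of 2 exhausts it roughly halfway through the month.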

For learning and experimentation, stay within the free tier allowance: one t3.micro running continuously, or two only for short sessions. Delete all resources immediately after each practice session. Consider testing without a Load Balancer, using direct instance IPs, to avoid load balancer charges entirely.

Integration with Other Services

| Service | Integration Mechanism | Typical Use Case |
| --- | --- | --- |
| Route 53 | DNS records pointing to the Load Balancer | Map www.myapp.com to the ALB DNS name with automatic failover |
| CloudWatch | Collects instance and ASG metrics | Monitor CPU utilization (memory metrics require the CloudWatch agent), create alarms, centralize logs |
| IAM | Instance Profiles assigned via Launch Template | Grant instances permissions to access S3, DynamoDB, Secrets Manager |
| VPC | ASG launches instances in designated subnets | Multi-AZ deployment for high availability |
| RDS | Instances connect to a shared database | Stateless applications reading from and writing to a centralized relational database |
| ElastiCache | Instances leverage a shared cache layer | Session storage, frequently queried data caching |
| S3 | Instances upload and download objects | User uploads, application logs, static asset storage |
| Secrets Manager | User data retrieves secrets at startup | Database passwords and API keys without hardcoding |
| CloudFormation | Defines entire infrastructure as code | Deploy a complete stack (VPC, ASG, Load Balancer, RDS) with a single template |
| CodeDeploy | Automated deployments to ASG instances | CI/CD pipelines: push code, and CodeDeploy updates instances without downtime |
| SNS | ASG publishes event notifications | Alerts when instances are created, terminated, or when scaling events occur |
| Lambda | Triggers on ASG lifecycle events | Execute custom logic when an instance launches, for example registering it in an external service |
