Amazon RDS Essentials | Geovanni Mena

RDS essentials, storage types, backups, Multi-AZ, Read Replicas, and best practices


Amazon RDS (Relational Database Service) is a managed relational database service that automates administrative tasks like provisioning, patching, backups, recovery, and scaling. It supports multiple engines: PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.

The problem it solves: It eliminates the operational burden of managing databases - no more manual installation, backup configuration, infrastructure monitoring, OS/DB patching, or failover management. You focus on your schema and queries, AWS handles everything else.

When to use it: For any production application that needs a relational database. Compared with a self-managed database on EC2, RDS typically saves 10-20 hours/month of administration and includes automatic backups and point-in-time recovery. Alternatives: Aurora (more AWS-native features; AWS advertises up to 5x the throughput of standard MySQL), self-managed on EC2 (only if you need very specific configurations that RDS doesn't support), DynamoDB (if a NoSQL data model fits your workload better).

Key Concepts

Concept	Description
DB Instance	Managed database server with compute, storage, and networking. It's the main RDS component - similar to an EC2 specialized for databases
Instance Class	Compute type for your DB (CPU, RAM, network). Families: T (burstable, dev/test), M (general purpose, balanced production), R (memory optimized, analytics), X (extra memory, in-memory workloads)
Storage Type	Disk type for persistence. gp3 (General Purpose SSD) for 90% of cases, io2 (Provisioned IOPS) for I/O-intensive workloads with strict SLA, Magnetic (legacy, avoid)
IOPS	Input/Output Operations Per Second - disk performance measure. gp3 baseline = 3,000 IOPS, configurable up to 16,000 (exact limits depend on engine and allocated storage size). io2 up to 256,000 IOPS
Allocated Storage	Disk space assigned to the DB. Minimum 20 GB (gp3), maximum 64 TB. Can grow automatically with storage autoscaling
Automated Backup	Daily full snapshot + continuous transaction logs every 5 minutes. Enables point-in-time recovery within the retention period (1-35 days)
Backup Retention Period	Days RDS retains automated backups. Default 7 days when the instance is created from the console (1 day via the CLI/API), maximum 35 days. Backups free up to DB size, then $0.095/GB-month
Point-in-Time Recovery (PITR)	Ability to restore DB to any specific second within the retention period. Useful for recovery from human errors (DROP TABLE, DELETE without WHERE)
Manual Snapshot	Explicit backup you create. Persists indefinitely (until you delete it), survives if you delete the DB instance. You pay $0.095/GB-month for storage
DB Endpoint	DNS hostname to connect to the DB (e.g., `mydb.abc123.sa-east-1.rds.amazonaws.com`). Doesn't change during instance lifetime (except for rename)
Master Username/Password	DB administrator user credentials. Configurable at creation, modifiable afterward via CLI/Console
DB Subnet Group	Set of subnets where RDS can launch DB instances. Minimum 2 subnets in different AZs (for Multi-AZ support)
Security Group	Firewall controlling which IPs/security groups can connect to your DB. Typically only allow connections from your app servers (EC2/ECS/Lambda)
Multi-AZ Deployment	High availability configuration with automatic standby replica in another AZ. Automatic failover in ~1-2 minutes on hardware/AZ failure
Read Replica	Read-only copy of your DB to scale reads. Asynchronous replication, can be in same region or cross-region. Useful for reports/analytics without impacting production
Maintenance Window	Weekly time window where AWS can apply patches/updates. Configurable, recommended during low traffic hours
Engine Version	Specific DB engine version (e.g., PostgreSQL 15.4). AWS applies minor version upgrades automatically during the maintenance window when auto minor version upgrade is enabled; major versions require a manual upgrade

Essential AWS CLI Commands

Creating and Querying DB Instances

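# Create DB Instance (PostgreSQL)
# Note: "admin" is a reserved word in RDS for PostgreSQL, so pick another master
# username, and single-quote the password so the shell leaves "!" alone
aws rds create-db-instance \
    --db-instance-identifier myapp-db \
    --db-instance-class db.t3.micro \
    --engine postgres \
    --engine-version 15.4 \
    --master-username dbadmin \
    --master-user-password 'SecurePassword123!' \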
    --allocated-storage 20 \
    --storage-type gp3 \
    --backup-retention-period 7 \
    --preferred-backup-window "03:00-04:00" \
    --vpc-security-group-ids sg-0abc123 \
    --db-subnet-group-name my-db-subnet-group \
    --no-publicly-accessible \
    --storage-encrypted \
    --tags Key=Environment,Value=Production

# List all DB instances
aws rds describe-db-instances

# View specific instance details
aws rds describe-db-instances \
    --db-instance-identifier myapp-db \
    --query 'DBInstances[0].{
        Status:DBInstanceStatus,
        Endpoint:Endpoint.Address,
        Engine:Engine,
        Class:DBInstanceClass,
        Storage:AllocatedStorage,
        IOPS:Iops
    }'

# Verify if available
aws rds describe-db-instances \
    --db-instance-identifier myapp-db \
    --query 'DBInstances[0].DBInstanceStatus' \
    --output text

# Get connection endpoint
aws rds describe-db-instances \
    --db-instance-identifier myapp-db \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text
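
The status check and the endpoint lookup above can be combined into one step. The following is a minimal sketch, assuming the hypothetical helper name `wait_for_endpoint`; it relies on the CLI's built-in `aws rds wait db-instance-available` waiter instead of polling by hand.

```shell
# Hypothetical helper (not part of the AWS CLI): block until the instance is
# available, then print its connection endpoint.
# Usage: wait_for_endpoint myapp-db
wait_for_endpoint() {
    db_id="$1"
    # The waiter polls describe-db-instances until the status is "available"
    aws rds wait db-instance-available \
        --db-instance-identifier "$db_id" || return 1
    aws rds describe-db-instances \
        --db-instance-identifier "$db_id" \
        --query 'DBInstances[0].Endpoint.Address' \
        --output text
}
```

A new instance usually takes several minutes to become available, so a script that needs the endpoint right after `create-db-instance` would call the helper before trying to connect.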

Modifying DB Instances

# Change instance class (upgrade/downgrade)
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --db-instance-class db.t3.small \
    --apply-immediately

# Increase storage
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --allocated-storage 50 \
    --apply-immediately

# Increase IOPS (gp3)
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --iops 6000 \
    --apply-immediately

# Change storage type (gp3 to io2)
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --storage-type io2 \
    --iops 10000 \
    --apply-immediately
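# Note: Changes between SSD storage types normally apply without an outage, but the
# instance stays in storage-optimization for a while (possibly hours) with reduced
# performance. Converting to or from magnetic storage does cause downtime.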

# Change backup retention period
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --backup-retention-period 14 \
    --apply-immediately

# Rename DB instance
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --new-db-instance-identifier myapp-db-v2 \
    --apply-immediately

Deleting DB Instances

# Delete DB instance WITH final snapshot
aws rds delete-db-instance \
    --db-instance-identifier myapp-db \
    --final-db-snapshot-identifier myapp-db-final-snapshot

# Delete DB instance WITHOUT snapshot (not recommended in prod)
aws rds delete-db-instance \
    --db-instance-identifier myapp-db \
    --skip-final-snapshot

# Verify deletion (status shows "deleting"; once the instance is gone,
# the call fails with DBInstanceNotFound)
aws rds describe-db-instances \
    --db-instance-identifier myapp-db \
    --query 'DBInstances[0].DBInstanceStatus'

Manual Snapshots

# Create manual snapshot
aws rds create-db-snapshot \
    --db-instance-identifier myapp-db \
    --db-snapshot-identifier myapp-pre-migration-2025-11-30

# List snapshots
aws rds describe-db-snapshots \
    --db-instance-identifier myapp-db

# View manual snapshots only
aws rds describe-db-snapshots \
    --db-instance-identifier myapp-db \
    --snapshot-type manual

# View automated snapshots only
aws rds describe-db-snapshots \
    --db-instance-identifier myapp-db \
    --snapshot-type automated

# Delete manual snapshot
aws rds delete-db-snapshot \
    --db-snapshot-identifier myapp-pre-migration-2025-11-30

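# Copy snapshot to another region (disaster recovery)
# --region is the destination; --source-region lets the CLI build the pre-signed URL.
# Encrypted snapshots also need a KMS key that exists in the destination region
# (alias/aws/rds below is an example; use your own key if you have one).
aws rds copy-db-snapshot \
    --source-db-snapshot-identifier arn:aws:rds:sa-east-1:123456789012:snapshot:myapp-snapshot \
    --target-db-snapshot-identifier myapp-snapshot-us-east-1 \
    --source-region sa-east-1 \
    --region us-east-1 \
    --kms-key-id alias/aws/rds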

Point-in-Time and Snapshot Restore

# Point-in-Time Restore (any second between EarliestRestorableTime and LatestRestorableTime)
aws rds restore-db-instance-to-point-in-time \
    --source-db-instance-identifier myapp-db \
    --target-db-instance-identifier myapp-db-restored \
    --restore-time "2025-11-30T15:06:00Z" \
    --vpc-security-group-ids sg-0abc123

# Restore from manual snapshot
aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier myapp-db-from-snapshot \
    --db-snapshot-identifier myapp-pre-migration-2025-11-30 \
    --db-instance-class db.t3.micro

# View available restore points
aws rds describe-db-instances \
    --db-instance-identifier myapp-db \
    --query 'DBInstances[0].{
        EarliestRestorableTime:EarliestRestorableTime,
        LatestRestorableTime:LatestRestorableTime
    }'
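
Before starting a point-in-time restore, it can help to confirm that the target timestamp falls inside the restorable window returned by the query above. This is a minimal sketch, assuming GNU `date` (the same `date -d` used in the CloudWatch examples) and a hypothetical helper name `in_restore_window`.

```shell
# Hypothetical helper: succeed if TARGET lies inside [EARLIEST, LATEST].
# All three arguments are ISO 8601 timestamps; requires GNU date.
# Usage: in_restore_window EARLIEST LATEST TARGET
in_restore_window() {
    earliest=$(date -u -d "$1" +%s) || return 2   # unparsable input -> status 2
    latest=$(date -u -d "$2" +%s) || return 2
    target=$(date -u -d "$3" +%s) || return 2
    [ "$target" -ge "$earliest" ] && [ "$target" -le "$latest" ]
}
```

For example, `in_restore_window "2025-11-23T03:00:00Z" "2025-11-30T15:10:00Z" "2025-11-30T15:06:00Z" && echo "restorable"` prints `restorable`, while a target after the latest restorable time makes the helper return a non-zero status.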

CloudWatch Monitoring

# View CPU utilization (last 24 hours)
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name CPUUtilization \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S) \
    --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
    --period 3600 \
    --statistics Average,Maximum

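# View database connections (last hour)
# --start-time and --end-time are required by get-metric-statistics
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name DatabaseConnections \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 300 \
    --statistics Average

# View read IOPS (repeat with --metric-name WriteIOPS for writes)
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name ReadIOPS \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 300 \
    --statistics Average

# View disk queue depth (if high, you need more IOPS)
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name DiskQueueDepth \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 300 \
    --statistics Average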
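
Since every metric query needs the same namespace, dimension, and time range, a small wrapper keeps the commands short. This is only a sketch: `rds_metric` is a hypothetical helper name, the defaults (24 hours back, 300-second period) are assumptions, and it again assumes GNU `date`.

```shell
# Hypothetical wrapper around get-metric-statistics for a single RDS instance.
# Usage: rds_metric <db-id> <metric-name> [hours-back] [period-seconds]
rds_metric() {
    db_id="$1"; metric="$2"; hours="${3:-24}"; period="${4:-300}"
    aws cloudwatch get-metric-statistics \
        --namespace AWS/RDS \
        --metric-name "$metric" \
        --dimensions "Name=DBInstanceIdentifier,Value=$db_id" \
        --start-time "$(date -u -d "$hours hours ago" +%Y-%m-%dT%H:%M:%SZ)" \
        --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        --period "$period" \
        --statistics Average Maximum
}
```

With it, `rds_metric myapp-db DiskQueueDepth 6` would fetch the last six hours of queue depth, and `rds_metric myapp-db WriteIOPS` covers the write side of the IOPS picture.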

Logs Access

# List available log files
aws rds describe-db-log-files \
    --db-instance-identifier myapp-db

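# Download complete log file (--starting-token 0 makes the CLI fetch every portion)
aws rds download-db-log-file-portion \
    --db-instance-identifier myapp-db \
    --log-file-name error/postgresql.log.2025-11-30-15 \
    --starting-token 0 \
    --output text > postgresql.log.2025-11-30-15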

# View last lines of log (tail)
aws rds download-db-log-file-portion \
    --db-instance-identifier myapp-db \
    --log-file-name error/postgresql.log \
    --starting-token 0 \
    --output text | tail -n 100

Architecture and Flows

Typical Architecture with RDS

Point-in-Time Recovery Flow

Storage Type Decision Tree

Multi-AZ Failover Flow

Best Practices Checklist

Security

Never use publicly-accessible in production: DB instances should be in private subnets, accessible only from app servers' security group
Encryption at rest enabled: --storage-encrypted at creation (can't be enabled in place afterward; the workaround is to copy a snapshot with encryption and restore from the copy)
SSL/TLS for connections: Force SSL in DB parameters, use RDS certificates in app
Secrets Manager for credentials: Don't hardcode passwords, use automatic rotation
IAM Database Authentication: For applications that support it (eliminates need for passwords)
Restrictive security groups: Only necessary ports (5432 for PostgreSQL) from specific IPs/SGs
Audit logging enabled: Enable log_statement, log_connections for PostgreSQL
Encrypted snapshots: Manual snapshots inherit encryption from DB source

Cost Optimization

Right-sizing instance class: Monitor CPU/Memory in CloudWatch, downgrade if sustained utilization under 40%
gp3 over gp2: gp3 is cheaper and more flexible (3,000 IOPS baseline independent of storage size)
Storage autoscaling configured: Avoids running out of space, grows only when needed
Appropriate backup retention: 7 days for dev/test, 14-30 days for prod, not more than necessary
Delete old snapshots: Manual snapshots cost $0.095/GB-month indefinitely
Reserved Instances for stable production: 1-year RI = ~40% savings, 3-year = ~60% savings
Dev/test instances stopped off-hours: Use scheduled Lambda/EventBridge for stop/start
Read Replicas in same region when possible: Cross-region replicas have data transfer costs

Performance

CloudWatch alarms configured: CPU over 80%, FreeableMemory under 500MB, DiskQueueDepth over 5
Monitor IOPS utilization: If consistently over 80% baseline, increase provisioned IOPS
Connection pooling in application: PgBouncer/RDS Proxy to handle many connections efficiently
Appropriate indexes: EXPLAIN ANALYZE slow queries, add indexes where needed
Read Replicas for read-heavy workloads: Offload reports/analytics to replica
Optimized parameter group: Adjust shared_buffers, work_mem, effective_cache_size per workload
Upgrade to modern instances: M6i/R6i more performance per $ than previous generations

Reliability

Multi-AZ enabled in production: Automatic failover in ~1-2 min on hardware/AZ failure
Automated backups enabled: Minimum 7 days retention, NEVER backup-retention-period = 0
Manual snapshot before risky changes: Migrations, upgrades, major schema changes
Test restore process quarterly: Disaster recovery drill, measure recovery time
Maintenance window configured: Low traffic hours (early morning/weekend)
EventBridge (formerly CloudWatch Events) or RDS event subscriptions for alerts: Notify when failover occurs, backups fail, storage low
RDS Proxy for faster failover: Reduces connection storm post-failover

Operational Excellence

Consistent tagging: Environment, Application, Owner, CostCenter for tracking
Infrastructure as Code: Terraform/CloudFormation for DB instances, parameter groups, subnet groups
Snapshot before deleting: Final snapshot or verify backups before delete-db-instance
Naming conventions: {app}-{env}-{region} e.g., myapp-prod-sa-east-1
Enhanced Monitoring enabled: OS metrics (processes, threads, memory detail) every 60s
Performance Insights enabled: Query-level metrics, wait event analysis (first 7 days free)
Document custom configurations: Modified parameter groups, security group rules, etc.

Common Mistakes to Avoid

Backup Retention Period = 0

Why it happens: "I want to save costs, I'll disable automated backups."

The real problem: Automated backups up to your DB size are FREE. Disabling them saves $0 but leaves you unprotected against disasters.

Typical scenario:

# BAD: Disable backups
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --backup-retention-period 0

# Savings: $0 (backups are free up to DB size)
# Risk: Total data loss if DROP TABLE, DELETE without WHERE, etc.

How to avoid it:

# GOOD: Minimum 7 days in production
aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --backup-retention-period 7

# For compliance: 14-30 days, e.g. replace the last line with
#     --backup-retention-period 30

Golden rule: NEVER backup-retention-period = 0 in production. Only acceptable in temporary testing environments.
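
To check the golden rule across an account, the retention period can be filtered directly in the `--query` expression. A minimal sketch, assuming the hypothetical helper name `list_instances_without_backups`:

```shell
# Hypothetical audit helper: print the identifiers of DB instances in the
# current region whose automated backups are disabled (retention period 0).
list_instances_without_backups() {
    aws rds describe-db-instances \
        --query 'DBInstances[?BackupRetentionPeriod==`0`].DBInstanceIdentifier' \
        --output text
}
```

An empty result means every instance in the region has automated backups enabled; anything printed is a candidate for `--backup-retention-period 7` or higher.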

Choosing io2 Without Analyzing CloudWatch Metrics First

Why it happens: "More IOPS = better performance, let's go with io2."

The real problem: io2 is 10-15x more expensive than gp3. Most workloads work perfectly fine with gp3.

Before considering io2:

# 1. Check current IOPS utilization over the last 7 days
#    (repeat with WriteIOPS; total IOPS = ReadIOPS + WriteIOPS)
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name ReadIOPS \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --start-time "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 3600 \
    --statistics Average,Maximum

# 2. If Average under 2,500 IOPS -> gp3 baseline (3K) is sufficient
# 3. If Average 2,500-16,000 -> increase IOPS in gp3 first
# 4. Only if over 16,000 sustained IOPS -> consider io2
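
The thresholds above can be written as a small decision function. This is a sketch rather than an official sizing rule: `recommend_storage` is a hypothetical name, the input is the sustained average of read plus write IOPS as a whole number, and everything from 2,500 up to 16,000 is treated as "raise gp3 IOPS first".

```shell
# Hypothetical helper: map sustained average IOPS to a storage recommendation,
# using the thresholds from the steps above.
# Usage: recommend_storage <average-iops>
recommend_storage() {
    iops="$1"
    if [ "$iops" -gt 16000 ]; then
        echo "io2"                          # beyond what gp3 is assumed to deliver
    elif [ "$iops" -ge 2500 ]; then
        echo "gp3 with provisioned IOPS"    # raise gp3 IOPS before considering io2
    else
        echo "gp3 baseline"                 # the 3,000 IOPS baseline has headroom
    fi
}
```

For instance, `recommend_storage 1800` prints `gp3 baseline` and `recommend_storage 20000` prints `io2`.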

Cost comparison (100 GB, 10,000 IOPS):

gp3: $11.50 (storage) + $35 (7K IOPS extra) = $46.50/month
io2: $13.80 (storage) + $650 (10K IOPS) = $663.80/month

Difference: 14x more expensive
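
The comparison can be reproduced for other sizes with a few lines of shell. The sketch below uses the article's example rates ($0.115/GB and $0.005 per IOPS above 3,000 for gp3; $0.138/GB and $0.065 per IOPS for io2); the function names are hypothetical, and actual prices vary by region.

```shell
# Monthly storage cost using the article's example rates (not live pricing).
# Usage: gp3_monthly_cost <GB> <total-IOPS>
gp3_monthly_cost() {
    awk -v gb="$1" -v iops="$2" 'BEGIN {
        extra = iops - 3000              # only IOPS above the baseline are billed
        if (extra < 0) extra = 0
        printf "%.2f\n", gb * 0.115 + extra * 0.005
    }'
}

# Usage: io2_monthly_cost <GB> <total-IOPS>
io2_monthly_cost() {
    awk -v gb="$1" -v iops="$2" 'BEGIN {
        printf "%.2f\n", gb * 0.138 + iops * 0.065   # every provisioned IOPS is billed
    }'
}
```

Calling `gp3_monthly_cost 100 10000` prints `46.50` and `io2_monthly_cost 100 10000` prints `663.80`, matching the figures above.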

When to USE io2: Financial applications (trading), latency under 1ms critical, 99.99%+ SLA where every millisecond counts.

Not Creating Manual Snapshot Before Risky Changes

Why it happens: "Automated backups are enough."

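The real problem: Automated snapshots run once every 24 hours (typically around 3 AM). If a 10 AM migration fails, point-in-time recovery can still replay transaction logs up to just before the change, but you have to work out the exact timestamp under pressure and wait while RDS replays 7 hours of logs on top of the 3 AM snapshot. A manual snapshot gives you a named, known-good restore point taken moments before the change, and it is not tied to the retention period.

Recovery scenario without a manual snapshot:

# Thursday 10 AM: Schema migration
ALTER TABLE users ADD COLUMN preferences JSONB;
# Migration corrupts data

# Last automated snapshot: Thursday 3 AM (7 hours ago)
# Point-in-time restore: you must guess the last good timestamp;
# too late and you restore corrupted data, too early and you lose legitimate transactions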

How to avoid it:

# ALWAYS before major changes:
SNAPSHOT_ID="pre-migration-$(date +%Y%m%d-%H%M)"
aws rds create-db-snapshot \
    --db-instance-identifier myapp-db \
    --db-snapshot-identifier "$SNAPSHOT_ID"

# Wait until the snapshot is complete before touching the schema
aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$SNAPSHOT_ID"

# Do risky change
# If it fails -> restore from manual snapshot (known-good state from minutes earlier)

# After confirming success:
aws rds delete-db-snapshot \
    --db-snapshot-identifier "$SNAPSHOT_ID"

When to create manual snapshots:

Schema migrations
Major version upgrades
Bulk data modifications
Deployment of critical changes

Publicly Accessible = True in Production

Why it happens: "I need to connect from my laptop for debugging."

The real problem: You expose your DB to the internet. Bots constantly scan for open DBs, attempting brute force.

# BAD: DB accessible from internet
aws rds create-db-instance \
    --publicly-accessible
    # Security group allows 0.0.0.0/0 -> DB exposed

# Result: You appear on Shodan, constant brute force attempts

How to avoid it:

# GOOD: DB in private subnet, not publicly accessible
aws rds create-db-instance \
    --no-publicly-accessible \
    --vpc-security-group-ids sg-private-db

# Security group allows ONLY from app servers:
aws ec2 authorize-security-group-ingress \
    --group-id sg-private-db \
    --protocol tcp \
    --port 5432 \
    --source-group sg-app-servers

# For debugging from laptop:
# Option 1: Bastion host in public subnet
# Option 2: VPN/Direct Connect
# Option 3: SSM Session Manager port forwarding
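
Option 3 can be sketched as follows. It assumes an SSM-managed EC2 instance inside the VPC that is allowed to reach the DB security group, and the Session Manager plugin installed on the laptop; `rds_tunnel` and the instance ID in the usage note are hypothetical.

```shell
# Hypothetical wrapper: forward localhost:<local-port> to the database through
# an SSM-managed instance, so the DB itself stays private.
# Usage: rds_tunnel <ssm-managed-instance-id> <db-endpoint> [local-port]
rds_tunnel() {
    instance_id="$1"; db_host="$2"; local_port="${3:-5432}"
    aws ssm start-session \
        --target "$instance_id" \
        --document-name AWS-StartPortForwardingSessionToRemoteHost \
        --parameters "host=$db_host,portNumber=5432,localPortNumber=$local_port"
}
```

With the session running, for example via `rds_tunnel i-0123456789abcdef0 mydb.abc123.sa-east-1.rds.amazonaws.com`, a local client connects to `localhost:5432` and no inbound rule from the internet is needed.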

Not Monitoring FreeableMemory and SwapUsage

Why it happens: "CloudWatch shows CPU under 50%, must be fine."

The real problem: Normal CPU but saturated memory = slow queries, swapping to disk (100x slower than RAM).

Symptoms you're ignoring:

# CPU: 45% (seems OK)
# FreeableMemory: 50 MB (of 1 GB total) - PROBLEM
# SwapUsage: 500 MB - SWAP ACTIVE = SLOW
# Latency: 2000ms (should be under 50ms)

Correct diagnosis:

# View memory metrics
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name FreeableMemory \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db \
    --period 300 \
    --statistics Average

# If FreeableMemory consistently under 500 MB:
# -> You need instance class upgrade (more RAM)

aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --db-instance-class db.t3.small \
    --apply-immediately

Preventive alarm:

aws cloudwatch put-metric-alarm \
    --alarm-name myapp-db-low-memory \
    --metric-name FreeableMemory \
    --namespace AWS/RDS \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 524288000 \
    --comparison-operator LessThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db
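
FreeableMemory and FreeStorageSpace are reported in bytes, so alarm thresholds have to be converted from the MiB or GiB figure you have in mind. A tiny sketch with hypothetical helper names:

```shell
# Convert human-friendly sizes to the byte values CloudWatch thresholds expect.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }
```

`mib_to_bytes 500` prints `524288000`, the threshold used in the alarm above, and `gib_to_bytes 2` prints `2147483648`.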

Not Testing Restore Process Until Real Disaster

Why it happens: "The configuration looks good, it must work when I need it."

The real problem: Disaster arrives, you discover that:

You don't know how to use restore CLI correctly
Snapshot is corrupted
Process takes longer than expected
App doesn't connect to restored DB

Time lost in production: 2-3 hours troubleshooting in panic vs 20 min if you had practiced.

How to avoid it - Quarterly Disaster Recovery Drill:

# Every 3 months (Friday afternoon, low traffic):

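# 1. Create snapshot and wait until it is available
aws rds create-db-snapshot \
    --db-instance-identifier myapp-prod \
    --db-snapshot-identifier dr-drill-2025-q4
aws rds wait db-snapshot-available \
    --db-snapshot-identifier dr-drill-2025-q4

# 2. Measure time: restore, then block until the instance is available
#    (the restore call returns immediately, so the waiter must be timed too)
time ( aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier myapp-prod-dr-test \
    --db-snapshot-identifier dr-drill-2025-q4 \
  && aws rds wait db-instance-available \
    --db-instance-identifier myapp-prod-dr-test )

# 3. Verify connectivity (use your instance's master username)
psql -h myapp-prod-dr-test.xxx.rds.amazonaws.com \
     -U dbadmin -d myapp -c "SELECT COUNT(*) FROM users;"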

# 4. Document:
# - Total restore time: X minutes
# - Issues found
# - Update runbook

# 5. Clean up
aws rds delete-db-instance \
    --db-instance-identifier myapp-prod-dr-test \
    --skip-final-snapshot

aws rds delete-db-snapshot \
    --db-snapshot-identifier dr-drill-2025-q4

Storage Full Without Autoscaling Configured

Why it happens: "I have 20 GB, takes months to fill up, I'll check when it gets close."

The real problem: Friday 6 PM, storage reaches 100%, the instance status changes to storage-full, writes fail, and the app stops working.

# DB state when storage is full:
Status: "storage-full"
# INSERT/UPDATE/DELETE: ERROR
# The instance can become unreachable until storage is increased

How to avoid it - Storage Autoscaling:

aws rds modify-db-instance \
    --db-instance-identifier myapp-db \
    --max-allocated-storage 100 \
    --apply-immediately

# RDS automatically increases storage when:
# - Free space under 10% of allocated storage
# - Or under 6 GB free
# - Low-storage lasts at least 5 minutes
# - At least 6 hours since last storage modification

Preventive alarm:

aws cloudwatch put-metric-alarm \
    --alarm-name myapp-db-low-storage \
    --metric-name FreeStorageSpace \
    --namespace AWS/RDS \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 2147483648 \
    --comparison-operator LessThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=myapp-db

Cost Considerations

Cost Components in RDS

Component	Pricing	Free Tier
DB Instance (compute)	Per hour based on instance class	750 hrs/month db.t3.micro (12 months)
Storage - gp3	$0.115/GB-month	20 GB (12 months)
Storage - io2	$0.138/GB-month + $0.065/IOPS-month	Not included
Provisioned IOPS (gp3)	$0.005/IOPS above 3,000	Not included
Automated Backups	Free up to DB size, then $0.095/GB-month	Included
Manual Snapshots	$0.095/GB-month	Not included
Data Transfer Out	$0.09/GB (to internet)	1 GB/month (12 months)
Multi-AZ (standby replica)	2x cost of DB instance + storage	Not included
Read Replica	Separate DB instance + storage cost	Not included

Real Application Cost Example

Configuration:

Instance: db.t3.small (2 vCPU, 2 GB RAM)
Storage: 100 GB gp3, 5,000 IOPS (2,000 extra)
Backups: 7 days retention (~150 GB backups total)
Manual snapshots: 2 snapshots of 100 GB each
Multi-AZ: Yes
Region: sa-east-1

Monthly calculation:

DB Instance (primary):
  db.t3.small x 730 hrs = $30/month

DB Instance (standby - Multi-AZ):
  db.t3.small x 730 hrs = $30/month

Storage (primary):
  100 GB gp3 x $0.115 = $11.50/month

Storage (standby):
  100 GB gp3 x $0.115 = $11.50/month

Provisioned IOPS extra:
  2,000 IOPS x $0.005 = $10/month (primary only)

Automated Backups:
  First 100 GB free (= DB size)
  Excess: 50 GB x $0.095 = $4.75/month

Manual Snapshots:
  200 GB x $0.095 = $19/month

TOTAL MONTHLY: $116.75/month
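
The same total can be recomputed in shell, which makes it easy to swap in your own sizes and rates. This is a sketch using only the example figures above; `monthly_total` is a hypothetical name and the rates are the article's, not live pricing.

```shell
# Recompute the example bill from its components (article's example rates).
monthly_total() {
    awk 'BEGIN {
        instance = 30.00 * 2             # primary + Multi-AZ standby
        storage  = 100 * 0.115 * 2       # 100 GB gp3 on primary and standby
        iops     = 2000 * 0.005          # IOPS above the 3,000 baseline
        backups  = (150 - 100) * 0.095   # backup storage beyond the DB size
        snaps    = 200 * 0.095           # two 100 GB manual snapshots
        printf "%.2f\n", instance + storage + iops + backups + snaps
    }'
}
```

Running `monthly_total` prints `116.75`, the total shown above.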

Optimization Strategies

1. Reserved Instances for stable production

db.t3.small on-demand: $30/month x 12 = $360/year

Reserved Instance (1 year, No Upfront):
  ~$216/year (40% savings)

Reserved Instance (3 years, All Upfront):
  ~$144/year (60% savings)

Recommendation: 1-year RI for prod DBs you know will run over 1 year

2. Rightsizing with CloudWatch Metrics

# If I see CPU under 30% and Memory under 50% for 2+ weeks:
# Downgrade from db.t3.small -> db.t3.micro

Savings: about half the instance cost, ~$15/month x 2 (primary + standby) = ~$30/month
(assuming db.t3.micro costs roughly half of db.t3.small)

3. Optimized backup retention

Automated backups:
  7 days (dev/test)
  14 days (prod - cost/recovery balance)
  30 days (compliance only if required)

Manual snapshots:
  Only for compliance/long-term
  Delete after verifying no longer needed

4. Dev/Test instances stopped off-hours

# Lambda scheduled (EventBridge):
# Monday-Friday 8 AM: Start DB
# Monday-Friday 6 PM: Stop DB

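Savings: ~70% of instance cost
db.t3.small: $30/month -> ~$9/month (10 hrs x 5 days = 50 of 168 hrs/week)
# Storage and backups are still billed while stopped, and RDS restarts a stopped instance automatically after 7 days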
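
The instance cost under a start/stop schedule scales with the share of the week the instance actually runs. A minimal sketch of that arithmetic, with the hypothetical helper name `scheduled_monthly_cost`; storage and backup charges are not included because they continue while the instance is stopped.

```shell
# Estimate the monthly instance cost when it only runs part of the week.
# Usage: scheduled_monthly_cost <on-demand $/month> <hours/day> <days/week>
scheduled_monthly_cost() {
    awk -v cost="$1" -v h="$2" -v d="$3" 'BEGIN {
        printf "%.2f\n", cost * (h * d) / 168   # 168 hours in a week
    }'
}
```

For the 8 AM to 6 PM weekday schedule above, `scheduled_monthly_cost 30 10 5` prints `8.93`, since the instance runs 50 of 168 hours each week.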

Integration with Other Services

AWS Service	How It Integrates	Typical Use Case
EC2	App servers on EC2 connect to RDS via endpoint	Backend apps (Laravel, Django, Rails) using RDS as datastore
Lambda	Lambda functions connect to RDS (via VPC or RDS Proxy)	Serverless APIs, scheduled jobs reading/writing to DB
VPC	RDS instances live in VPC subnets, security groups control access	Network isolation, private subnets for DBs, granular traffic control
Secrets Manager	Stores DB credentials, automatic rotation	Don't hardcode passwords, automatic credential rotation every 30-90 days
CloudWatch	Automatic metrics (CPU, connections, IOPS), logs, alarms	Performance monitoring, alerts when thresholds exceeded
CloudTrail	Audit trail of RDS operations (create, modify, delete)	Compliance, forensics, "who deleted the DB on Friday?"
S3	Snapshot export, long-term backups, query results (Athena)	Cross-region disaster recovery, compliance archives, analytics on exported data
IAM	Control who can manage RDS, IAM database authentication	Principle of least privilege, developers only read access to prod DBs
KMS	Encryption keys for storage at rest, snapshot encryption	Compliance (HIPAA, PCI-DSS), encrypted databases
EventBridge	RDS events (backup complete, failover, low storage)	Automation (notify Slack on failover, trigger Lambda on backup failure)
SNS	Target of CloudWatch alarms for notifications	Email/SMS when CPU over 80%, storage under 10%, connections maxed
Route53	Health checks of RDS endpoint, failover DNS	Multi-region DR, automatic DNS failover if primary region fails
DMS	Migrate data to RDS from on-premise or other DBs	Lift-and-shift migrations, ongoing replication, zero-downtime migrations
ECS/Fargate	Containerized apps connect to RDS	Microservices architecture, each service with its own DB connection pool
ElastiCache	Cache layer in front of RDS to reduce DB load	Session storage, query result caching, hot data acceleration
RDS Proxy	Managed connection pooler between app and RDS	Lambda functions (avoid connection exhaustion), faster failover