Infrastructure as Code (IaC) isn't just for enterprise teams. As a startup, you can build production-ready AWS infrastructure in a weekend using Terraform's reusable modules. This pragmatic approach helps you scale fast while maintaining reliability and cost control.
Table of Contents
- Why Startups Need IaC From Day One
- Weekend Roadmap
- Day 1: Building the Foundation
- Day 2: Application Infrastructure
- CloudWatch Monitoring and Alerts
- Deployment Commands
- Cost Optimization Tips
- Security Best Practices
- Production Considerations
- Conclusion
Why Startups Need IaC From Day One
Many startups delay infrastructure automation, thinking it's premature optimization. This is a costly mistake. IaC provides:
- Reproducible environments - Dev, staging, and prod are built from the same modules and differ only in their input variables
- Version control - Infrastructure changes are tracked and reviewable
- Cost optimization - Every resource is defined explicitly, so oversized or forgotten ones show up in review
- Team scaling - New developers can spin up an environment in minutes
- Disaster recovery - Rebuild the entire stack from code with a single terraform apply (data still has to come from backups)
Weekend Roadmap
Day 1: Foundation & VPC
- Set up Terraform workspace
- Build networking foundation
- Configure security groups
Day 2: Application Infrastructure
- Deploy ECS Fargate cluster
- Set up RDS database
- Configure monitoring and alerts
Day 1: Building the Foundation
Project Structure
Start with a clean, modular structure:
terraform/
├── environments/
│   ├── dev/
│   ├── staging/
│   └── prod/
├── modules/
│   ├── vpc/
│   ├── ecs/
│   └── rds/
├── shared/
│   └── variables.tf
└── README.md
VPC Module (modules/vpc/main.tf)
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "Availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]
}

# VPC
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
  }
}

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name        = "${var.environment}-igw"
    Environment = var.environment
  }
}

# Public Subnets
resource "aws_subnet" "public" {
  count = length(var.availability_zones)

  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name        = "${var.environment}-public-${count.index + 1}"
    Environment = var.environment
    Type        = "Public"
  }
}

# Private Subnets
resource "aws_subnet" "private" {
  count = length(var.availability_zones)

  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 100)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name        = "${var.environment}-private-${count.index + 1}"
    Environment = var.environment
    Type        = "Private"
  }
}

# NAT Gateways
resource "aws_eip" "nat" {
  count = length(aws_subnet.public)

  domain     = "vpc"
  depends_on = [aws_internet_gateway.main]

  tags = {
    Name        = "${var.environment}-nat-eip-${count.index + 1}"
    Environment = var.environment
  }
}

resource "aws_nat_gateway" "main" {
  count = length(aws_subnet.public)

  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name        = "${var.environment}-nat-${count.index + 1}"
    Environment = var.environment
  }

  depends_on = [aws_internet_gateway.main]
}

# Route Tables
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name        = "${var.environment}-public-rt"
    Environment = var.environment
  }
}

resource "aws_route_table" "private" {
  count = length(aws_nat_gateway.main)

  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = {
    Name        = "${var.environment}-private-rt-${count.index + 1}"
    Environment = var.environment
  }
}

# Route Table Associations
resource "aws_route_table_association" "public" {
  count = length(aws_subnet.public)

  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count = length(aws_subnet.private)

  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

# Security Group for ALB
resource "aws_security_group" "alb" {
  name_prefix = "${var.environment}-alb-"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "${var.environment}-alb-sg"
    Environment = var.environment
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Security Group for ECS
resource "aws_security_group" "ecs" {
  name_prefix = "${var.environment}-ecs-"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "HTTP from ALB"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "${var.environment}-ecs-sg"
    Environment = var.environment
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Outputs
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of the public subnets"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets"
  value       = aws_subnet.private[*].id
}

output "alb_security_group_id" {
  description = "ID of the ALB security group"
  value       = aws_security_group.alb.id
}

output "ecs_security_group_id" {
  description = "ID of the ECS security group"
  value       = aws_security_group.ecs.id
}
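To make the subnet arithmetic concrete, here is a small sketch of what cidrsubnet returns for the module's default 10.0.0.0/16 range and two availability zones. The local and output names are illustrative only and are not part of the module.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # Adding 8 bits to a /16 prefix yields /24 networks.
  # The third argument selects which /24 inside the VPC range.
  public_cidrs  = [for i in range(2) : cidrsubnet(local.vpc_cidr, 8, i)]
  private_cidrs = [for i in range(2) : cidrsubnet(local.vpc_cidr, 8, i + 100)]
}

output "example_subnet_cidrs" {
  value = {
    public  = local.public_cidrs  # ["10.0.0.0/24", "10.0.1.0/24"]
    private = local.private_cidrs # ["10.0.100.0/24", "10.0.101.0/24"]
  }
}
```

The offset of 100 keeps the private ranges well clear of the public ones, so more public subnets can be added later without renumbering.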
Development Environment (environments/dev/main.tf)
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

# Variables
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "dev"
}

# VPC Module
module "vpc" {
  source = "../../modules/vpc"

  environment        = var.environment
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b"]
}

# Outputs
output "vpc_id" {
  value = module.vpc.vpc_id
}
Day 2: Application Infrastructure
ECS Fargate Module (modules/ecs/main.tf)
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "vpc_id" {
  description = "VPC ID"
  type        = string
}

variable "private_subnet_ids" {
  description = "Private subnet IDs"
  type        = list(string)
}

variable "public_subnet_ids" {
  description = "Public subnet IDs"
  type        = list(string)
}

variable "ecs_security_group_id" {
  description = "ECS security group ID"
  type        = string
}

variable "alb_security_group_id" {
  description = "ALB security group ID"
  type        = string
}

variable "app_name" {
  description = "Application name"
  type        = string
  default     = "webapp"
}

# Must match both the port the container listens on (nginx uses 80) and the
# port the ECS security group in the VPC module allows from the ALB (80).
# If your app listens elsewhere (for example 3000), change both places.
variable "app_port" {
  description = "Application port"
  type        = number
  default     = 80
}

variable "desired_count" {
  description = "Desired number of tasks"
  type        = number
  default     = 2
}

variable "cpu" {
  description = "CPU units"
  type        = number
  default     = 256
}

variable "memory" {
  description = "Memory in MB"
  type        = number
  default     = 512
}

# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "${var.environment}-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }

  tags = {
    Name        = "${var.environment}-cluster"
    Environment = var.environment
  }
}

# ECS Task Definition
resource "aws_ecs_task_definition" "app" {
  family                   = "${var.environment}-${var.app_name}"
  execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
  task_role_arn            = aws_iam_role.ecs_task_role.arn
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.cpu
  memory                   = var.memory

  container_definitions = jsonencode([
    {
      name  = var.app_name
      image = "nginx:latest" # Replace with your app image

      portMappings = [
        {
          containerPort = var.app_port
          protocol      = "tcp"
        }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = aws_cloudwatch_log_group.app.name
          awslogs-region        = data.aws_region.current.name
          awslogs-stream-prefix = "ecs"
        }
      }

      environment = [
        {
          name  = "ENVIRONMENT"
          value = var.environment
        }
      ]
    }
  ])

  tags = {
    Name        = "${var.environment}-${var.app_name}"
    Environment = var.environment
  }
}

# Application Load Balancer
resource "aws_lb" "main" {
  name               = "${var.environment}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [var.alb_security_group_id]
  subnets            = var.public_subnet_ids

  enable_deletion_protection = false

  tags = {
    Name        = "${var.environment}-alb"
    Environment = var.environment
  }
}

resource "aws_lb_target_group" "app" {
  name        = "${var.environment}-${var.app_name}-tg"
  port        = var.app_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip" # Required for Fargate tasks using awsvpc networking

  health_check {
    enabled             = true
    healthy_threshold   = 3
    interval            = 30
    matcher             = "200"
    path                = "/"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }

  tags = {
    Name        = "${var.environment}-${var.app_name}-tg"
    Environment = var.environment
  }
}

resource "aws_lb_listener" "front_end" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

# ECS Service
resource "aws_ecs_service" "main" {
  name            = "${var.environment}-${var.app_name}"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    security_groups  = [var.ecs_security_group_id]
    subnets          = var.private_subnet_ids
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = var.app_name
    container_port   = var.app_port
  }

  # The listener must exist before the service can register targets
  depends_on = [aws_lb_listener.front_end]

  tags = {
    Name        = "${var.environment}-${var.app_name}"
    Environment = var.environment
  }
}

# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/${var.environment}/${var.app_name}"
  retention_in_days = 30

  tags = {
    Name        = "${var.environment}-${var.app_name}-logs"
    Environment = var.environment
  }
}

# IAM Roles
# Execution role: used by ECS itself to pull images and write logs
resource "aws_iam_role" "ecs_task_execution_role" {
  name = "${var.environment}-ecsTaskExecutionRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })

  tags = {
    Name        = "${var.environment}-ecsTaskExecutionRole"
    Environment = var.environment
  }
}

resource "aws_iam_role_policy_attachment" "ecs_task_execution_role" {
  role       = aws_iam_role.ecs_task_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

# Task role: assumed by your application code inside the container
resource "aws_iam_role" "ecs_task_role" {
  name = "${var.environment}-ecsTaskRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })

  tags = {
    Name        = "${var.environment}-ecsTaskRole"
    Environment = var.environment
  }
}

# Data Sources
data "aws_region" "current" {}

# Outputs
output "cluster_id" {
  description = "ECS cluster ID"
  value       = aws_ecs_cluster.main.id
}

output "alb_dns_name" {
  description = "ALB DNS name"
  value       = aws_lb.main.dns_name
}

output "alb_zone_id" {
  description = "ALB zone ID"
  value       = aws_lb.main.zone_id
}
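The ALB security group already opens port 443, but the module only defines an HTTP listener. A minimal sketch of an HTTPS listener could look like the following. The certificate_arn variable is an assumption introduced here for illustration; it presumes you have already issued an ACM certificate in the same region as the load balancer.

```hcl
# Hypothetical input: ARN of an existing ACM certificate (not defined elsewhere in this article)
variable "certificate_arn" {
  description = "ACM certificate ARN for the HTTPS listener"
  type        = string
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```

TLS terminates at the load balancer in this setup, so traffic between the ALB and the tasks stays on plain HTTP inside the VPC.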
RDS Module (modules/rds/main.tf)
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "vpc_id" {
  description = "VPC ID"
  type        = string
}

variable "private_subnet_ids" {
  description = "Private subnet IDs"
  type        = list(string)
}

variable "allowed_security_groups" {
  description = "Security groups allowed to access RDS"
  type        = list(string)
  default     = []
}

variable "db_name" {
  description = "Database name"
  type        = string
  default     = "appdb"
}

variable "db_username" {
  description = "Database username"
  type        = string
  default     = "dbadmin"
}

variable "db_password" {
  description = "Database password"
  type        = string
  sensitive   = true
}

variable "instance_class" {
  description = "RDS instance class"
  type        = string
  default     = "db.t3.micro"
}

variable "allocated_storage" {
  description = "Allocated storage in GB"
  type        = number
  default     = 20
}

variable "backup_retention_period" {
  description = "Backup retention period in days"
  type        = number
  default     = 7
}

# Security Group for RDS
resource "aws_security_group" "rds" {
  name_prefix = "${var.environment}-rds-"
  vpc_id      = var.vpc_id

  ingress {
    description     = "MySQL/Aurora"
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    security_groups = var.allowed_security_groups
  }

  tags = {
    Name        = "${var.environment}-rds-sg"
    Environment = var.environment
  }

  lifecycle {
    create_before_destroy = true
  }
}

# DB Subnet Group
resource "aws_db_subnet_group" "default" {
  name       = "${var.environment}-db-subnet-group"
  subnet_ids = var.private_subnet_ids

  tags = {
    Name        = "${var.environment}-db-subnet-group"
    Environment = var.environment
  }
}

# RDS Instance
resource "aws_db_instance" "default" {
  identifier = "${var.environment}-database"

  engine         = "mysql"
  engine_version = "8.0"
  instance_class = var.instance_class

  allocated_storage     = var.allocated_storage
  max_allocated_storage = var.allocated_storage * 2

  db_name  = var.db_name
  username = var.db_username
  password = var.db_password

  vpc_security_group_ids = [aws_security_group.rds.id]
  db_subnet_group_name   = aws_db_subnet_group.default.name

  backup_retention_period = var.backup_retention_period
  backup_window           = "03:00-04:00"
  maintenance_window      = "sun:04:00-sun:05:00"

  skip_final_snapshot = true
  deletion_protection = false

  performance_insights_enabled = false
  monitoring_interval          = 0

  tags = {
    Name        = "${var.environment}-database"
    Environment = var.environment
  }
}

# Outputs
output "rds_hostname" {
  description = "RDS instance hostname"
  value       = aws_db_instance.default.address
  sensitive   = true
}

output "rds_port" {
  description = "RDS instance port"
  value       = aws_db_instance.default.port
}

output "rds_username" {
  description = "RDS instance root username"
  value       = aws_db_instance.default.username
  sensitive   = true
}
Complete Environment Configuration
Update environments/dev/main.tf:
# Add to existing dev environment
module "ecs" {
  source = "../../modules/ecs"

  environment           = var.environment
  vpc_id                = module.vpc.vpc_id
  private_subnet_ids    = module.vpc.private_subnet_ids
  public_subnet_ids     = module.vpc.public_subnet_ids
  ecs_security_group_id = module.vpc.ecs_security_group_id
  alb_security_group_id = module.vpc.alb_security_group_id

  app_name      = "myapp"
  desired_count = 1 # Lower for dev
  cpu           = 256
  memory        = 512
}

module "rds" {
  source = "../../modules/rds"

  environment             = var.environment
  vpc_id                  = module.vpc.vpc_id
  private_subnet_ids      = module.vpc.private_subnet_ids
  allowed_security_groups = [module.vpc.ecs_security_group_id]

  db_password             = var.db_password
  instance_class          = "db.t3.micro"
  backup_retention_period = 1 # Minimal for dev
}

variable "db_password" {
  description = "Database password"
  type        = string
  sensitive   = true
}

output "alb_dns_name" {
  value = module.ecs.alb_dns_name
}
CloudWatch Monitoring and Alerts
Add monitoring to the ECS module, so that the alarm can reference the cluster and service directly:
# SNS Topic for Alerts
resource "aws_sns_topic" "alerts" {
  name = "${var.environment}-alerts"

  tags = {
    Name        = "${var.environment}-alerts"
    Environment = var.environment
  }
}

# CloudWatch Alarms for ECS
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "${var.environment}-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = 120
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "Average ECS service CPU above 80% for two consecutive 2-minute periods"

  # Without alarm_actions the alarm changes state but notifies nobody
  alarm_actions = [aws_sns_topic.alerts.arn]

  dimensions = {
    ServiceName = aws_ecs_service.main.name
    ClusterName = aws_ecs_cluster.main.name
  }

  tags = {
    Name        = "${var.environment}-high-cpu-alarm"
    Environment = var.environment
  }
}
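An SNS topic delivers nothing until something subscribes to it. As a rough sketch, an email subscription could be added like this; the alert_email variable is a hypothetical input introduced for the example.

```hcl
# Hypothetical input: address that should receive alert emails
variable "alert_email" {
  description = "Email address for infrastructure alerts"
  type        = string
}

resource "aws_sns_topic_subscription" "alerts_email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = var.alert_email
}
```

Email subscriptions stay pending until the recipient clicks the confirmation link that AWS sends, so expect one manual step after the first apply.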
Deployment Commands
Deploy your infrastructure:
# Development
cd environments/dev
terraform init
terraform plan -var="db_password=your-secure-password"
terraform apply -var="db_password=your-secure-password"

# Production (copy dev to prod with appropriate sizing)
cd ../prod
terraform init
terraform plan -var="db_password=your-secure-password"
terraform apply -var="db_password=your-secure-password"
Cost Optimization Tips
Right-Sizing Resources
- Dev: the smallest Fargate task size (256 CPU units, 512 MB) and a db.t3.micro RDS instance
- Prod: Start small and scale based on metrics
- Use Fargate Spot for non-critical workloads
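One possible way to bring Fargate Spot into the cluster from the ECS module is a capacity provider strategy. The base and weight values below are illustrative assumptions, not recommendations from this article.

```hcl
resource "aws_ecs_cluster_capacity_providers" "main" {
  cluster_name       = aws_ecs_cluster.main.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  # Always keep at least one task on regular Fargate
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    base              = 1
    weight            = 1
  }

  # Beyond the base, place roughly three of every four tasks on Spot
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 3
  }
}
```

A service that sets launch_type = "FARGATE" does not use the cluster's default strategy, so the service would need launch_type removed (or its own capacity_provider_strategy block) for this to take effect. Spot tasks can be interrupted with short notice, which is why this suits non-critical workloads.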
Auto-Scaling
# Auto-scaling for ECS
resource "aws_appautoscaling_target" "ecs_target" {
  max_capacity       = 10
  min_capacity       = 2
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "scale_up" {
  name               = "${var.environment}-scale-up"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 70.0
  }
}
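CPU is not always the bottleneck. If your service is memory-bound, a second target tracking policy on memory utilization can sit alongside the CPU policy. This is a sketch: the 75% target and the cooldown values are assumptions to tune against your own metrics.

```hcl
resource "aws_appautoscaling_policy" "memory" {
  name               = "${var.environment}-memory-target"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageMemoryUtilization"
    }

    target_value       = 75.0
    scale_in_cooldown  = 300 # seconds to wait before removing tasks
    scale_out_cooldown = 60  # seconds to wait before adding more tasks
  }
}
```

A short scale-out cooldown with a longer scale-in cooldown lets the service react quickly to load while avoiding rapid removal of tasks right after a spike.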
Security Best Practices
Secrets Management
# Use AWS Secrets Manager for sensitive data
resource "aws_secretsmanager_secret" "db_password" {
  name = "${var.environment}/database/password"
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = var.db_password
}
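Storing the secret is only half of the job, since the container still has to receive it. One way to do that, sketched below, is to let the ECS task execution role read the secret and then reference it from the container definition. This assumes the secret lives in the same module as the ECS roles; the DB_PASSWORD variable name is a hypothetical choice.

```hcl
# Allow ECS to fetch the secret when it starts a task
resource "aws_iam_role_policy" "read_db_secret" {
  name = "${var.environment}-read-db-secret"
  role = aws_iam_role.ecs_task_execution_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["secretsmanager:GetSecretValue"]
        Resource = aws_secretsmanager_secret.db_password.arn
      }
    ]
  })
}

# Then, inside the container definition passed to jsonencode in the
# task definition, add a "secrets" entry next to "environment":
#
#   secrets = [
#     {
#       name      = "DB_PASSWORD" # hypothetical env var name seen by the app
#       valueFrom = aws_secretsmanager_secret.db_password.arn
#     }
#   ]
```

With this in place the password is injected at task start rather than appearing in plain text in the task definition. Note that the value still passes through Terraform state via var.db_password, so the state itself has to be protected as well.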
Network Security
- All databases in private subnets
- Security groups with minimal access
- VPC Flow Logs for network monitoring
- WAF for public-facing applications
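As one example from the list above, VPC Flow Logs can be enabled from the VPC module with two resources. This sketch sends logs to S3; the bucket name prefix is an assumption, and a CloudWatch Logs destination is an equally valid choice.

```hcl
# Bucket that receives the flow log files (name prefix is illustrative)
resource "aws_s3_bucket" "flow_logs" {
  bucket_prefix = "${var.environment}-vpc-flow-logs-"
}

# Capture accepted and rejected traffic for the whole VPC
resource "aws_flow_log" "vpc" {
  vpc_id               = aws_vpc.main.id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}
```

Flow logs are billed by volume, so on a tight budget it may make sense to start with traffic_type = "REJECT" and widen it when investigating an issue.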
Production Considerations
State Management
Use remote state with S3 and DynamoDB locking:
terraform {
  backend "s3" {
    bucket         = "your-terraform-state-bucket"
    key            = "environments/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
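The bucket and lock table referenced by the backend have to exist before terraform init can use them. A minimal bootstrap sketch, applied once from a separate configuration with local state, might look like this. The resource names are illustrative; the bucket and table names simply mirror the placeholders in the backend block.

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "your-terraform-state-bucket"
}

# Versioning lets you recover an earlier state file after a bad apply
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string hash key named exactly "LockID"
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

On-demand billing keeps the lock table cost negligible, since it only sees a handful of reads and writes per Terraform run.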
Multi-Environment Strategy
- Separate AWS accounts for prod
- Environment-specific variable files
- Automated testing of Terraform changes
- GitOps workflow with pull request reviews
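For environment-specific variable files, a small sketch is shown below. The file name prod.tfvars is a hypothetical choice, and the variables are the ones already declared in the dev environment configuration.

```hcl
# environments/prod/prod.tfvars (hypothetical file name)
aws_region  = "us-east-1"
environment = "prod"

# db_password is deliberately left out: keep secrets out of files that are
# committed to version control and supply them at plan/apply time instead.
```

The file is then passed explicitly, for example terraform plan -var-file=prod.tfvars, which keeps the per-environment differences visible in a pull request.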
Conclusion
You now have an AWS infrastructure foundation that can scale with your startup's growth. The modular Terraform approach provides enterprise-grade structure while remaining startup-friendly in complexity and cost.
This infrastructure foundation delivers:
- High availability for the application tier across two AZs (set multi_az = true on the RDS instance if you also need automatic database failover)
- Scalable containerized applications using ECS Fargate
- Managed database services with automated backups and maintenance
- Baseline monitoring with CloudWatch logs, Container Insights, and a CPU alarm that you can extend
- Cost optimization through right-sizing and intelligent auto-scaling
The weekend time investment in infrastructure automation creates a solid foundation that supports rapid scaling as your startup grows, while maintaining the operational reliability your customers expect.
Need help building production-ready AWS infrastructure for your startup? I specialize in Terraform consulting and can set up scalable, cost-optimized cloud infrastructure that grows with your business. Check out my DevOps services and portfolio or contact me directly to discuss your infrastructure needs.
This is part 2 of my "DevOps for Startups" series. Part 1 covered automated React deployment pipelines with GitHub Actions and AWS.