In the previous part we created our EKS cluster. In this part we will configure the Amazon RDS Instance.
The following resources will be created:
- A Private Multi-AZ RDS PostgreSQL Instance.
- A DB subnet group and a VPC security group.
- (Optional) Network load balancer to expose private RDS to a specific range of IP addresses.
- (Optional) A Lambda to populate NLB Target Group with RDS private IP.
Amazon RDS
- The Amazon RDS Instance used is a PostgreSQL database server.
- The Multi-AZ option is enabled to ensure high availability.
- The instance is not publicly accessible and is hosted in private subnets.
- IAM database authentication is enabled; the root user created below still has a generated password.
- Automated backup is enabled.
Create a terraform file infra/plan/rds.tf
```hcl
resource "random_string" "db_suffix" {
  length  = 4
  special = false
  upper   = false
}

resource "random_string" "root_username" {
  length  = 12
  special = false
  upper   = true
}

resource "random_password" "root_password" {
  length  = 12
  special = true
  upper   = true
}

resource "aws_db_instance" "postgresql" {
  # Engine options
  engine         = "postgres"
  engine_version = "12.5"

  # Settings
  name       = "postgresql${var.env}"
  identifier = "postgresql-${var.env}"

  # Credentials Settings
  username = "u${random_string.root_username.result}"
  password = "p${random_password.root_password.result}"

  # DB instance size
  instance_class = "db.m5.large"

  # Storage
  storage_type          = "gp2"
  allocated_storage     = 100
  max_allocated_storage = 200

  # Availability & durability
  multi_az = true

  # Connectivity
  db_subnet_group_name   = aws_db_subnet_group.sg.id
  publicly_accessible    = false
  vpc_security_group_ids = [aws_security_group.sg.id]
  port                   = var.rds_port

  # Database authentication
  iam_database_authentication_enabled = true

  # Additional configuration
  parameter_group_name = "default.postgres12"

  # Backup
  backup_retention_period   = 14
  backup_window             = "03:00-04:00"
  final_snapshot_identifier = "postgresql-final-snapshot-${random_string.db_suffix.result}"
  delete_automated_backups  = true
  skip_final_snapshot       = false

  # Encryption
  storage_encrypted = true

  # Maintenance
  auto_minor_version_upgrade = true
  maintenance_window         = "Sat:00:00-Sat:02:00"

  # Deletion protection
  deletion_protection = false

  tags = {
    Environment = var.env
  }
}
```
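Add the following outputs. The password output is marked as sensitive because recent Terraform versions refuse to expose a value derived from random_password otherwise:

```hcl
output "rds-username" {
  value = "u${random_string.root_username.result}"
}

output "rds-password" {
  value     = "p${random_password.root_password.result}"
  sensitive = true
}

output "private-rds-endpoint" {
  value = aws_db_instance.postgresql.address
}
```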
DB Subnet Group
We deploy the Amazon RDS Instance on private subnets.
```hcl
resource "aws_db_subnet_group" "sg" {
  name       = "postgresql-${var.env}"
  subnet_ids = [aws_subnet.private["private-rds-1"].id, aws_subnet.private["private-rds-2"].id]

  tags = {
    Environment = var.env
    Name        = "postgresql-${var.env}"
  }
}
```
VPC Security Group
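In the VPC security group we allow:
- inbound and outbound traffic on the RDS port (5432 by default) with the RDS public subnets, where the optional network load balancer is deployed.
- inbound traffic on the RDS port from, and outbound TCP traffic on any port to, the RDS private subnets.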
```hcl
resource "aws_security_group" "sg" {
  name        = "postgresql-${var.env}"
  description = "Allow inbound/outbound traffic"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.private["private-rds-1"].cidr_block]
  }

  ingress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.private["private-rds-2"].cidr_block]
  }

  ingress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.public["public-rds-1"].cidr_block]
  }

  ingress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.public["public-rds-2"].cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.private["private-rds-1"].cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.private["private-rds-2"].cidr_block]
  }

  egress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.public["public-rds-1"].cidr_block]
  }

  egress {
    from_port   = var.rds_port
    to_port     = var.rds_port
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.public["public-rds-2"].cidr_block]
  }

  tags = {
    Name        = "postgresql-${var.env}"
    Environment = var.env
  }
}
```
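To make the effect of the cidr_blocks rules concrete, here is a minimal Python sketch of the check a CIDR-based ingress rule performs: a source address is accepted only if it falls inside one of the listed blocks. The CIDR values and the function name `is_ingress_allowed` are assumptions made for illustration; the real ranges come from the subnets defined in your VPC.

```python
import ipaddress

# Hypothetical CIDR blocks standing in for the four RDS subnets.
# The real values come from aws_subnet.private[...] and aws_subnet.public[...].
ALLOWED_INGRESS_CIDRS = [
    "10.0.16.0/24",  # private-rds-1 (assumed)
    "10.0.17.0/24",  # private-rds-2 (assumed)
    "10.0.0.0/24",   # public-rds-1 (assumed)
    "10.0.1.0/24",   # public-rds-2 (assumed)
]


def is_ingress_allowed(source_ip, cidrs=ALLOWED_INGRESS_CIDRS):
    """Return True if source_ip falls inside at least one allowed CIDR block."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


if __name__ == "__main__":
    print(is_ingress_allowed("10.0.0.25"))    # address inside the assumed public-rds-1 range
    print(is_ingress_allowed("203.0.113.7"))  # address outside every listed range
```

This only models the address match; the real security group also matches on protocol and port.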
(Optional) Exposing the RDS instance
If you want to access the RDS instance databases from your local machine or through an external CI / CD tool, you can create an external network load balancer and target the private IP address of the RDS instance. As the private IP address in the network interface can change if an instance fails, a Lambda function can be deployed to continuously check the current private IP address, unregister the old IP address, and register a new target with the new private IP address.
Network Load Balancer
In order to reach the RDS private IP address, the RDS instance and the external network load balancer must be in the same Availability Zone. Thus, the NLB is deployed in the public subnet that shares an Availability Zone with the primary RDS instance.
We create a target group with a target type of IP address. A CloudWatch alarm monitors connectivity between the NLB and RDS.
Create the file infra/plan/nlb.tf
```hcl
locals {
  subnet_id = aws_subnet.public["public-rds-1"].availability_zone == aws_db_instance.postgresql.availability_zone ? aws_subnet.public["public-rds-1"].id : aws_subnet.public["public-rds-2"].id
}

resource "aws_lb" "rds" {
  name                       = "nlb-expose-rds-${var.env}"
  internal                   = false
  load_balancer_type         = "network"
  subnets                    = [local.subnet_id]
  enable_deletion_protection = false

  tags = {
    Environment = var.env
  }
}

resource "aws_lb_listener" "rds" {
  load_balancer_arn = aws_lb.rds.id
  port              = var.rds_port
  protocol          = "TCP"

  default_action {
    target_group_arn = aws_lb_target_group.rds.id
    type             = "forward"
  }
}

resource "aws_lb_target_group" "rds" {
  name        = "expose-rds-${var.env}"
  port        = var.rds_port
  protocol    = "TCP"
  target_type = "ip"
  vpc_id      = aws_vpc.main.id

  health_check {
    enabled  = true
    protocol = "TCP"
  }

  tags = {
    Environment = var.env
  }
}

resource "aws_cloudwatch_metric_alarm" "rds-access" {
  alarm_name          = "rds-external-access-status"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = "1"
  metric_name         = "UnHealthyHostCount"
  namespace           = "AWS/NetworkELB"
  period              = "60"
  statistic           = "Maximum"
  threshold           = 1
  alarm_description   = "Monitoring RDS External Access"
  treat_missing_data  = "breaching"

  dimensions = {
    TargetGroup  = aws_lb_target_group.rds.arn_suffix
    LoadBalancer = aws_lb.rds.arn_suffix
  }
}
```
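The conditional in the locals block can be hard to read on one line. As a rough sketch, the same decision is written below as a Python function: use public-rds-1 when its Availability Zone matches the database's, otherwise fall back to public-rds-2. The function name `pick_nlb_subnet`, the dictionary layout, and the subnet IDs are hypothetical and only serve the illustration.

```python
def pick_nlb_subnet(public_subnets, db_az):
    """Mirror the Terraform conditional: prefer public-rds-1 when its AZ
    matches the database AZ, otherwise use public-rds-2."""
    first = public_subnets["public-rds-1"]
    second = public_subnets["public-rds-2"]
    if first["availability_zone"] == db_az:
        return first["id"]
    return second["id"]


# Hypothetical subnet data, for illustration only.
EXAMPLE_SUBNETS = {
    "public-rds-1": {"id": "subnet-aaa", "availability_zone": "eu-west-1a"},
    "public-rds-2": {"id": "subnet-bbb", "availability_zone": "eu-west-1b"},
}

if __name__ == "__main__":
    print(pick_nlb_subnet(EXAMPLE_SUBNETS, "eu-west-1b"))
```

Note that Terraform evaluates this expression at plan/apply time, so the chosen subnet reflects the Availability Zone of the instance at that moment.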
Complete the file infra/plan/output
```hcl
output "public-rds-endpoint" {
  value = "${element(split("/", aws_lb.rds.arn), 2)}-${element(split("/", aws_lb.rds.arn), 3)}.elb.${var.region}.amazonaws.com"
}
```
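The output above rebuilds the NLB DNS name from the pieces of its ARN. The following Python sketch reproduces that string manipulation so the result is easier to see; the ARN, account ID, and function name `nlb_dns_name` are made up for the example.

```python
def nlb_dns_name(lb_arn, region):
    """Rebuild '<name>-<id>.elb.<region>.amazonaws.com' from a load balancer ARN,
    the same way the Terraform expression does with split() and element()."""
    parts = lb_arn.split("/")
    # parts looks like: ['arn:...:loadbalancer', 'net', '<name>', '<id>']
    return f"{parts[2]}-{parts[3]}.elb.{region}.amazonaws.com"


# Hypothetical ARN, for illustration only.
EXAMPLE_ARN = (
    "arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
    "loadbalancer/net/nlb-expose-rds-dev/0123456789abcdef"
)

if __name__ == "__main__":
    print(nlb_dns_name(EXAMPLE_ARN, "eu-west-1"))
```

The aws_lb resource also exports a dns_name attribute, which may be a simpler way to obtain the same value.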
Now we need to register a target. A Lambda function can be used to perform the task. An Amazon CloudWatch event rule is added to invoke the Lambda function every 15 minutes.
Lambda function
Create the file infra/plan/lambda.tf
```hcl
data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/lambda/populate-nlb-tg-with-rds-private-ip.py"
  output_path = "lambda_function_payload.zip"
}

resource "aws_lambda_function" "rds" {
  filename         = "lambda_function_payload.zip"
  function_name    = "populate-nlb-tg-with-rds-private-ip"
  role             = aws_iam_role.iam_for_lambda.arn
  handler          = "populate-nlb-tg-with-rds-private-ip.handler"
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  runtime          = "python3.8"
  timeout          = 300

  environment {
    variables = {
      RDS_PORT   = var.rds_port
      NLB_TG_ARN = aws_lb_target_group.rds.arn
      RDS_SG_ID  = aws_security_group.sg.id
      RDS_ID     = aws_db_instance.postgresql.id
    }
  }

  tags = {
    Environment = var.env
  }
}

resource "aws_iam_role" "iam_for_lambda" {
  name = "iam_for_lambda"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "lambda_nlb" {
  name = "nlb-tg-access"
  role = aws_iam_role.iam_for_lambda.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "ec2:DescribeNetworkInterfaces",
          "elasticloadbalancing:DeregisterTargets",
          "elasticloadbalancing:DescribeTargetHealth",
          "elasticloadbalancing:RegisterTargets",
          "rds:DescribeDBInstances"
        ]
        Effect   = "Allow"
        Resource = "*"
      },
    ]
  })
}

resource "aws_iam_role_policy" "lambda_logging" {
  name = "lambda_logging"
  role = aws_iam_role.iam_for_lambda.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Effect   = "Allow"
        Resource = "arn:aws:logs:*:*:*"
      },
    ]
  })
}

resource "aws_cloudwatch_log_group" "lambda" {
  name              = "/aws/lambda/${aws_lambda_function.rds.function_name}"
  retention_in_days = 1
}

resource "aws_cloudwatch_event_rule" "lambda" {
  name                = "populate-nlb-tg-with-rds-private-ip"
  description         = "Populate NLB tg with RDS private IP"
  schedule_expression = "rate(15 minutes)"
}

resource "aws_cloudwatch_event_target" "lambda" {
  rule      = aws_cloudwatch_event_rule.lambda.name
  target_id = "Lambda"
  arn       = aws_lambda_function.rds.arn
}

resource "aws_lambda_permission" "cloudwatch" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.rds.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.lambda.arn
}
```
The Lambda function is written in Python. The process is as follows:
- Get the currently registered IPs using the `describe_target_health` function.
- Get the current Availability Zone of the RDS instance using the `describe_db_instances` function.
- Look up the current RDS private IP using the `describe_network_interfaces` function.
- If no target is registered yet, register one. If the registered target is stale, deregister it and register a new target with the new RDS private IP address.
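The last step of the process is a small reconciliation between two lists of IP addresses. Here is a minimal sketch of that decision in isolation, with no AWS calls, assuming both lists are plain strings; the function name `plan_target_changes` and the sample addresses are hypothetical.

```python
def plan_target_changes(registered_ips, active_ips):
    """Compare the IPs registered in the target group with the IPs currently
    used by the RDS instance and return (to_register, to_deregister)."""
    registered = set(registered_ips)
    active = set(active_ips)
    to_register = sorted(active - registered)    # active but not registered yet
    to_deregister = sorted(registered - active)  # registered but no longer active
    return to_register, to_deregister


if __name__ == "__main__":
    # The instance failed over: the old IP is replaced by a new one.
    print(plan_target_changes(["10.0.16.12"], ["10.0.17.30"]))
```

Using set differences keeps the comparison independent of ordering and avoids deregistering an address that is still in use.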
Create the file infra/plan/lambda/populate-nlb-tg-with-rds-private-ip.py
```python
import logging
import os

import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

'''
This function populates a Network Load Balancer's target group with RDS IP addresses

Configure these environment variables in your Lambda environment
1. NLB_TG_ARN - The ARN of the Network Load Balancer's target group
2. RDS_PORT
3. RDS_SG_ID - RDS VPC Security Group Id
4. RDS_ID - RDS Identifier
'''

NLB_TG_ARN = os.environ['NLB_TG_ARN']
RDS_PORT = int(os.environ['RDS_PORT'])
RDS_SG_ID = os.environ['RDS_SG_ID']
RDS_ID = os.environ['RDS_ID']

# Clients are created once per execution environment and reused across invocations
elbv2client = boto3.client('elbv2')
rdsclient = boto3.client('rds')
ec2client = boto3.client('ec2')


def register_target(tg_arn, new_target_list):
    logger.info(f"INFO: Register new_target_list:{new_target_list}")
    try:
        elbv2client.register_targets(
            TargetGroupArn=tg_arn,
            Targets=new_target_list
        )
    except ClientError as e:
        logger.error(e.response['Error']['Message'])


def deregister_target(tg_arn, new_target_list):
    try:
        logger.info(f"INFO: Deregistering targets: {new_target_list}")
        elbv2client.deregister_targets(
            TargetGroupArn=tg_arn,
            Targets=new_target_list
        )
    except ClientError as e:
        logger.error(e.response['Error']['Message'])


def target_group_list(ip_list):
    # Build the target descriptions expected by the ELBv2 API
    return [{'Id': ip, 'Port': RDS_PORT} for ip in ip_list]


def get_registered_ips(tg_arn):
    registered_ip_list = []
    try:
        response = elbv2client.describe_target_health(TargetGroupArn=tg_arn)
        registered_ip_count = len(response['TargetHealthDescriptions'])
        logger.info(f"INFO: Number of currently registered IP: {registered_ip_count}")
        for target in response['TargetHealthDescriptions']:
            registered_ip_list.append(target['Target']['Id'])
    except ClientError as e:
        logger.error(e.response['Error']['Message'])
    return registered_ip_list


def get_rds_private_ips(rds_az):
    # Network interfaces attached to the RDS security group in the primary AZ
    resp = ec2client.describe_network_interfaces(Filters=[
        {'Name': 'group-id', 'Values': [RDS_SG_ID]},
        {'Name': 'availability-zone', 'Values': [rds_az]},
    ])
    return [interface['PrivateIpAddress'] for interface in resp['NetworkInterfaces']]


def get_rds_az():
    logger.info(f"INFO: Get RDS current AZ: {RDS_ID}")
    az = None
    try:
        response = rdsclient.describe_db_instances(DBInstanceIdentifier=RDS_ID)
        if len(response['DBInstances']) > 0:
            az = response['DBInstances'][0]['AvailabilityZone']
            logger.info(f"INFO: RDS AZ is: {az}")
    except ClientError as e:
        logger.error(e.response['Error']['Message'])
    return az


def handler(event, context):
    registered_ips = set(get_registered_ips(NLB_TG_ARN))
    current_rds_az = get_rds_az()
    if current_rds_az is None:
        logger.error("Could not determine the RDS AZ; leaving targets unchanged")
        return
    active_ips = set(get_rds_private_ips(current_rds_az))

    # IPs that have not been registered yet
    registration_ip_list = sorted(active_ips - registered_ips)
    if registration_ip_list:
        register_target(NLB_TG_ARN, target_group_list(registration_ip_list))
        logger.info(f"INFO: Registering {registration_ip_list}")
    else:
        logger.info("INFO: No new target registered")

    # Registered IPs that no longer belong to the RDS instance.
    # Only stale IPs are removed, so an IP that is still active stays registered.
    deregistration_ip_list = sorted(registered_ips - active_ips)
    if deregistration_ip_list:
        deregister_target(NLB_TG_ARN, target_group_list(deregistration_ip_list))
    else:
        logger.info("INFO: No old target deregistered")
```
Complete the file infra/plan/variable.tf:

```hcl
variable "rds_port" {
  type    = number
  default = 5432
}
```
Let's deploy our RDS instance:

```shell
cd infra/envs/dev
terraform apply ../../plan/
```
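Before going to the next part, we need to create the metabase database on the Amazon RDS instance. With Terraform 0.14 or later, use `terraform output -raw <name>` in the command below so that the values are not wrapped in quotes.

```shell
PGPASSWORD=$(terraform output rds-password) psql \
  --host $(terraform output public-rds-endpoint) \
  --port 5432 \
  --username $(terraform output rds-username) \
  --dbname postgres
```

Then, at the psql prompt:

```sql
CREATE USER metabase;
GRANT rds_iam TO metabase;
CREATE DATABASE metabase;
GRANT ALL ON DATABASE metabase TO metabase;
```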
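If the generated password contains special characters, passing it through the shell can be fragile. As an alternative sketch, the Python function below reads the text printed by `terraform output -json` and builds a PostgreSQL connection URI with the credentials percent-encoded. The function name `connection_uri` and the sample values are assumptions; the output names are the ones declared earlier in this part.

```python
import json
from urllib.parse import quote


def connection_uri(outputs_json, dbname="postgres", port=5432):
    """Build a postgresql:// URI from the JSON printed by `terraform output -json`.
    Each output in that JSON is an object with a "value" field."""
    outputs = {name: entry["value"] for name, entry in json.loads(outputs_json).items()}
    user = quote(outputs["rds-username"], safe="")
    password = quote(outputs["rds-password"], safe="")  # encode characters such as '@' or '#'
    host = outputs["public-rds-endpoint"]
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


# Hypothetical output values, for illustration only.
EXAMPLE_OUTPUTS = json.dumps({
    "rds-username": {"sensitive": False, "type": "string", "value": "uabc"},
    "rds-password": {"sensitive": True, "type": "string", "value": "pA!b@c#"},
    "public-rds-endpoint": {"sensitive": False, "type": "string", "value": "host.example"},
})

if __name__ == "__main__":
    print(connection_uri(EXAMPLE_OUTPUTS))
```

The resulting URI can be passed to psql as its database argument, for example `psql "postgresql://..."`.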
Let's check that all the resources have been created and are working correctly:
- RDS Instance
- VPC Security Group
- Lambda
- NLB Target Group
Conclusion
Our RDS instance is now available. In the next part, we'll establish a connection between a container deployed in Amazon EKS and a database created in an Amazon RDS instance.
Top comments (1)
Not sure how you use the external NLB to expose the RDS instance only to a specific IP address range? If I understand correctly, when requesting access to the DB from outside the VPC, we pass through the internet gateway that we associated with the public subnets and that lets through any IP ("0.0.0.0/0"). When we land in the public subnets (i.e. the external zone), we face the external load balancer, which listens for TCP on port 5432 and routes it to the RDS instance.
However, it doesn't seem to be routed properly from outside the VPC, and psql times out when reaching the NLB public endpoint. Adding ingress rules on the security group does not solve the issue. Any suggestions about where to look to make that NLB work as expected?