Dmitry Romanoff
Automating EC2 Instance and Tag Management with AWS SDK and PostgreSQL

Introduction

Managing EC2 instances at scale across multiple AWS regions can become a daunting task. Whether you're keeping track of instance details, monitoring their status, or managing tags, the process is time-consuming. In this article, we'll explore how to automate the collection of EC2 instance information and tags across multiple AWS regions, and store this data in a PostgreSQL database for easy querying and reporting. We'll leverage Python, the boto3 library for interacting with AWS, and psycopg2 for PostgreSQL integration.

Prerequisites

To follow along with this tutorial, you will need:

  • AWS credentials configured on your machine. You can set them up using aws configure.
  • Python 3.x installed on your machine along with the following libraries:
    • boto3: AWS SDK for Python
    • psycopg2: PostgreSQL adapter for Python

You can install the required libraries using:

pip install boto3 psycopg2

If building psycopg2 from source is a problem on your machine, pip install psycopg2-binary is a drop-in alternative.

Additionally, you should have a PostgreSQL database running, either locally or in the cloud.
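Rather than hardcoding credentials in the script, you can read the connection details from environment variables. A minimal sketch, using the standard libpq variable names (PGHOST, PGPORT, and so on) purely as a convention; the fallback values here are illustrative:

```python
import os

# Read PostgreSQL connection details from the environment,
# falling back to local defaults when a variable is unset.
DB_HOST = os.environ.get("PGHOST", "localhost")
DB_PORT = os.environ.get("PGPORT", "5432")
DB_NAME = os.environ.get("PGDATABASE", "ec2_inventory")  # hypothetical database name
DB_USER = os.environ.get("PGUSER", "postgres")
DB_PASSWORD = os.environ.get("PGPASSWORD", "")

print(DB_HOST, DB_PORT)
```

This keeps secrets out of version control and lets the same script run unchanged against different databases.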

Step 1: Setting Up AWS and PostgreSQL Connections

We start by configuring the AWS EC2 client and PostgreSQL connection parameters. The following code sets up the database details as well as the function for fetching EC2 regions.

import boto3
import psycopg2
from psycopg2 import sql

# PostgreSQL connection details
DB_HOST = "..."
DB_PORT = "5432"  # default port for PostgreSQL
DB_NAME = "..."
DB_USER = "..."
DB_PASSWORD = "..."

# Function to get all EC2 regions
def get_all_regions():
    ec2_client = boto3.client('ec2')
    response = ec2_client.describe_regions()
    regions = [region['RegionName'] for region in response['Regions']]
    return regions

Here, get_all_regions() uses the boto3 EC2 client to fetch a list of all available regions. These will be used to gather EC2 data across all regions in the subsequent steps.

Step 2: Fetching EC2 Instances and Their Tags

Once we have the list of regions, we need to gather information about EC2 instances and their tags. For this, we define the get_ec2_instances_and_tags() function:

# Function to get EC2 instances and tags for a given region
def get_ec2_instances_and_tags(region):
    ec2_client = boto3.client('ec2', region_name=region)

    # Describe EC2 instances and tags
    instances = ec2_client.describe_instances()
    tags = ec2_client.describe_tags()

    ec2_data = {}

    # Process EC2 instances
    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            instance_name = None
            instance_type = instance['InstanceType']
            instance_state = instance['State']['Name']
            region = instance['Placement']['AvailabilityZone'][:-1]  # Removing the AZ suffix to get the region

            # Try to get the instance name from its tags
            for tag in instance.get('Tags', []):
                if tag['Key'] == 'Name':
                    instance_name = tag['Value']

            ec2_data[instance_id] = {
                'instance_id': instance_id,
                'instance_name': instance_name,
                'instance_type': instance_type,
                'status': instance_state,
                'region': region,
                'tags': {}
            }

    # Process EC2 tags
    for tag in tags['Tags']:
        instance_id = tag['ResourceId']
        key = tag['Key']
        value = tag['Value']
        if instance_id in ec2_data:
            ec2_data[instance_id]['tags'][key] = value

    return ec2_data

In this function:

  • We fetch the EC2 instances in a specific region and their associated tags.
  • The instance details (like ID, type, and status) are stored in a dictionary.
  • Tags are also gathered and added to the corresponding EC2 instance data.
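The parsing logic can be exercised without touching AWS by feeding it a dictionary shaped like boto3's describe_instances response. A minimal sketch; the instance ID, availability zone, and tag values below are made up:

```python
# A trimmed-down response in the same shape boto3's describe_instances returns.
sample_response = {
    'Reservations': [{
        'Instances': [{
            'InstanceId': 'i-0abc123def4567890',   # hypothetical ID
            'InstanceType': 't3.micro',
            'State': {'Name': 'running'},
            'Placement': {'AvailabilityZone': 'us-east-1a'},
            'Tags': [{'Key': 'Name', 'Value': 'web-server'}],
        }]
    }]
}

parsed = {}
for reservation in sample_response['Reservations']:
    for instance in reservation['Instances']:
        instance_id = instance['InstanceId']
        # Dropping the trailing AZ letter ('us-east-1a' -> 'us-east-1') yields the region.
        region = instance['Placement']['AvailabilityZone'][:-1]
        # The instance's display name lives in the 'Name' tag, if present.
        name = next((t['Value'] for t in instance.get('Tags', []) if t['Key'] == 'Name'), None)
        parsed[instance_id] = {
            'region': region,
            'name': name,
            'type': instance['InstanceType'],
            'status': instance['State']['Name'],
        }

print(parsed['i-0abc123def4567890']['region'])  # → us-east-1
```

This mirrors how the function above extracts the region from the availability zone and the name from the tags.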

Step 3: Inserting EC2 Data into PostgreSQL

Next, we set up the insert_ec2_data_into_db() function that connects to the PostgreSQL database and inserts the EC2 data:

# Function to connect to PostgreSQL and insert data
def insert_ec2_data_into_db(ec2_data):
    # Connect to your PostgreSQL server
    conn = psycopg2.connect(
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASSWORD,
        host=DB_HOST,
        port=DB_PORT
    )
    cur = conn.cursor()

    # Create tables
    cur.execute("""
        CREATE TABLE IF NOT EXISTS ec2_instances (
            instance_id VARCHAR(50) PRIMARY KEY,
            instance_name VARCHAR(100),
            instance_type VARCHAR(50),
            status VARCHAR(50),
            region VARCHAR(50)
        );
    """)
    cur.execute("""
        CREATE TABLE IF NOT EXISTS ec2_tags (
            tag_id SERIAL PRIMARY KEY,
            instance_id VARCHAR(50),
            tag_key VARCHAR(100),
            tag_value VARCHAR(255),
            FOREIGN KEY (instance_id) REFERENCES ec2_instances (instance_id) ON DELETE CASCADE
        );
    """)

    # Clear existing data before inserting a fresh snapshot
    cur.execute("""
        TRUNCATE TABLE ec2_tags;
        TRUNCATE TABLE ec2_instances CASCADE;
    """)

    # Insert EC2 instances and tags
    for instance_id, instance_data in ec2_data.items():
        # Insert instance data
        cur.execute("""
            INSERT INTO ec2_instances (instance_id, instance_name, instance_type, status, region)
            VALUES (%s, %s, %s, %s, %s)
            ON CONFLICT (instance_id) DO NOTHING;
        """, (
            instance_id,
            instance_data['instance_name'],
            instance_data['instance_type'],
            instance_data['status'],
            instance_data['region']
        ))

        # Insert tags
        for tag_key, tag_value in instance_data['tags'].items():
            cur.execute("""
                INSERT INTO ec2_tags (instance_id, tag_key, tag_value)
                VALUES (%s, %s, %s);
            """, (instance_id, tag_key, tag_value))

    # Commit and close the connection
    conn.commit()
    cur.close()
    conn.close()

Here, we:

  • Create tables (ec2_instances and ec2_tags) to store instance details and tags.
  • Use TRUNCATE to clear the existing data before inserting fresh EC2 data.
  • Insert EC2 instance details and their tags into the database.
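The two-table layout can be tried out locally without a PostgreSQL server. Here is the same schema sketched in SQLite, with two caveats: SQLite has no SERIAL type (INTEGER PRIMARY KEY plays that role), and foreign-key enforcement must be switched on per connection:

```python
import sqlite3

# In-memory database standing in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK enforcement by default

conn.execute("""
    CREATE TABLE ec2_instances (
        instance_id   TEXT PRIMARY KEY,
        instance_name TEXT,
        instance_type TEXT,
        status        TEXT,
        region        TEXT
    )
""")
conn.execute("""
    CREATE TABLE ec2_tags (
        tag_id      INTEGER PRIMARY KEY,  -- SQLite's stand-in for SERIAL
        instance_id TEXT,
        tag_key     TEXT,
        tag_value   TEXT,
        FOREIGN KEY (instance_id) REFERENCES ec2_instances (instance_id) ON DELETE CASCADE
    )
""")

# Made-up sample rows to exercise the schema.
conn.execute("INSERT INTO ec2_instances VALUES ('i-1', 'web', 't3.micro', 'running', 'us-east-1')")
conn.execute("INSERT INTO ec2_tags (instance_id, tag_key, tag_value) VALUES ('i-1', 'Env', 'prod')")

# ON DELETE CASCADE removes an instance's tags along with the instance.
conn.execute("DELETE FROM ec2_instances WHERE instance_id = 'i-1'")
remaining = conn.execute("SELECT COUNT(*) FROM ec2_tags").fetchone()[0]
print(remaining)  # → 0
```

The cascade is why deleting (or truncating) ec2_instances is safe: orphaned tag rows cannot linger.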

Step 4: Putting It All Together

Finally, we tie all these components together in the script's main block:

# Main entry point
if __name__ == '__main__':
    regions = get_all_regions()
    ec2_data = {}

    # Iterate through each region and fetch EC2 instances and tags
    for region in regions:
        print(f"Fetching EC2 instances from region: {region}")
        region_data = get_ec2_instances_and_tags(region)
        ec2_data.update(region_data)  # Combine the data from each region

    # Insert the fetched EC2 data into the database
    insert_ec2_data_into_db(ec2_data)
    print("EC2 instances and tags have been inserted into the database.")
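Because instance IDs are unique across regions, combining the per-region results with dict.update never silently overwrites one region's data with another's. A small illustration with made-up IDs:

```python
# Per-region results keyed by instance ID (the IDs here are invented).
us_east = {'i-0aaa': {'region': 'us-east-1', 'status': 'running'}}
eu_west = {'i-0bbb': {'region': 'eu-west-1', 'status': 'stopped'}}

ec2_data = {}
for region_data in (us_east, eu_west):
    ec2_data.update(region_data)  # merge; keys never collide across regions

print(sorted(ec2_data))  # → ['i-0aaa', 'i-0bbb']
```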

Step 5: Running Queries

Once your EC2 data is stored in PostgreSQL, you can run SQL queries to generate reports. Here’s a simple query that joins the EC2 instances and their tags:

SELECT t1.*, t2.tag_key, t2.tag_value
FROM ec2_instances t1
JOIN ec2_tags t2
  ON t1.instance_id = t2.instance_id
ORDER BY t1.instance_id, t2.tag_key, t2.tag_value;

This query will give you a detailed report of all EC2 instances along with their associated tags.
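One caveat with an inner JOIN: instances that have no tags at all drop out of the report entirely. If you want every instance listed regardless, a LEFT JOIN keeps them, with NULL tag columns. A quick demonstration on SQLite, where the join semantics are the same as in PostgreSQL (table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ec2_instances (instance_id TEXT PRIMARY KEY, instance_name TEXT)")
conn.execute("CREATE TABLE ec2_tags (instance_id TEXT, tag_key TEXT, tag_value TEXT)")
conn.execute("INSERT INTO ec2_instances VALUES ('i-1', 'web'), ('i-2', 'untagged-box')")
conn.execute("INSERT INTO ec2_tags VALUES ('i-1', 'Env', 'prod')")

# Inner join: only instances that have at least one tag appear.
inner = conn.execute("""
    SELECT t1.instance_id FROM ec2_instances t1
    JOIN ec2_tags t2 ON t1.instance_id = t2.instance_id
""").fetchall()

# Left join: every instance appears; tag columns are NULL when there are none.
left = conn.execute("""
    SELECT t1.instance_id, t2.tag_key FROM ec2_instances t1
    LEFT JOIN ec2_tags t2 ON t1.instance_id = t2.instance_id
    ORDER BY t1.instance_id
""").fetchall()

print(len(inner), len(left))  # → 1 2
```

Choose whichever matches your reporting need; for a completeness audit ("which instances are missing required tags?"), the LEFT JOIN variant is usually what you want.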

Conclusion

In this article, we demonstrated how to automate the retrieval of EC2 instances and tags across all AWS regions using Python and the boto3 library. We then stored this data in a PostgreSQL database and outlined how to run SQL queries for reporting purposes. This solution can be a game changer for managing EC2 instances at scale, especially when you need to report on or monitor large sets of resources across multiple regions.

By automating this process, you save time and ensure that your EC2 data is always up to date and easily accessible for any analysis or reporting needs.
