 
How to use Boto3 to get the specified version table definition of a database from AWS Glue Data Catalog?
Problem Statement − Use the boto3 library in Python to retrieve a specified version of a table definition from a database in the AWS Glue Data Catalog.
Example − Retrieve version 2 of the definition of the table ‘security’ in the database ‘QA-test’.
Approach/Algorithm to solve this problem
Step 1 − Import boto3 and botocore exceptions to handle exceptions.
Step 2 − database_name, table_name and version_id are the mandatory parameters. The function fetches the definition of the given table for the specified version.
Step 3 − Create an AWS session using boto3 library. Make sure region_name is mentioned in default profile. If it is not mentioned, then explicitly pass the region_name while creating the session.
Step 4 − Create an AWS client for glue.
Step 5 − Now call the get_table_version function, passing database_name as DatabaseName, table_name as TableName and version_id as VersionId. Note that VersionId is a string, so an integer version must be passed in quotes (for example '2', not 2).
Step 6 − It returns the definition of a given table for a specified version.
Step 7 − Handle the generic exception if something goes wrong while retrieving the table version.
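Step 5 says that VersionId must be a string even when the version is numeric. A minimal sketch of a helper that guards against passing an integer by mistake is shown below. The names normalize_version_id and build_get_table_version_kwargs are hypothetical and not part of boto3; only the keyword names DatabaseName, TableName and VersionId come from the steps above.

```python
def normalize_version_id(version_id):
    """Return version_id as a string, the form Step 5 says VersionId needs."""
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(version_id, bool):
        raise TypeError("version_id must be an int or str, not bool")
    if isinstance(version_id, int):
        if version_id < 0:
            raise ValueError("version_id must not be negative")
        return str(version_id)
    if isinstance(version_id, str):
        cleaned = version_id.strip()
        if not cleaned:
            raise ValueError("version_id must not be empty")
        return cleaned
    raise TypeError("version_id must be an int or str")


def build_get_table_version_kwargs(database_name, table_name, version_id):
    """Build the keyword arguments described in Step 5 (hypothetical helper)."""
    return {
        "DatabaseName": database_name,
        "TableName": table_name,
        "VersionId": normalize_version_id(version_id),
    }


# Both the integer 2 and the string '2' end up as the string '2'
print(build_get_table_version_kwargs("QA-test", "security", 2))
```

With a Glue client already created as in Step 4, the result could be passed on as glue_client.get_table_version(**build_get_table_version_kwargs('QA-test', 'security', 2)).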
Example
Use the following code to retrieve the table definition for a specified version −
import boto3
from botocore.exceptions import ClientError

def retrieves_table_version_details(database_name, table_name, version_id):
   # Uses the default profile; pass region_name here if the profile has none
   session = boto3.session.Session()
   glue_client = session.client('glue')
   try:
      # VersionId must be a string, for example '2'
      response = glue_client.get_table_version(DatabaseName = database_name, TableName = table_name, VersionId = version_id)
      return response
   except ClientError as e:
      raise Exception("boto3 client error in retrieves_table_version_details: " + str(e))
   except Exception as e:
      raise Exception("Unexpected error in retrieves_table_version_details: " + str(e))

print(retrieves_table_version_details('QA-test', 'security', '2'))

Output
{'TableVersion': {'Table': {'Name': 'security', 'DatabaseName': 'QAtest', 'Owner': 'owner', 'CreateTime': datetime.datetime(2020, 9, 10, 22, 27, 24, tzinfo=tzlocal()), 'UpdateTime': datetime.datetime(2021, 3, 1, 11, 43, 49, tzinfo=tzlocal()), 'LastAccessTime': datetime.datetime(2020, 9, 10, 22, 27, 24, tzinfo=tzlocal()), 'Retention': 0, 'StorageDescriptor': {'Columns': [{'Name': 'assettypecode', 'Type': 'string'}, {'Name': 'industrysector', 'Type': 'varchar'}, {'Name': 'securitycode', 'Type': 'char'}, {'Name': 'contractsize', 'Type': 'string'}, {'Name': 'conversionperiodenddate', 'Type': 'string'}, {'Name': 'conversionperiodstartdate', 'Type': 'string'}, {'Name': 'expirationdate', 'Type': 'string'}, {'Name': 'issuercountrycode', 'Type': 'string'}, {'Name': 'issuercountrydesc', 'Type': 'string'}, {'Name': 'originalissuedate', 'Type': 'string'}, {'Name': 'securitynamelong', 'Type': 'string'}, {'Name': 'issueshortname', 'Type': 'string'}, {'Name': 'gicssector', 'Type': 'string'}, {'Name': 'maturitydate', 'Type': 'string'}, {'Name': 'optioncode', 'Type': 'string'}, {'Name': 'optiontypename', 'Type': 'string'}, {'Name': 'paramount', 'Type': 'string'}, {'Name': 'priceindex', 'Type': 'string'}, {'Name': 'countrycoderisk', 'Type': 'string'}, {'Name': 'countrydescrisk', 'Type': 'string'}, {'Name': 'countrycode', 'Type': 'string'}], 'Location': 's3://test/security/', 'InputFormat': 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat', 'OutputFormat': 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat', 'Compressed': False, 'NumberOfBuckets': -1, 'SerdeInfo': {'SerializationLibrary': 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe', 'Parameters': {'serialization.format': '1'}}, 'BucketColumns': [], 'SortColumns': [], 'Parameters': {'CrawlerSchemaDeserializerVersion': '1.0', 'CrawlerSchemaSerializerVersion': '1.0', 'UPDATED_BY_CRAWLER': 'security', 'averageRecordSize': '181', 'classification': 'parquet', 'compressionType': 'none', 
'objectCount': '5', 'recordCount': '154800', 'sizeKey': '20337230', 'typeOfData': 'file'}, 'StoredAsSubDirectories': False}, 'PartitionKeys': [], 'TableType': 'EXTERNAL_TABLE', 'Parameters': {'CrawlerSchemaDeserializerVersion': '1.0', 'CrawlerSchemaSerializerVersion': '1.0', 'UPDATED_BY_CRAWLER': 'security', 'averageRecordSize': '181', 'classification': 'parquet', 'compressionType': 'none', 'objectCount': '5', 'recordCount': '154800', 'sizeKey': '20337230', 'typeOfData': 'file'}, 'CreatedBy': 'arn:aws:sts::*********:assumed-role/glue-role/AWS-Crawler'}, 'VersionId': '2'}, 'ResponseMetadata': {'RequestId': '431db171-*******************0', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Mon, 01 Mar 2021 06:15:30 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '3916', 'connection': 'keep-alive', 'x-amzn-requestid': '431db171-*****************0'}, 'RetryAttempts': 0}}
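The response is a deeply nested dictionary, and in practice only part of it is usually needed. The sketch below pulls the column names and types out of the TableVersion → Table → StorageDescriptor → Columns path seen in the output above. The function list_columns is a hypothetical helper, and sample_response is a hand-trimmed copy of the output, not a live call to AWS Glue.

```python
def list_columns(response):
    """Return (name, type) pairs from a get_table_version style response."""
    table = response["TableVersion"]["Table"]
    # StorageDescriptor or Columns may be absent, so fall back to empty values
    columns = table.get("StorageDescriptor", {}).get("Columns", [])
    return [(col["Name"], col.get("Type")) for col in columns]


# Trimmed stand-in for the response printed above (assumption: same key layout)
sample_response = {
    "TableVersion": {
        "Table": {
            "Name": "security",
            "StorageDescriptor": {
                "Columns": [
                    {"Name": "assettypecode", "Type": "string"},
                    {"Name": "industrysector", "Type": "varchar"},
                    {"Name": "securitycode", "Type": "char"},
                ]
            },
        },
        "VersionId": "2",
    }
}

for name, col_type in list_columns(sample_response):
    print(f"{name}: {col_type}")
```

The same function could be applied to the value returned by retrieves_table_version_details, assuming the response keeps the structure shown in the output.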