
Task

You want to get, decode, and show elements, such as images and tables, that are embedded in a PDF document.

Approach

Extract the Base64-encoded representation of specific elements, such as images and tables, in the document. Then, for each extracted element, decode that Base64-encoded representation back into the element's original visual representation and show it.
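The decode step can be sketched on its own, without calling the API. This is a minimal sketch, assuming each element is a dictionary whose "metadata" entry may contain an "image_base64" string, as the code later on this page expects. The helper name decode_element_images and the sample elements are hypothetical and exist only for illustration.

```python
import base64


def decode_element_images(elements):
    """Return (element type, decoded bytes) for each element that carries image_base64."""
    decoded = []
    for element in elements:
        encoded = element.get("metadata", {}).get("image_base64")
        if encoded is None:
            continue  # This element has no embedded visual representation.
        decoded.append((element.get("type"), base64.b64decode(encoded)))
    return decoded


# Hypothetical sample input: one element with a payload, one without.
sample_elements = [
    {
        "type": "Image",
        "metadata": {
            "image_base64": base64.b64encode(b"fake image bytes").decode("ascii")
        },
    },
    {"type": "NarrativeText", "metadata": {}},
]

images = decode_element_images(sample_elements)
```

The decoded bytes can then be handed to an image library for display, which is what the full example does with Pillow.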

To run this example

You will need a document that is one of the document types supported by the extract_image_block_types argument. See the extract_image_block_types entry in API Parameters. This example uses a PDF file with embedded images and tables.

Code

For the Unstructured Python SDK, you’ll need these environment variables:
  • UNSTRUCTURED_API_KEY - Your Unstructured API key value.
  • UNSTRUCTURED_API_URL - Your Unstructured API URL.
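Before running the example, it may help to confirm that both variables are set. The following is a small pre-flight sketch, not part of the SDK. The helper name missing_env_vars and its environ parameter are hypothetical.

```python
import os

REQUIRED_VARS = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names in `names` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```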
Python SDK
from unstructured_client import UnstructuredClient
from unstructured_client.models import operations, shared
from unstructured.staging.base import elements_from_dicts, elements_to_json

import os
import base64
from PIL import Image
import io

if __name__ == "__main__":
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    # Source: https://github.com/Unstructured-IO/unstructured/blob/main/example-docs/embedded-images-tables.pdf

    # Where to get the input file and store the processed data, relative to this .py file.
    local_input_filepath = "local-ingest-input-pdf/embedded-images-tables.pdf"
    local_output_filepath = "local-ingest-output/embedded-images-tables.json"

    with open(local_input_filepath, "rb") as f:
        files = shared.Files(
            content=f.read(),
            file_name=local_input_filepath
        )

    request = operations.PartitionRequest(
        # Passed by keyword so that the request also builds on SDK versions
        # whose models do not accept positional arguments.
        partition_parameters=shared.PartitionParameters(
            files=files,
            split_pdf_page=True,
            split_pdf_allow_failed=True,
            split_pdf_concurrency_level=15,
            # Extract the Base64-encoded representation of each
            # processed "Image" and "Table" element. Extract each into
            # an "image_base64" object, as a child of the
            # "metadata" object, for that element in the result.
            # Element type names, such as "Image" and "Table" here,
            # are case-insensitive.
            # Any available Unstructured element type is allowed.
            extract_image_block_types=["Image", "Table"]
        )
    )

    try:
        result = client.general.partition(
            request=request
        )

        for element in result.elements:
            if "image_base64" in element["metadata"]:
                # Decode the Base64-encoded representation of the
                # processed "Image" or "Table" element into its original
                # visual representation, and then show it.
                image_data = base64.b64decode(element["metadata"]["image_base64"])
                image = Image.open(io.BytesIO(image_data))
                image.show()

        # Optionally, prepare to print or save the elements as JSON.
        dict_elements = elements_from_dicts(
            element_dicts=result.elements
        )

        # Print the elements as JSON...
        json_elements = elements_to_json(
            elements=dict_elements,
            indent=2
        )

        print(json_elements)

        # ...or save as JSON.
        elements_to_json(
            elements=dict_elements,
            indent=2,
            filename=local_output_filepath
        )
    except Exception as e:
        print(e)
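Calling image.show() opens one viewer window per element, which may be impractical for large documents or on machines without a display. As an alternative, the following sketch writes each decoded payload to disk instead. The helper name save_element_images and the file-naming scheme are hypothetical. The sketch also assumes that an "image_mime_type" entry may accompany "image_base64" in the metadata, and it falls back to a .bin extension when that entry is absent or unrecognized.

```python
import base64
import mimetypes
from pathlib import Path


def save_element_images(elements, output_dir):
    """Write each element's decoded image_base64 payload to output_dir; return the paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, element in enumerate(elements):
        metadata = element.get("metadata", {})
        encoded = metadata.get("image_base64")
        if encoded is None:
            continue  # Nothing to save for this element.
        # Assumption: "image_mime_type" may be present alongside "image_base64".
        mime_type = metadata.get("image_mime_type", "")
        extension = mimetypes.guess_extension(mime_type) or ".bin"
        # Hypothetical naming scheme: element position, then lowercase element type.
        name = f"{index:04d}-{element.get('type', 'element').lower()}{extension}"
        path = out / name
        path.write_bytes(base64.b64decode(encoded))
        saved.append(path)
    return saved
```

In the example above, this could be called as save_element_images(result.elements, "local-ingest-output/images") in place of the loop that calls image.show(); that output directory name is likewise only an illustration.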

