This is a Python client for the Unstructured API.
pip install unstructured-client
Only the files parameter is required. See the general partition page for all available parameters.
from unstructured_client import UnstructuredClient
from unstructured_client.models import shared
from unstructured_client.models.errors import SDKError

s = UnstructuredClient(api_key_auth="YOUR_API_KEY")

filename = "sample-docs/layout-parser-paper.pdf"

# Read the file inside a context manager so the handle is always closed
with open(filename, "rb") as f:
    content = f.read()

req = shared.PartitionParameters(
    # Note that this currently only supports a single file
    files=shared.PartitionParametersFiles(
        content=content,
        files=filename,
    ),
    # Other partition params
    strategy="fast",
)

try:
    res = s.general.partition(req)
    print(res.elements[0])
except SDKError as e:
    print(e)

# {
#   'type': 'Title',
#   'element_id': '015301d4f56aa4b20ec10ac889d2343f',
#   'metadata': {'filename': 'layout-parser-paper.pdf', 'filetype': 'application/pdf', 'page_number': 1},
#   'text': 'LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis'
# }
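The sample output above shows each element as a dictionary with `type`, `element_id`, `metadata`, and `text` keys. As a minimal sketch of post-processing, assuming `res.elements` is a list of such dictionaries, the text can be grouped by element type. The helper name `texts_by_type` and the sample data are hypothetical and not part of the SDK.

```python
from collections import defaultdict


def texts_by_type(elements):
    """Group element text by element type, preserving document order."""
    grouped = defaultdict(list)
    for element in elements:
        # .get() tolerates elements that are missing a key
        grouped[element.get("type", "Unknown")].append(element.get("text", ""))
    return dict(grouped)


# Hypothetical sample shaped like the output shown above
sample_elements = [
    {
        "type": "Title",
        "text": "LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis",
        "metadata": {"page_number": 1},
    },
    {
        "type": "NarrativeText",
        "text": "Example paragraph.",
        "metadata": {"page_number": 1},
    },
]

grouped = texts_by_type(sample_elements)
print(grouped["Title"][0])
```

With a real response, the same helper would be called as `texts_by_type(res.elements)`, provided the elements have the dictionary shape assumed here.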
If you are self-hosting the API or developing locally, you can change the server URL when setting up the client.
from unstructured_client import UnstructuredClient

api_key = "YOUR_API_KEY"

# Using a local server
s = UnstructuredClient(
    server_url="http://localhost:8000",
    api_key_auth=api_key,
)

# Using your own server
s = UnstructuredClient(
    server_url="https://your-server",
    api_key_auth=api_key,
)
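One way to switch between the hosted API and a local or self-hosted server without editing code is to read the settings from the environment. The sketch below only builds the keyword arguments and does not import the SDK. The environment variable names `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL` and the helper `client_kwargs` are assumptions chosen for illustration, and it assumes the client accepts `api_key_auth` and `server_url` as keyword arguments.

```python
import os


def client_kwargs(env=None):
    """Build keyword arguments for the client from environment settings.

    The variable names are hypothetical; adapt them to your deployment.
    """
    env = os.environ if env is None else env
    kwargs = {"api_key_auth": env.get("UNSTRUCTURED_API_KEY", "")}
    server_url = env.get("UNSTRUCTURED_API_URL")
    if server_url:
        # Only override the default server when a URL is configured
        kwargs["server_url"] = server_url
    return kwargs


# Example with an explicit mapping instead of the real environment
local = client_kwargs({
    "UNSTRUCTURED_API_KEY": "YOUR_API_KEY",
    "UNSTRUCTURED_API_URL": "http://localhost:8000",
})
print(local)

# Usage with the SDK (not executed here):
# s = UnstructuredClient(**client_kwargs())
```

Leaving `server_url` out when no URL is configured lets the client fall back to its default server.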
This SDK is in beta, and there may be breaking changes between versions without a major version update. We therefore recommend pinning to a specific package version, so that every install resolves to the same version and you pick up changes only when you upgrade deliberately.
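Besides pinning the version in your requirements, an application can check at startup that the installed version matches the one it was tested against. This is a minimal sketch using the standard library's `importlib.metadata`. The helper names and the `PINNED_VERSION` placeholder are hypothetical; substitute the version you actually pin.

```python
from importlib import metadata

# Hypothetical placeholder: replace with the version you have pinned
PINNED_VERSION = "X.Y.Z"


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned, lookup=installed_version):
    """Report whether the installed version equals the pinned one."""
    return lookup(package) == pinned


if not matches_pin("unstructured-client", PINNED_VERSION):
    print("unstructured-client does not match the pinned version")
```

The `lookup` parameter exists only so the comparison can be exercised without depending on what is installed.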
While we value open-source contributions to this SDK, this library is generated programmatically. Feel free to open a PR or a GitHub issue as a proof of concept, and we'll do our best to include it in a future release!