A data extraction and orchestration framework for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications and supports tasks such as document cleanup, optical character recognition (OCR), classification, splitting, named entity recognition, and form processing.
python docker ocr pytorch omr optical-character-recognition optical-mark-recognition icr document-parser document-layout-analysis table-recognition table-detection publaynet intelligent-character-recognition intelligent-word-recognition iwr pubtabnet
Updated Dec 12, 2025 - Python