A Python wrapper for the Dynamsoft Document Normalizer SDK, providing simple, user-friendly APIs on Windows, Linux, and macOS. It runs on desktop PCs and on embedded devices such as Raspberry Pi and Jetson Nano.
Note: This is an unofficial, community-maintained wrapper. For official support and full feature coverage, consider the Dynamsoft Capture Vision Bundle on PyPI.
| Feature | Community Wrapper | Official Dynamsoft SDK |
|---|---|---|
| Support | Community-driven | ✅ Official Dynamsoft support |
| Documentation | Basic README and limited examples | ✅ Comprehensive online documentation |
| API Coverage | Core features only | ✅ Full API coverage |
| Updates | May lag behind | ✅ Always includes the latest features |
| Testing | Tested in limited environments | ✅ Thoroughly tested |
| API Usage | ✅ Simple and intuitive | More complex and verbose |
- Python 3.x
- OpenCV (for UI display)

  ```bash
  pip install opencv-python
  ```

- Dynamsoft Capture Vision Bundle SDK

  ```bash
  pip install dynamsoft-capture-vision-bundle
  ```
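Before running the examples, it can help to confirm that the prerequisites listed above are importable. The sketch below is not part of the wrapper: `missing_modules` is a hypothetical helper, and the import names `cv2` (for opencv-python) and `dynamsoft_capture_vision_bundle` (for the Dynamsoft bundle) are assumptions about how the installed packages are exposed.

```python
import importlib.util

# Assumed top-level import names for the two pip packages listed above.
REQUIRED_MODULES = ("cv2", "dynamsoft_capture_vision_bundle")


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All prerequisites are importable.")
```

The check only looks modules up without importing them, so it is cheap and has no side effects.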
```bash
# Source distribution
python setup.py sdist

# Build wheel
python setup.py bdist_wheel
```

After installation, you can use the built-in command-line interface:
```bash
# Scan document from image file
scandocument -f <file-name> -l <license-key>

# Scan documents from camera (camera index 0)
scandocument -c 1 -l <license-key>
```

Detect documents in an image file or an OpenCV matrix and draw their boundaries:

```python
import docscanner
import cv2
import numpy as np

# Initialize license (required)
docscanner.initLicense("YOUR_LICENSE_KEY")  # Get trial key from Dynamsoft

# Create scanner instance
scanner = docscanner.createInstance()

# Detect from image file
results = scanner.detect("document.jpg")

# OR detect from OpenCV image matrix
image = cv2.imread("document.jpg")
results = scanner.detect(image)

# Process results
for result in results:
    print("Document found:")
    print(f"  Top-left: ({result.x1}, {result.y1})")
    print(f"  Top-right: ({result.x2}, {result.y2})")
    print(f"  Bottom-right: ({result.x3}, {result.y3})")
    print(f"  Bottom-left: ({result.x4}, {result.y4})")

    # Draw detection rectangle
    corners = np.array([(result.x1, result.y1), (result.x2, result.y2),
                        (result.x3, result.y3), (result.x4, result.y4)])
    cv2.drawContours(image, [corners.astype(int)], -1, (0, 255, 0), 2)

cv2.imshow("Detected Documents", image)
cv2.waitKey(0)
```

Correct the perspective of a detected document:

```python
import docscanner
import cv2
from docscanner import *  # provides EnumImageColourMode

# Setup (license + scanner)
docscanner.initLicense("YOUR_LICENSE_KEY")
scanner = docscanner.createInstance()

# Detect documents
results = scanner.detect("skewed_document.jpg")
if results:
    result = results[0]  # Process first detected document

    # Normalize the document (correct perspective); the call returns the image
    normalized_img = scanner.normalize(result, EnumImageColourMode.ICM_COLOUR)

    # Use the returned normalized image directly
    if normalized_img is not None:
        cv2.imshow("Original", cv2.imread("skewed_document.jpg"))
        cv2.imshow("Normalized", normalized_img)
        cv2.waitKey(0)

        # Save normalized image
        cv2.imwrite("normalized_document.jpg", normalized_img)
        print("Normalized document saved!")
```

Detect documents asynchronously from a camera stream:

```python
import docscanner
import cv2


def on_document_detected(results):
    """Callback function for async document detection"""
    for result in results:
        print(f"Document detected at ({result.x1},{result.y1}), ({result.x2},{result.y2}), "
              f"({result.x3},{result.y3}), ({result.x4},{result.y4})")


# Setup
docscanner.initLicense("YOUR_LICENSE_KEY")
scanner = docscanner.createInstance()

# Start async detection
scanner.addAsyncListener(on_document_detected)

# Camera loop
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Queue frame for async processing
    scanner.detectMatAsync(frame)

    # Display frame
    cv2.imshow("Document Scanner", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break

# Cleanup
scanner.clearAsyncListener()
cap.release()
cv2.destroyAllWindows()
```

`docscanner.initLicense(license_key)` initializes the Dynamsoft license. It must be called before any other function.
Parameters:
license_key: Your Dynamsoft license key
Returns:
(error_code, error_message): License initialization result
Example:
```python
error_code, error_msg = docscanner.initLicense("YOUR_LICENSE_KEY")
if error_code != 0:
    print(f"License error: {error_msg}")
```

`docscanner.createInstance()` creates a new DocumentScanner instance.
Returns:
DocumentScanner: Ready-to-use scanner instance
`scanner.detect(input)` detects documents from either supported input source, a file path or an OpenCV image (unified detection method).
Parameters:
input: Input source for document detection:
- str: File path to image (JPEG, PNG, BMP, TIFF, etc.)
- numpy.ndarray: OpenCV image matrix (BGR or grayscale)
Returns:
List[DocumentResult]: List of detected documents with boundary coordinates
Examples:
```python
import cv2

# Detect from file path
results = scanner.detect("document.jpg")

# Detect from OpenCV matrix
image = cv2.imread("document.jpg")
results = scanner.detect(image)

# Process results
for result in results:
    print(f"Found document at ({result.x1},{result.y1}), ({result.x2},{result.y2}), "
          f"({result.x3},{result.y3}), ({result.x4},{result.y4})")
```

`scanner.addAsyncListener(callback)` starts asynchronous document detection and registers a callback for the results.
Parameters:
callback: Function called with detection results
Example:
```python
def on_documents_found(results):
    print(f"Found {len(results)} documents")


scanner.addAsyncListener(on_documents_found)
```

`scanner.detectMatAsync(image)` queues an image for asynchronous processing.
Parameters:
image: OpenCV image to process
`scanner.clearAsyncListener()` stops asynchronous processing and removes the callback.
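When the camera loop also needs to draw the detected boundaries, the callback has to hand its results back to the loop. The following is a minimal sketch of one way to do that, assuming the callback may be invoked on a different thread than the camera loop (this README does not say which thread it runs on). `LatestResults` is a hypothetical helper, not part of the wrapper.

```python
import threading


class LatestResults:
    """Keeps only the most recent detection results, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = []

    def update(self, results):
        # Intended to be registered as the callback:
        #     scanner.addAsyncListener(latest.update)
        with self._lock:
            self._results = list(results)

    def snapshot(self):
        # Called from the camera loop; returns a copy that is safe to iterate.
        with self._lock:
            return list(self._results)


latest = LatestResults()
# In the camera loop, after scanner.detectMatAsync(frame):
#     for result in latest.snapshot():
#         ...draw result.x1, result.y1, ... on the frame...
```

Copying the list on both sides keeps the loop from iterating over a list that the callback is replacing at the same moment.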
`scanner.normalize(document, color)` performs document normalization (perspective correction) on a detected document.
Parameters:
- document: DocumentResult containing boundary coordinates and source image
- color: Color mode for output (ICM_COLOUR, ICM_GRAYSCALE, or ICM_BINARY)
Returns:
numpy.ndarray or None: The normalized document image as numpy array, or None if normalization fails
Usage Patterns:
```python
# Method 1: Use return value directly
normalized_img = scanner.normalize(result, EnumImageColourMode.ICM_COLOUR)
if normalized_img is not None:
    cv2.imshow("Normalized", normalized_img)

# Method 2: Access from document object (also available)
scanner.normalize(result, EnumImageColourMode.ICM_COLOUR)
if result.normalized_image is not None:
    cv2.imwrite("output.jpg", result.normalized_image)
```

DocumentResult is the container for document detection results.
Attributes:
- x1, y1: Top-left corner coordinates
- x2, y2: Top-right corner coordinates
- x3, y3: Bottom-right corner coordinates
- x4, y4: Bottom-left corner coordinates
- source: Original image (file path or numpy array)
- normalized_image: Perspective-corrected image (numpy array)
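Because the four corners are listed in order around the quadrilateral, they can be used to rank detections by size, for example to keep only the largest document in a frame. The sketch below is illustrative only: `Quad` is a hypothetical stand-in that carries just the corner attributes listed above, and `quad_area` and `largest_document` are not part of the wrapper.

```python
from dataclasses import dataclass


@dataclass
class Quad:
    """Hypothetical stand-in exposing the same corner attributes as a detection result."""
    x1: int
    y1: int
    x2: int
    y2: int
    x3: int
    y3: int
    x4: int
    y4: int


def quad_area(result):
    """Area of the corner quadrilateral, computed with the shoelace formula."""
    points = [(result.x1, result.y1), (result.x2, result.y2),
              (result.x3, result.y3), (result.x4, result.y4)]
    twice_area = 0
    for i in range(4):
        xa, ya = points[i]
        xb, yb = points[(i + 1) % 4]
        twice_area += xa * yb - xb * ya
    return abs(twice_area) / 2.0


def largest_document(results):
    """Return the result with the largest area, or None if the list is empty."""
    return max(results, key=quad_area, default=None)
```

Any object with `x1` through `y4` attributes works with these helpers, so the list returned by a detection call can be passed to `largest_document` directly.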
Convert OpenCV matrix to Dynamsoft ImageData format.
Parameters:
mat: OpenCV image (RGB, BGR, or grayscale)
Returns:
ImageData: SDK-compatible image data
Convert Dynamsoft ImageData back to OpenCV-compatible numpy array.
Parameters:
normalized_image: ImageData object from SDK normalization results
Returns:
numpy.ndarray: OpenCV-compatible image matrix
Supported Formats:
- Binary images (1-bit): Converted to 8-bit grayscale
- Grayscale images: Single channel 8-bit
- Color images: 3-channel RGB format
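To illustrate what expanding a 1-bit image to 8-bit grayscale involves, here is a minimal pure-Python sketch. It is not the wrapper's implementation: `unpack_binary_rows` is a hypothetical helper, and the layout (pixels packed most-significant-bit first, each row padded to `stride` bytes, a set bit mapped to white) is an assumption made for the example.

```python
from typing import List


def unpack_binary_rows(data: bytes, width: int, height: int, stride: int) -> List[List[int]]:
    """Expand packed 1-bit pixels into rows of 8-bit values (0 or 255).

    Assumes MSB-first packing and rows padded to `stride` bytes.
    """
    rows = []
    for y in range(height):
        row_bytes = data[y * stride:(y + 1) * stride]
        row = []
        for x in range(width):
            bit = (row_bytes[x // 8] >> (7 - x % 8)) & 1
            row.append(255 if bit else 0)
        rows.append(row)
    return rows


# Two rows of three pixels each, one byte per row.
pixels = unpack_binary_rows(bytes([0b10100000, 0b01000000]), width=3, height=2, stride=1)
print(pixels)  # [[255, 0, 255], [0, 255, 0]]
```

With NumPy, `numpy.unpackbits` performs the same bit expansion on a `uint8` array far more efficiently.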
