
Xiao Ling

Posted on • Originally published at dynamsoft.com

How to Scan Thousands of Document Pages in Browser Without Crashing

Imagine this scenario: an operator at an insurance firm has just spent 20 minutes scanning a 500-page claim file. Suddenly, the browser freezes and crashes because it has run out of memory, and all that work is lost. For developers building web-based document management systems, this is a nightmare. Browsers like Chrome and Edge cap the memory available to each tab, so when you try to hold hundreds of high-resolution images in RAM, the tab is very likely to crash.

Dynamic Web TWAIN solves this problem elegantly with its Local Disk Storage feature, enabling you to build applications that can handle thousands of pages in a single session with zero data loss.

Demo Video: Local Disk Storage for Document Scanning

Prerequisites

The Problem: Browser Memory Limits

Web browsers are not designed to be heavy-duty image processing engines.

  • RAM Constraints: A standard browser tab has a memory ceiling (often around 2GB-4GB).
  • High-Res Images: A single A4 color scan at 300 DPI can be 25MB uncompressed. Scanning 100 pages can easily consume gigabytes of RAM.
  • The Result: As the user scans more pages, the application becomes sluggish and eventually crashes, wiping out all unsaved progress.
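The 25MB figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes 24-bit RGB (3 bytes per pixel) and no compression; the function name is illustrative and not part of any SDK.

```javascript
// Rough estimate of the uncompressed size of a scanned page.
// Assumes 3 bytes per pixel (24-bit RGB) and ignores container overhead.
function uncompressedPageBytes(widthMm, heightMm, dpi, bytesPerPixel = 3) {
  const MM_PER_INCH = 25.4;
  const widthPx = Math.round((widthMm / MM_PER_INCH) * dpi);
  const heightPx = Math.round((heightMm / MM_PER_INCH) * dpi);
  return widthPx * heightPx * bytesPerPixel;
}

// A4 is 210 mm x 297 mm.
const a4Bytes = uncompressedPageBytes(210, 297, 300);
const hundredPagesGiB = (a4Bytes * 100) / 2 ** 30;

console.log((a4Bytes / 2 ** 20).toFixed(1) + ' MiB per page');
console.log(hundredPagesGiB.toFixed(2) + ' GiB for 100 pages');
```

At 300 DPI an A4 page works out to 2480 x 3508 pixels, or roughly 25 MiB of raw pixel data, so 100 pages held uncompressed come to about 2.4 GiB, which is already within the range of the per-tab ceiling mentioned above.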

The Solution: Local Disk Caching

Dynamic Web TWAIN bypasses these browser limitations by intelligently caching scanned data to the local hard drive instead of keeping everything in the browser's volatile memory.

1. Unlimited Scanning Volume

By offloading image data to the local disk, your application is no longer bound by browser RAM limits. Users can scan 1,000, 2,000, or even 5,000 pages in a single batch. The only limit is the user's hard drive space.

2. Crash Recovery (Zero Data Loss)

This is a game-changer for high-volume workflows. Because images are saved to the disk immediately upon scanning:

  • Browser Crash? No problem.
  • Power Outage? No problem.
  • Accidental Tab Close? No problem.

When the user reopens the application, it can use the saved storage UID to find the existing session and restore all previously scanned images from disk. The user can resume exactly where they left off.

3. Enterprise-Grade Security

For industries like banking and healthcare, writing data to disk raises security concerns. Dynamic Web TWAIN addresses this with built-in encryption: the cached data on the local disk is encrypted, which helps keep sensitive documents protected and supports compliance efforts (GDPR, HIPAA) while they are stored temporarily on the client machine.

Step-by-Step Implementation Guide

Let's walk through how to implement Local Disk Storage from scratch.


Step 1: Initialize the SDK

First, include the Dynamic Web TWAIN library and configure the basic settings:

    <script src="https://cdn.jsdelivr.net/npm/dwt@latest/dist/dynamsoft.webtwain.min.js"></script>
    <script>
      Dynamsoft.DWT.ProductKey = "YOUR_LICENSE_KEY";
      Dynamsoft.DWT.ResourcesPath = 'https://cdn.jsdelivr.net/npm/dwt@latest/dist';
      Dynamsoft.DWT.Containers = [{
        ContainerId: 'dwtcontrolContainer',
        Width: '100%',
        Height: '100%'
      }];
      Dynamsoft.DWT.AutoLoad = true;
    </script>

Step 2: Track Storage State

Define variables to track the local storage UID and preference:

    var localStoreUid;
    var storeName = 'DynamicWebTWAIN_LocalStorage';
    var storeName_EnableAutoSaveStorage = 'DynamicWebTWAIN_EnableAutoSaveStorage';

The localStoreUid is a unique identifier for the storage session. We use the browser's localStorage (the standard key-value store) to persist this UID between page reloads.

Step 3: Initialize Storage Preference

On page load, check if the user had Local Storage enabled previously:

    (function() {
      var checked = localStorage[storeName_EnableAutoSaveStorage];
      if (checked === "false") {
        localStoreUid = '';
        delete localStorage[storeName];
      } else {
        // Default to enabled
        localStoreUid = localStorage[storeName];
        localStorage[storeName_EnableAutoSaveStorage] = "true";
      }
    })();

Step 4: Restore Previous Session

When the SDK is ready, check if a previous session exists and restore it:

    var DWTObject;

    Dynamsoft.DWT.RegisterEvent("OnWebTwainReady", function() {
      DWTObject = Dynamsoft.DWT.GetWebTwain('dwtcontrolContainer');
      if (DWTObject && isStorageEnabled()) {
        restoreStorage(); // Restore images from disk
        // Auto-save on any buffer change
        DWTObject.RegisterEvent('OnBufferChanged', Dynamsoft_OnBufferChanged);
      }
    });
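The snippets in this guide call isStorageEnabled(), which is not shown in the article. A minimal sketch follows, assuming it simply reads the preference flag written in Step 3. The optional storage parameter is a hypothetical addition that makes the helper easy to exercise outside a browser; the linked sample project may implement this differently (for example, by reading a checkbox).

```javascript
// Hypothetical helper: not shown in the article, so this is an assumption.
// Same key as the flag defined in Step 2.
var storeName_EnableAutoSaveStorage = 'DynamicWebTWAIN_EnableAutoSaveStorage';

function isStorageEnabled(storage) {
  // Fall back to the browser's localStorage when no store is passed in.
  var store = storage || window.localStorage;
  // Anything other than the explicit string "false" counts as enabled,
  // mirroring the "default to enabled" branch in Step 3.
  return store[storeName_EnableAutoSaveStorage] !== 'false';
}
```

Because localStorage only stores strings, the comparison is against the string "false" rather than the boolean.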

Step 5: Implement Save/Restore Logic

Here's the core logic that handles both saving and restoring:

    async function _saveOrRestoreStorage(bSave) {
      if (!isStorageEnabled()) return;

      try {
        // Check if storage already exists
        var ifExist = false;
        if (localStoreUid) {
          ifExist = await DWTObject.localStorageExist(localStoreUid);
        }

        if (ifExist && localStoreUid) {
          // Update or load from existing storage
          if (bSave) {
            await DWTObject.saveToLocalStorage({ uid: localStoreUid });
          } else {
            await DWTObject.loadFromLocalStorage({ uid: localStoreUid });
          }
        } else {
          // Create new storage
          localStoreUid = await DWTObject.createLocalStorage();
          localStorage[storeName] = localStoreUid; // Persist UID
          await DWTObject.saveToLocalStorage({ uid: localStoreUid });
        }
      } catch (ex) {
        console.error('Storage operation failed:', ex);
      }
    }

    async function saveStorage() {
      return _saveOrRestoreStorage(true);
    }

    async function restoreStorage() {
      return _saveOrRestoreStorage(false);
    }

Key Points:

  • DWTObject.createLocalStorage() creates an encrypted storage area on disk and returns a unique ID.
  • DWTObject.saveToLocalStorage({ uid }) writes the current image buffer to disk.
  • DWTObject.loadFromLocalStorage({ uid }) reads the images back into the buffer.
  • The UID is stored in the browser's localStorage so we can reference the same disk storage after a page reload.
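To reason about the save and restore flow without a scanner or the service installed, it can help to run it against a stand-in. The sketch below is a hypothetical in-memory fake that only mimics the call shapes used in this article (createLocalStorage, localStorageExist, saveToLocalStorage, loadFromLocalStorage, removeLocalStorage). It is not the SDK, and details such as whether a load replaces or appends to the buffer are assumptions.

```javascript
// Hypothetical in-memory stand-in for the storage methods used above.
// It mirrors only the call shapes from this article; it is not the SDK.
function createFakeDWTObject() {
  var stores = new Map(); // uid -> saved copy of the buffer
  var nextId = 1;
  return {
    buffer: [], // stands in for the image buffer
    createLocalStorage: async function () {
      var uid = 'fake-' + nextId++;
      stores.set(uid, []);
      return uid;
    },
    localStorageExist: async function (uid) {
      return stores.has(uid);
    },
    saveToLocalStorage: async function (options) {
      stores.set(options.uid, this.buffer.slice());
    },
    loadFromLocalStorage: async function (options) {
      // Assumption: loading replaces the current buffer.
      this.buffer = stores.get(options.uid).slice();
    },
    removeLocalStorage: async function (options) {
      stores.delete(options.uid);
    }
  };
}

// Simulate: scan two pages, save, lose the in-memory buffer, then restore.
async function demoCrashRecovery() {
  var dwt = createFakeDWTObject();
  var uid = await dwt.createLocalStorage();
  dwt.buffer.push('page-1', 'page-2');
  await dwt.saveToLocalStorage({ uid: uid });
  dwt.buffer = []; // the in-memory pages are gone
  if (await dwt.localStorageExist(uid)) {
    await dwt.loadFromLocalStorage({ uid: uid });
  }
  return dwt.buffer;
}
```

The demo follows the same order as the real flow: the UID is the only thing that has to survive the crash, and everything else is recovered from the store it points to.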

Step 6: Auto-Save on Buffer Changes

Automatically save whenever images are added or removed:

    function Dynamsoft_OnBufferChanged(p1) {
      if (isStorageEnabled() && p1) {
        // Skip internal operations
        if (p1.action === 'shift' || p1.action === 'filter') return;
        saveStorage(); // Persist to disk
      }
    }

The OnBufferChanged event fires whenever the image buffer is modified (scan, delete, etc.). We hook into this to keep the disk storage in sync.
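Because the handler does not wait for saveStorage() to finish, a fast scanner could trigger a new save while an earlier one is still running. Whether the SDK serializes these calls internally is not stated here, so the following is only one possible safeguard: a small, hypothetical wrapper that runs one save at a time and folds any requests that arrive in the meantime into a single follow-up save.

```javascript
// Hypothetical helper: runs saveFn one call at a time and coalesces
// requests that arrive while a save is in progress into one extra run.
function createCoalescedSaver(saveFn) {
  let running = false;
  let pending = false;
  return async function requestSave() {
    if (running) {
      pending = true; // remember that another save is needed
      return;
    }
    running = true;
    try {
      do {
        pending = false;
        await saveFn();
      } while (pending); // run once more if requests came in meanwhile
    } finally {
      running = false;
    }
  };
}

// Possible usage with the functions from this article:
//   var queueSave = createCoalescedSaver(saveStorage);
//   ...then call queueSave() instead of saveStorage() in the handler.
```

With this wrapper, a burst of buffer changes results in at most one save in flight plus one queued, and the final save always reflects the latest buffer state.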

Step 7: Cleanup on Disable

When the user disables Local Storage, clean up the disk storage:

    function removeStorage() {
      DWTObject.UnregisterEvent('OnBufferChanged', Dynamsoft_OnBufferChanged);

      if (localStoreUid) {
        var _localStoreUid = localStoreUid;
        localStoreUid = '';
        delete localStorage[storeName];

        DWTObject.localStorageExist(_localStoreUid).then(function(ifExist) {
          if (ifExist) {
            DWTObject.removeLocalStorage({ uid: _localStoreUid });
          }
        });
      }
    }

Where are the Files Stored?

The encrypted cache files are stored in the Dynamic Web TWAIN service directory on the user's machine:

Windows:

C:\Program Files (x86)\Dynamsoft\Dynamic Web TWAIN Service {versionnumber}\storage 

macOS:

/Users/{username}/Applications/Dynamsoft/Dynamic Web TWAIN Service {versionnumber}/storage 

Linux:

/opt/dynamsoft/Dynamic Web TWAIN Service {versionnumber}/storage 

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/local_storage
