📦 Out-of-the-box, powerful file parsing service. Supports Text/Pdf/Docx/Pptx/Xlsx/Image/Audio parsing, OCR, and Base64/Local/S3/R2/TG/MinIO storage.

taamsoftadmin/blob-service

📦 Chat Nio Blob Service

File Service for Chat Nio

Deploy to Vercel

Supported File Types

  • Text
  • Image (requires vision models)
  • Audio (requires Azure Speech to Text Service)
  • Docx (.doc not supported)
  • Pdf
  • Pptx (.ppt not supported)
  • Xlsx (.xls supported)

Run

pip install -r requirements.txt
uvicorn main:app --reload

Then the service will be running on http://localhost:8000

Deploy

uvicorn main:app

Using Docker

Image: programzmh/chatnio-blob-service

docker run -p 8000:8000 programzmh/chatnio-blob-service

# with environment variables
# docker run -p 8000:8000 -e AZURE_SPEECH_KEY="..." -e AZURE_SPEECH_REGION="..." programzmh/chatnio-blob-service

# if you are using `local` storage type, you need to mount volume (/static) to the host
# docker run -p 8000:8000 -v /path/to/static:/static programzmh/chatnio-blob-service

API

POST /upload: Upload a file

{ "file": "file" }

Response

{ "status": true, "content": "...", "error": "" }
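A minimal client sketch for the endpoint above. It assumes that the `{ "file": "file" }` body denotes a multipart form field named `file` (the source does not state the encoding) and that the service is reachable at the local address from the Run section. The helper names `build_multipart`, `parse_response`, and `upload` are illustrative and not part of the service.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, data, content_type="application/octet-stream"):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_response(raw):
    """Return the parsed content, or raise if the service reported a failure."""
    payload = json.loads(raw)
    if not payload.get("status"):
        raise RuntimeError(payload.get("error") or "upload failed")
    return payload["content"]


def upload(path, base_url="http://localhost:8000"):
    """POST a local file to /upload (assumed form field name: "file")."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read())
```

With the service running locally, `upload("notes.txt")` would return the `content` string, or raise `RuntimeError` carrying the `error` field when `status` is false.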

Environment Variables

🎨 General Config

  • MAX_FILE_SIZE: Maximum upload size in MiB (Default: No Limit)
    • Tip: The effective limit also depends on the server configuration (e.g. Nginx/Apache settings, or the 5MB body size limit on the Vercel free plan)
  • CORS_ALLOW_ORIGINS: CORS Allow Origins (Default: *)
  • AZURE_SPEECH_KEY: Azure Speech to Text Service Key (Required for Audio Support)
  • AZURE_SPEECH_REGION: Azure Speech to Text Service Region (Required for Audio Support)
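As a rough illustration of how these variables could be interpreted, the sketch below reads them from a mapping and applies the documented defaults. It is not the service's own loader: the function names are hypothetical, and treating CORS_ALLOW_ORIGINS as a comma-separated list is an assumption.

```python
import os


def load_general_config(env=os.environ):
    """Read the general settings, applying the defaults listed above."""
    raw_limit = env.get("MAX_FILE_SIZE", "").strip()
    # MAX_FILE_SIZE is given in MiB; an unset value means no limit.
    max_bytes = int(float(raw_limit) * 1024 * 1024) if raw_limit else None
    # Assumption: multiple origins are separated by commas.
    origins = [o.strip() for o in env.get("CORS_ALLOW_ORIGINS", "*").split(",") if o.strip()]
    # Audio parsing needs both the Azure key and the region.
    audio_enabled = bool(env.get("AZURE_SPEECH_KEY") and env.get("AZURE_SPEECH_REGION"))
    return {"max_bytes": max_bytes, "origins": origins, "audio_enabled": audio_enabled}


def within_limit(size_bytes, max_bytes):
    """True when the upload fits, or when no limit is configured."""
    return max_bytes is None or size_bytes <= max_bytes
```

For example, `load_general_config({"MAX_FILE_SIZE": "5"})` yields a 5 MiB byte limit, the wildcard origin, and audio support disabled.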

πŸ” OCR Config

OCR support is based on the PaddleOCR API; deploy that API to use the OCR feature.

When OCR is enabled, the service automatically extracts text from images and skips the image storage options below.

🖼 Image Storage Config

  1. ✨ No Storage (Default)

    • No Storage Required & No External Dependencies
    • Base64 Encoding/Decoding
    • Support Serverless Deployment Without Storage (e.g. Vercel)
  2. πŸ“ Local Storage

    • Require Server Environment (e.g. VPS, Docker)
    • Support Direct URL Access
    • Low Storage Cost
    • Config:
      • set env STORAGE_TYPE to local (e.g. STORAGE_TYPE=local)
      • set env LOCAL_STORAGE_DOMAIN to your deployment domain (e.g. LOCAL_STORAGE_DOMAIN=http://blob-service.onrender.com)
      • if you are using Docker, mount the volume /static to the host (e.g. -v /path/to/static:/static)
  3. 🚀 AWS S3

    • Paid Storage
    • Support Direct URL Access
    • China Mainland User Friendly
    • Config:
      • set env STORAGE_TYPE to s3 (e.g. STORAGE_TYPE=s3)
      • set env S3_ACCESS_KEY to your AWS Access Key ID
      • set env S3_SECRET_KEY to your AWS Secret Access Key
      • set env S3_BUCKET to your AWS S3 Bucket Name
      • set env S3_REGION to your AWS S3 Region
  4. 🔔 Cloudflare R2

    • Free Storage Quota (10GB Storage & Zero Outbound Cost)
    • Support Direct URL Access
    • Config (S3 Compatible):
      • set env STORAGE_TYPE to s3 (e.g. STORAGE_TYPE=s3)
      • set env S3_ACCESS_KEY to your Cloudflare R2 Access Key ID
      • set env S3_SECRET_KEY to your Cloudflare R2 Secret Access Key
      • set env S3_BUCKET to your Cloudflare R2 Bucket Name
      • set env S3_DOMAIN to your Cloudflare R2 Domain Name (e.g. https://<account-id>.r2.cloudflarestorage.com)
      • set env S3_DIRECT_URL_DOMAIN to your Cloudflare R2 Public URL Access Domain Name (Open Public URL Access, e.g. https://pub-xxx.r2.dev)
  5. 📦 MinIO

    • Self Hosted
    • Reliable & Flexible Storage
    • Config (S3 Compatible):
      • set env STORAGE_TYPE to s3 (e.g. STORAGE_TYPE=s3)
      • set env S3_SIGN_VERSION to s3v4 (e.g. S3_SIGN_VERSION=s3v4)
      • set env S3_ACCESS_KEY to your MinIO Access Key ID
      • set env S3_SECRET_KEY to your MinIO Secret Access Key
      • set env S3_BUCKET to your MinIO Bucket Name
      • set env S3_DOMAIN to your MinIO Domain Name (e.g. https://oss.example.com)
      • [Optional] If you are using a CDN, you can set S3_DIRECT_URL_DOMAIN to your MinIO Public URL Access Domain Name (e.g. https://cdn-hk.example.com)
  6. ❀ Telegram CDN

    • Free Storage (Rate Limit)
    • Support Direct URL Access (China Mainland User Unfriendly)
    • Config:
      • set env STORAGE_TYPE to tg (e.g. STORAGE_TYPE=tg)
      • set env TG_ENDPOINT to your TG-STATE Endpoint (e.g. TG_ENDPOINT=https://tgstate.vercel.app)
      • [Optional] if you are using password authentication, you can set TG_PASSWORD to your TG-STATE Password
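The per-backend requirements listed above can be summarized as a small pre-flight check. This is a sketch and not the service's own validation: the function name `missing_storage_vars` is hypothetical, and the rule that an `s3` setup needs either S3_REGION (AWS) or S3_DOMAIN (R2, MinIO) is inferred from the three S3-compatible entries in the list.

```python
import os

# Variables each STORAGE_TYPE needs, taken from the list above.
REQUIRED = {
    "local": ["LOCAL_STORAGE_DOMAIN"],
    "s3": ["S3_ACCESS_KEY", "S3_SECRET_KEY", "S3_BUCKET"],
    "tg": ["TG_ENDPOINT"],
}


def missing_storage_vars(env=os.environ):
    """Return the names of storage variables that still need to be set."""
    storage_type = env.get("STORAGE_TYPE", "").strip().lower()
    if not storage_type:
        # Unset STORAGE_TYPE: the default Base64 mode needs no configuration.
        return []
    if storage_type not in REQUIRED:
        raise ValueError(f"unknown STORAGE_TYPE: {storage_type!r}")
    missing = [name for name in REQUIRED[storage_type] if not env.get(name)]
    # Assumption: AWS S3 is addressed by region, R2 and MinIO by domain.
    if storage_type == "s3" and not (env.get("S3_REGION") or env.get("S3_DOMAIN")):
        missing.append("S3_REGION or S3_DOMAIN")
    return missing
```

Calling `missing_storage_vars()` before starting the service would report, for instance, `["LOCAL_STORAGE_DOMAIN"]` when only `STORAGE_TYPE=local` is set.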
