
Official code for the CVPR 2025 paper "SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models."


SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

CVPR 2025

Previously StreamMultiDiffusion: Real-Time Interactive Generation
with Region-Based Semantic Control

*Draw multiple prompt masks on a large canvas for real-time creation.*

Jaerin Lee · Daniel Sungho Jung · Kanggeon Lee · Kyoung Mu Lee


SemanticDraw is a real-time interactive text-to-image generation framework that allows you to draw with meanings 🧠 using semantic brushes πŸ–ŒοΈ.


πŸš€ Quick Start

```bash
# Install
conda create -n semdraw python=3.12 && conda activate semdraw
git clone https://github.com/ironjr/semantic-draw
cd semantic-draw
pip install -r requirements.txt

# Run the streaming demo
cd demo/stream
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000
# Open http://localhost:8000 in your browser
```

For SD3 support, additionally run:

```bash
pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3
```

Note: this dependency is already included in requirements.txt by default.




⭐ Features

| Interactive Drawing | Prompt Separation | Real-time Editing |
| --- | --- | --- |
| Paint with semantic brushes | No unwanted content mixing | Edit photos in real-time |

πŸ”§ Installation

Basic Installation

```bash
conda create -n smd python=3.12 && conda activate smd
git clone https://github.com/ironjr/StreamMultiDiffusion
cd StreamMultiDiffusion
pip install -r requirements.txt
```

Stable Diffusion 3 Support

```bash
pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3
```

🎨 Demo Applications

We provide several demo applications with different features and model support:

1. StreamMultiDiffusion (Main Demo)

Real-time streaming interface with semantic drawing capabilities.

```bash
cd demo/stream
python app.py --model "your-model" --height 512 --width 512 --port 8000
```

Options

| Option | Description | Default |
| --- | --- | --- |
| `--model` | Path to an SD1.5 checkpoint (HF repo ID or local `.safetensors`) | None |
| `--height` | Canvas height | 768 |
| `--width` | Canvas width | 1920 |
| `--bootstrap_steps` | Semantic region separation steps (1-3 recommended) | 1 |
| `--seed` | Random seed | 2024 |
| `--device` | GPU device number | 0 |
| `--port` | Web server port | 8000 |

2. Semantic Palette

Simplified interface for different SD versions:

SD 1.5 Version

```bash
cd demo/semantic_palette
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000
```

SDXL Version

```bash
cd demo/semantic_palette_sdxl
python app.py --model "your-sdxl-model" --port 8000
```

SD3 Version

```bash
cd demo/semantic_palette_sd3
python app.py --port 8000
```

Using Custom Models (.safetensors)

1. Place your `.safetensors` file in the demo's `checkpoints` folder
2. Run with `python app.py --model "your-model.safetensors"`

πŸ’» Usage Examples

Python API

Basic Generation
```python
import torch
from model import StableMultiDiffusionPipeline

# Initialize
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device, hf_key='runwayml/stable-diffusion-v1-5')

# Generate
image = smd.sample('A photo of the dolomites')
image.save('output.png')
```
Region-Based Generation
```python
import torch
from model import StableMultiDiffusionPipeline
from util import seed_everything

# Setup
seed_everything(2024)
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device)

# Define prompts and masks
prompts = ['background: city', 'foreground: a cat', 'foreground: a dog']
masks = load_masks()  # Your mask loading logic

# Generate
image = smd(prompts, masks=masks, height=768, width=768)
image.save('output.png')
```
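The `load_masks()` call above is left to the user. As a minimal sketch, assuming the pipeline accepts one binary `(H, W)` mask per prompt, masks can be built directly with NumPy; `make_demo_masks` is a hypothetical helper, not part of the package:

```python
import numpy as np

# Hypothetical stand-in for load_masks(); not part of the package.
# Builds one binary (H, W) mask per prompt: a full-canvas background
# plus two non-overlapping foreground quadrants.
def make_demo_masks(height=768, width=768):
    bg = np.ones((height, width), dtype=np.uint8)    # background covers the canvas
    cat = np.zeros((height, width), dtype=np.uint8)
    cat[height // 2:, :width // 2] = 1               # bottom-left quadrant
    dog = np.zeros((height, width), dtype=np.uint8)
    dog[height // 2:, width // 2:] = 1               # bottom-right quadrant
    return [bg, cat, dog]

masks = make_demo_masks()
```

Any other mask source works as well (e.g. thresholded PNG layers exported from an image editor), as long as each mask matches the requested canvas size.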
Streaming Generation
```python
from model import StreamMultiDiffusion

# Initialize streaming pipeline
smd = StreamMultiDiffusion(device, height=512, width=512)

# Register layers
smd.update_single_layer(idx=0, prompt='background', mask=bg_mask)
smd.update_single_layer(idx=1, prompt='object', mask=obj_mask)

# Stream generation
while True:
    image = smd()
    display(image)
```
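Conceptually, region-based generation follows the MultiDiffusion principle of averaging per-region denoising outputs at every step, weighted by their normalized masks. The toy NumPy sketch below illustrates only that aggregation step; the names and shapes are illustrative, not the actual implementation:

```python
import numpy as np

# Toy illustration of mask-weighted aggregation (not the actual implementation).
# per_region: one denoised output per prompt, shape (N, H, W);
# masks: matching binary or soft weights, shape (N, H, W).
def aggregate(per_region, masks, eps=1e-8):
    # Normalize mask weights per pixel so overlapping regions blend smoothly.
    weights = masks / (masks.sum(axis=0, keepdims=True) + eps)
    return (per_region * weights).sum(axis=0)

latents = np.stack([np.zeros((4, 4)), np.ones((4, 4))])  # two region outputs
masks = np.stack([np.ones((4, 4)), np.ones((4, 4))])     # fully overlapping masks
merged = aggregate(latents, masks)                       # pixels average to 0.5
```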

Jupyter Notebooks

Explore our notebooks directory for interactive examples:

  • Basic usage tutorial
  • Advanced region control
  • SD3 examples
  • Custom model integration

πŸ“– Documentation


Paper

For technical details, see our paper and project page.


πŸ™‹ FAQ

What is Semantic Palette?

Semantic Palette lets you paint with text prompts instead of colors. Each brush carries a meaning (prompt) that generates appropriate content in real-time.

Which models are supported?
  • βœ… Stable Diffusion 1.5 and variants
  • βœ… SDXL and variants (with Lightning LoRA)
  • βœ… Stable Diffusion 3
  • βœ… Custom .safetensors checkpoints
Hardware requirements?
  • Minimum: GPU with 8 GB VRAM (for 512×512)
  • Recommended: GPU with 11 GB VRAM for larger resolutions (tested on a GTX 1080 Ti)

🚩 Recent Updates

  • πŸ”₯ June 2025: Presented at CVPR 2025
  • βœ… June 2024: SD3 support with Flash Diffusion
  • βœ… April 2024: StreamMultiDiffusion v2 with responsive UI
  • βœ… March 2024: SDXL support with Lightning LoRA
  • βœ… March 2024: First version released

See README_old.md for full history.


🌏 Citation

```bibtex
@inproceedings{lee2025semanticdraw,
    title={{SemanticDraw}: Towards Real-Time Interactive Content Creation from Image Diffusion Models},
    author={Lee, Jaerin and Jung, Daniel Sungho and Lee, Kanggeon and Lee, Kyoung Mu},
    booktitle={CVPR},
    year={2025}
}
```

πŸ€— Acknowledgements

Built upon StreamDiffusion, MultiDiffusion, and LCM. Special thanks to the Hugging Face team and the model contributors.


πŸ“§ Contact

Please email jarin.lee@gmail.com or open an issue.
