AIoT-MLSys-Lab/Efficient-Diffusion-Model-Survey
Efficient Diffusion Models: A Survey

Deadline: September 2024

Abstract

This survey reviews the latest advancements in efficient diffusion models, aiming to provide a comprehensive overview of the methodologies, applications, and future directions in this burgeoning field.

Outline of Survey (20-25 pages)

  • [Abstract]
  • [Introduction]
  • [Main Content]
    • [Algorithm]
      • Efficient Sampling
        • Sampling Scheduling

          • Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
          • Parallel Sampling of Diffusion Models
          • Simple Hierarchical Planning with Diffusion
          • Accelerating Parallel Sampling of Diffusion Models
          • A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models
          • PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models
          • Deep Equilibrium Approaches to Diffusion Models
          • Learning to Efficiently Sample from Diffusion Probabilistic Models
          • On Fast Sampling of Diffusion Probabilistic Model
          • DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
        • Data-Dependent Adaptive Priors

          • PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
          • DiGress: Discrete Denoising diffusion for graph generation
          • DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
          • Leapfrog diffusion model for stochastic trajectory prediction
        • Partial Sampling

          • On distillation of guided diffusion models
          • Snapfusion: Text-to-image diffusion model on mobile devices within two seconds
          • Consistent accelerated inference via confident adaptive transformers
          • Confident adaptive language modeling
          • A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
          • Semi-parametric neural image synthesis
          • kNN-Diffusion: Image Generation via Large-Scale Retrieval
          • Re-Imagen: Retrieval-Augmented Text-to-Image Generator
          • ReDi: efficient learning-free diffusion inference via trajectory retrieval
      • Noise Schedule
        • Strategic Noise Schedules
          • Denoising Diffusion Probabilistic Models
          • Improved Denoising Diffusion Probabilistic Models
          • Improved Noise Schedule for Diffusion Training
          • A Cheaper and Better Diffusion Language Model with Soft-masked Noise
        • Adaptive Noise Schedules
          • Denoising Diffusion Implicit Models
          • ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting
          • Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
          • Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation
      • SDE and ODE Solvers
        • SDE Solver
          • Diffusion Normalizing Flow
          • Gaussian Mixture Solvers for Diffusion Models
          • Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
          • SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models
          • Diffusion Models with Deterministic Normalizing Flow Priors
        • ODE Solver
          • Denoising diffusion implicit models
          • gDDIM: Generalized Denoising Diffusion Implicit Models
          • DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps
          • Fast Sampling of Diffusion Models with Exponential Integrator
          • Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
          • Denoising MCMC for Accelerating Diffusion-Based Generative Models
      • SGM Optimization
      • Latent Diffusion
      • Compression
        • Quantization
          • Post-Training Quantization
            • Post-training quantization on diffusion models
            • Q-diffusion: Quantizing diffusion models
            • Leveraging early-stage robustness in diffusion models for efficient and high-quality image synthesis
            • Ptqd: Accurate post-training quantization for diffusion models
          • Quantization-Aware Training
            • Temporal dynamic quantization for diffusion models
            • Efficientdm: Efficient quantization-aware fine-tuning of low-bit diffusion models
        • Pruning
          • Structural pruning for diffusion models
          • LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights
          • LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
          • Laptop-diff: Layer pruning and normalized distillation for compressing diffusion models
        • Knowledge Distillation
          • Vector Field Distillation
            • Knowledge distillation in iterative generative models for improved sampling speed
            • Progressive distillation for fast sampling of diffusion models
            • On distillation of guided diffusion models
            • Consistency models
            • Flow straight and fast: Learning to generate and transfer data with rectified flow
            • Optimizing DDPM Sampling with Shortcut Fine-Tuning
            • Fast inference in denoising diffusion models via mmd finetuning
          • Generator Distillation
            • Nerf: Representing scenes as neural radiance fields for view synthesis
            • DreamFusion: Text-to-3D using 2D Diffusion
            • Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation
            • Diff-instruct: A universal approach for transferring knowledge from pre-trained diffusion models
            • 3d paintbrush: Local stylization of 3d shapes with cascaded score distillation
    • [System]
      • Optimized Hardware-Software Co-Design
        • Speed is all you need: On-device acceleration of large diffusion models via gpu-aware optimizations
        • SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs
        • A 28.6 mJ/iter Stable Diffusion Processor for Text-to-Image Generation with Patch Similarity-based Sparsity Augmentation and Text-based Mixed-Precision
        • Efficient memory management for large language model serving with pagedattention
        • Flightllm: Efficient large language model inference with a complete mapping flow on fpgas
      • Parallel Computing
      • Caching Technique
    • [Application]
      • Video Generation
  • [Evaluation]
  • [Conclusion]

Algorithm

Efficient Sampling

Sampling Scheduling

Data-Dependent Adaptive Priors

Partial Sampling

Noise Schedule

Strategic Noise Schedules

Adaptive Noise Schedules

SDE and ODE Solvers

SDE Solver

ODE Solver

Model Architecture Optimization

Diffusion Process Optimization

Pre-trained SGM Optimization

Solver-enhanced SGM

Latent Diffusion

Compression

Quantization

Post-Training Quantization

Quantization-Aware Training

Pruning


Knowledge Distillation

Vector Field Distillation

Generator Distillation

References:

Efficient LLM survey

https://arxiv.org/pdf/2312.03863

Efficient Diffusion Model for Vision

https://arxiv.org/pdf/2210.09292

Diffusion Models: A Comprehensive Survey of Methods and Applications

https://arxiv.org/pdf/2209.00796

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

https://arxiv.org/pdf/2404.07771

Paper List

Topic

Efficient Sampling

  • SDE Solvers
  • ODE Solvers
  • Optimized Discretization
  • Truncated Diffusion
  • Knowledge Distillation

Undecided Papers

  1. LanguageFlow: Advancing Diffusion Language Generation with Probabilistic Flows, NAACL 24 [Paper]

    ODE solver -> replaces the ODE solver with Rectified Flow. Dataset: E2E/NLG/ART

  2. Stable Target Field for Reduced Variance Score Estimation in Diffusion Models, ICLR 23 [Paper]

    Training process: uses a stable target field (STF) to reduce variance in score estimation and accelerate SGM training. Dataset: CIFAR-10

  3. Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood, ICLR 24 [Paper]

    Cooperative training. Dataset: CIFAR-10/ImageNet/CelebA

  4. Autodiffusion: Training-free optimization of time steps and architectures for automated diffusion model acceleration, ICCV 23 [paper]

  5. Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture, CVPR 24 [paper]

  6. DreamFusion: Text-to-3D using 2D Diffusion [paper]

  7. Fast Training of Diffusion Models with Masked Transformers, TMLR 24 [paper]

  8. MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer, ICCV 23 [paper]

  9. BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion, ECCV 24 [paper]

Algorithm Level

1-Efficient Sampling

1.1-Sampling Scheduling (And Mixing?)

  1. Align Your Steps: Optimizing Sampling Schedules in Diffusion Models, ICML 24 [Paper]

    Dataset: FFHQ/CIFAR-10/ImageNet/WebVid10M

  2. Parallel Sampling of Diffusion Models, NIPS 23 [Paper]

    Dataset: LSUN/Square/PushT/Franka Kitchen

  3. Simple Hierarchical Planning with Diffusion, ICLR 24 [Paper]

    Dataset: Maze2D/AntMaze/Gym-MuJoCo/FrankaKitchen

  4. Accelerating Parallel Sampling of Diffusion Models, ICML 24 [Paper]

    Dataset: ImageNet

  5. A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models, ICLR 24 [Paper]

    Dataset: CIFAR-10/CelebA/ImageNet-64/LSUN-Bedroom

  6. PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models, arXiv [Paper]

    Dataset: COCO Captions 2014

  7. Accelerating Guided Diffusion Sampling with Splitting Numerical Methods, ICLR 23 [Paper]

    Dataset: LSUN/FFHQ

  8. Diffusion Glancing Transformer for Parallel Sequence-to-Sequence Learning, NAACL 24 [Paper]

    Dataset: QQP/MS-COCO

  9. Deep Equilibrium Approaches to Diffusion Models, NIPS 22 [Paper]

    Dataset: CIFAR-10/CelebA/LSUN

  10. Effective Real Image Editing with Accelerated Iterative Diffusion Inversion, ICCV 23 [Paper]

    Dataset: AFHQ/COCO
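
As a concrete illustration of what a sampling schedule is (a minimal sketch, not the method of any paper listed above), the following compares uniform and quadratic placement of a small number of inference timesteps along a 1000-step training trajectory; quadratic spacing concentrates steps near t=0:

```python
import numpy as np

def uniform_schedule(num_train_steps: int, num_sample_steps: int) -> np.ndarray:
    """Evenly spaced timesteps, returned in descending order."""
    ts = np.linspace(0, num_train_steps - 1, num_sample_steps)
    return np.unique(ts.round().astype(int))[::-1]

def quadratic_schedule(num_train_steps: int, num_sample_steps: int) -> np.ndarray:
    """Quadratic spacing: denser steps near t=0, where small errors matter most."""
    ts = np.linspace(0, np.sqrt(num_train_steps - 1), num_sample_steps) ** 2
    return np.unique(ts.round().astype(int))[::-1]

uni = uniform_schedule(1000, 10)   # e.g. 999, 888, ..., 111, 0
quad = quadratic_schedule(1000, 10)  # large jumps at high noise, small near 0
```

The scheduling papers above search for or optimize such placements (and their mixing with solver choice) rather than fixing them by hand.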

1.2-Learned Posterior Sampling

  1. DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design, ICML 23 [Paper]

    Dataset: CrossDocked2020

  2. Diffusion Posterior Sampling for Linear Inverse Problem Solving: A Filtering Perspective, ICLR 24 [Paper]

    Dataset: FFHQ-1kvalidation/ImageNet-1k-validation

  3. Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process, CVPR 23 [Paper],

    Dataset: ShapeNet

1.3-Partial Sampling

  1. Leapfrog Diffusion Model for Stochastic Trajectory Prediction, CVPR 23 [Paper]

    Dataset: NBA/NFL/SDD/ETH-UCY

  2. SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds, NIPS 23 [Paper]

    Dataset: MS-COCO

  3. ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval, ICML 23 [Paper]

    Dataset: MS-COCO

  4. Data-free Distillation of Diffusion Models with Bootstrapping, ICML 24 [Paper]

    Dataset: FFHQ/LSUN-Bedroom

  5. A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models, ICML 24 [Paper]

    Dataset: ImageNet/CelebA

  6. David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs, NAACL 24 [Paper]

    Dataset: DOLLY

  7. On Distillation of Guided Diffusion Models, CVPR 23 [Paper]

    Dataset: ImageNet/CIFAR-10

  8. Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling, ICLR 24 [Paper]

    Dataset: CIFAR-10/ImageNet

  9. Relay Diffusion: Unifying diffusion process across resolutions for image synthesis, ICLR 24 [Paper]

    Dataset: CelebA-HQ/ImageNet

  10. Semi-Implicit Denoising Diffusion Models (SIDDMs), NIPS 23 [Paper]

    Dataset: CIFAR-10/CelebA-HQ/ImageNet

  11. Directly Fine-Tuning Diffusion Models on Differentiable Rewards, ICLR 24 [Paper]

    Dataset: LAION/HPDv2

  12. InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation, ICLR 24 [Paper]

    Dataset: MS COCO

  13. Fast Sampling of Diffusion Models via Operator Learning, ICML 23 [Paper]

    Dataset: CIFAR-10/ImageNet-64

2-Noise Schedule

2.1-Strategic Noise Schedules

  1. Denoising Diffusion Probabilistic Models, NIPS 20 [Paper]

  2. Improved Denoising Diffusion Probabilistic Models, PMLR 21 [Paper]

  3. Improved Noise Schedule for Diffusion Training, arXiv [Paper]

  4. A Cheaper and Better Diffusion Language Model with Soft-masked Noise, arXiv [Paper]

2.2-Adaptive Noise Schedules

  1. Denoising Diffusion Implicit Models, arXiv [Paper]

  2. ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting, NIPS 24 [Paper]

  3. Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment, arXiv [Paper]

  4. Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation, ACL 24 [Paper]
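
For reference, the two classic strategic schedules from items 1 and 2 can be written in a few lines. This is a minimal sketch of the published formulas (the 0.999 clipping constant follows the Improved DDPM paper):

```python
import numpy as np

def linear_betas(T: int = 1000) -> np.ndarray:
    """Linear beta schedule from DDPM (Ho et al., 2020)."""
    return np.linspace(1e-4, 0.02, T)

def cosine_betas(T: int = 1000, s: float = 0.008) -> np.ndarray:
    """Cosine schedule from Improved DDPM (Nichol & Dhariwal, 2021):
    define alpha-bar via a squared cosine, then recover the betas."""
    t = np.arange(T + 1)
    abar = np.cos((t / T + s) / (1 + s) * np.pi / 2) ** 2
    abar = abar / abar[0]
    betas = 1 - abar[1:] / abar[:-1]
    return np.clip(betas, 0, 0.999)
```

The adaptive schedules in 2.2 replace these fixed formulas with schedules tuned to the data or the task.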

3-Solver

3.1-SDE/ODE Theory

  1. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions, ICLR 23 [Paper]

    Dataset: CIFAR-10/ImageNet 64x64

  2. Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs, ICML 23 [Paper]

    Dataset: CIFAR-10/ImageNet-32

  3. Gaussian Mixture Solvers for Diffusion Models, NIPS 23 [Paper]

    Dataset: CIFAR-10/ImageNet

  4. Denoising MCMC for Accelerating Diffusion-Based Generative Models, ICML 23 [Paper]

    Dataset: CIFAR-10/CelebA-HQ-256/FFHQ-1024

  5. DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps, NIPS 22 [Paper]

    Dataset: CIFAR-10/CelebA/ImageNet/LSUN

  6. Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 21 [Paper]

    Dataset: CIFAR-10/LSUN/CelebA-HQ

  7. Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations, ICML 24 [Paper]

    Dataset: text8/CIFAR-10

  8. Diffusion Normalizing Flow, NIPS 21 [Paper]

    Dataset: CIFAR-10/MNIST

  9. On the Trajectory Regularity of ODE-based Diffusion Sampling, ICML 24 [Paper]

    Dataset: LSUN Bedroom/CIFAR-10/ImageNet-64/FFHQ

4-SGM Optimization

  1. FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation, ICML 23 [Paper]

    Dataset: MNIST/Fashion MNIST/CIFAR-10/ImageNet32

  2. Accelerating Score-Based Generative Models with Preconditioned Diffusion Sampling, ECCV 22 [Paper]

    Dataset: MNIST/CIFAR-10/LSUN/FFHQ

  3. Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models, ICML 23 [Paper]

    Dataset: ImageNet/CIFAR-10/CelebA/FFHQ

  4. Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 22 [Paper]

    Dataset: CIFAR-10/ImageNet

  5. Discrete Predictor-Corrector Diffusion Models for Image Synthesis, ICLR 23 [Paper]

    Dataset: ImageNet/Places2

5-Latent Diffusion Optimization

  1. Fast Timing-Conditioned Latent Audio Diffusion, ICML 24 [Paper]

    Dataset: MusicCaps/AudioCaps

  2. AudioLDM: Text-to-Audio Generation with Latent Diffusion Models, ICML 23 [Paper]

    Dataset: AudioSet/AudioCaps/Freesound/BBC Sound Effect library

  3. Executing Your Commands via Motion Diffusion in Latent Space, CVPR 23 [Paper]

    Dataset: HumanML3D/KIT/AMASS/HumanAct12/UESTC

  4. Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition, ICLR 24 [Paper]

    Dataset: UCF-101/WebVid-10M/MSR-VTT

  5. Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space, ICLR 24 [Paper]

    Dataset: Adult/Default/Shoppers/Magic/Faults/Beijing/News

  6. High-Resolution Image Synthesis With Latent Diffusion Models, CVPR 22 [Paper]

    Dataset: ImageNet/CelebA-HQ/FFHQ/LSUN-Churches/LSUN-Bedrooms

  7. Hyperbolic Geometric Latent Diffusion Model for Graph Generation, ICML 24 [Paper]

    Dataset: SBM/BA/Community/Ego/Barabasi-Albert/Grid/Cora/Citeseer/Polblogs/MUTAG/IMDB-B/PROTEINS/COLLAB

  8. Latent 3D Graph Diffusion, ICLR 24 [Paper]

    Dataset: ChEMBL/PubChemQC/QM9/Drugs

  9. PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code, ICLR 24 [Paper]

    Dataset: PIE-Bench

  10. Cross-view Masked Diffusion Transformers for Person Image Synthesis, ICML 24 [Paper]

    Dataset: DeepFashion/ImageNet

  11. Towards Consistent Video Editing with Text-to-Image Diffusion Models, NIPS 23 [Paper]

    Dataset: DAVIS

  12. Video Probabilistic Diffusion Models in Projected Latent Space, CVPR 23 [Paper]

    Dataset: UCF101/SkyTimelapse

  13. Conditional Image-to-Video Generation With Latent Flow Diffusion Models, CVPR 23 [Paper],

    Dataset: MUG

  14. Diffusion Autoencoders: Toward a Meaningful and Decodable Representation, CVPR 22 [Paper]

    Dataset: FFHQ/CelebA-HQ

  15. Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models, ICML 24 [Paper]

    Dataset: CelebA-HQ/LSUN-Bedrooms

  16. Dimensionality-Varying Diffusion Process, CVPR 23 [Paper]

    Dataset: CIFAR-10/LSUN-Bedroom/LSUN-Church/LSUN-Cat/FFHQ

  17. Vector Quantized Diffusion Model for Text-to-Image Synthesis, CVPR 22 [Paper]

    Dataset: CUB-200/Oxford-102/MSCOCO

6-Compression

6.1-Knowledge Distillation

  1. Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models, ICML 24 [Paper]

    Dataset: MS-COCO/LibriSpeech/ImageNet-64

  2. GENIE: Higher-Order Denoising Diffusion Solvers, NIPS 22 [Paper]

    Dataset: CIFAR-10/LSUN/ImageNet/AFHQv2

  3. Diffusion Probabilistic Model Made Slim, CVPR 23 [Paper]

    Dataset: ImageNet/MS-COCO

6.2-Quantization

  1. Post-Training Quantization on Diffusion Models, CVPR 23 [Paper]

    Dataset: ImageNet/CIFAR-10

  2. Q-Diffusion: Quantizing Diffusion Models, ICCV 23 [Paper]

    Dataset: CIFAR-10/LSUN Bedrooms/LSUN Church-Outdoor

  3. PTQD: Accurate Post-Training Quantization for Diffusion Models, NIPS 23 [Paper]

    Dataset: ImageNet/LSUN

  4. Binary Latent Diffusion, CVPR 23 [Paper]

    Dataset: LSUN Churches/FFHQ/CelebA-HQ/ImageNet-1K

  5. DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-efficient Fine-Tuning, ICCV 23 [Paper]

    Dataset: ImageNet

  6. Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models, ICLR 24 [Paper]

    Dataset: COCO-30K

  7. Leveraging Early-Stage Robustness in Diffusion Models for Efficient and High-Quality Image Synthesis, NIPS 23 [Paper]

    Dataset: LSUN
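
As background for the post-training quantization papers above, here is a minimal uniform affine quantization sketch (illustrative only; it shows the basic scale/zero-point round trip, not any paper's calibration or error-correction procedure):

```python
import numpy as np

def quantize_uint8(w: np.ndarray):
    """Uniform affine post-training quantization of a weight tensor to uint8."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # degenerate case: all weights identical
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to (approximate) float weights."""
    return (q.astype(np.float32) - zero_point) * scale
```

The diffusion-specific difficulty these papers address is that quantization error accumulates across the many denoising timesteps, so per-timestep calibration or correction is often needed on top of this basic scheme.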

6.3-Pruning

  1. Structural Pruning for Diffusion Models, NIPS 23 [Paper]

    Dataset: CIFAR-10/CelebA-HQ/LSUN/ImageNet
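
For context on the structural-pruning entry above, a minimal magnitude-based channel pruning sketch (illustrative; `prune_channels` is a hypothetical helper, not the paper's Taylor-expansion criterion). Structural pruning removes whole channels so the layer actually shrinks, unlike unstructured pruning which only zeroes individual weights:

```python
import numpy as np

def prune_channels(weight: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Drop output channels (axis 0) with the smallest L1 norms."""
    norms = np.abs(weight).sum(axis=tuple(range(1, weight.ndim)))
    n_keep = max(1, int(round(keep_ratio * weight.shape[0])))
    # Keep the strongest channels, preserving their original order.
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])
    return weight[keep]
```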

7-Better Design

7.1-Better Architecture

  1. Infinite Resolution Diffusion with Subsampled Mollified States, ICLR 24 [Paper]

    Dataset: FFHQ/LSUN Church/CelebA-HQ

  2. Fast Ensembling with Diffusion Schrödinger Bridge, ICLR 24 [Paper]

    Dataset: CIFAR-10/CIFAR-100/TinyImageNet

  3. Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation, NAACL 24 [Paper]

    Dataset: QQP/Wiki-Auto/Quasar-T/CCD/IWSLT14/WMT14

7.2-Better Algorithm

  1. Neural Diffusion Processes, ICML 23 [Paper]

    Dataset: MNIST/CELEBA

  2. Score Regularized Policy Optimization through Diffusion Behavior, ICLR 24 [Paper]

    Benchmark: BEAR/TD3+BC/IQL

  3. Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling, ICML 23 [Paper]

    Dataset: Community/Ego/Polblogs/Cora/Road-Minnesota/PPI/QM9

  4. Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems, ICLR 24 [Paper]

    Dataset: fastMRI knee/AAPM 256×256

  5. Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models, ICLR 24 [Paper]

    Dataset: CIFAR-10/LSUN-Conference

System Level

1-Parallel Computing

1.1-Distributed Parallel Inference

  1. DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models, CVPR 24 [paper]

  2. PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models [paper]

  3. DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers [paper]

  4. SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules [paper]

1.2-Distributed Parallel Training

  1. DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines, arXiv [paper]

2-Specialized Hardware Design/Optimized Hardware-Software Co-Design

  1. Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations, CVPR 23 [paper]

  2. SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs, FPL 24 [paper]

  3. A 28.6 mJ/iter Stable Diffusion Processor for Text-to-Image Generation with Patch Similarity-based Sparsity Augmentation and Text-based Mixed-Precision [paper]

3-Caching Technique

  1. Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models, NSDI 24 [paper]

  2. Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching [paper]

  3. DeepCache: Accelerating Diffusion Models for Free, CVPR 24 [paper]
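
The caching papers above exploit the observation that adjacent denoising steps produce very similar intermediate features, so the expensive part of the network need not be recomputed at every step. A minimal DeepCache-style sketch (the `CachedDenoiser` class and its stand-in layers are hypothetical, not any paper's architecture):

```python
class CachedDenoiser:
    """Toy denoiser split into a cheap shallow part and an expensive deep part;
    the deep features are recomputed only every `refresh_every` steps."""

    def __init__(self, refresh_every: int = 5):
        self.refresh_every = refresh_every
        self._deep_cache = None
        self.deep_calls = 0  # counts expensive evaluations

    def _shallow(self, x, t):
        return x * 0.99  # stand-in for the cheap outer layers

    def _deep(self, x, t):
        self.deep_calls += 1
        return x * 0.9   # stand-in for the expensive inner layers

    def __call__(self, x, step: int, t):
        h = self._shallow(x, t)
        # Reuse cached deep features between refreshes (the caching idea).
        if step % self.refresh_every == 0 or self._deep_cache is None:
            self._deep_cache = self._deep(h, t)
        return self._deep_cache
```

Over 20 sampling steps with `refresh_every=5`, the expensive part runs only 4 times; the real systems above decide what to cache (features, layers, or whole prompts) and when to refresh.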

Application Level

  1. DITTO: Diffusion Inference-Time T-Optimization for Music Generation, ICML 24 [Paper]

    Dataset: Wikifonia Lead-Sheet/MusicCaps

    Application Task: Text-to-Music

  2. Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer, arXiv [Paper]

    Dataset: HPDv2/DIV2K/LAION-5B/Datacomp

    Application Task: High-Resolution Image Generation

  3. Inserting Anybody in Diffusion Models via Celeb Basis, NIPS 23 [Paper]

    Dataset: LAION/StyleGAN

    Application Task: Personalized Image Generation

  4. Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models, NIPS 22 [Paper]

    Dataset: LSUN/Cityscapes

    Application Task: Image Editing

  5. DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation, EMNLP 23 [Paper]

    Dataset: VoxPopuli-S2S/Europarl-ST

    Application Task: Audio-to-Audio

  6. DiffIR: Efficient Diffusion Model for Image Restoration, ICCV 23 [Paper]

    Dataset: CelebA-HQ/LSUN Bedrooms/Places-Standard

    Application Task: Image Restoration

  7. Wavelet Diffusion Models Are Fast and Scalable Image Generators, CVPR 23 [Paper]

    Dataset: CIFAR-10/STL-10/CelebA-HQ/LSUN-Church

    Application Task: Image Generation

  8. PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis, ICLR 24 [Paper]

    Dataset: LAION/SAM/JourneyDB

    Application Task: Text-to-Image Generation

  9. Non-autoregressive Conditional Diffusion Models for Time Series Prediction, ICML 23 [Paper]

    Dataset: NorPool/Caiso/Traffic/Electricity/Weather/Exchange/ETTh1/ETTm1/Wind

    Application Task: Time Series Prediction

  10. IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation, ICML 24 [Paper]

    Dataset: Objaverse

    Application Task: 3D-Object Generation

Benchmark

Dataset

Metric Strategy