Add KV Cache for Autoregressive Inference #12600

@DN6

Description

Autoregressive diffusion techniques such as Self Forcing rely on a rolling KV cache across video frame chunks to carry information from past context frames to the frames currently being denoised.

This rolling KV cache design (or variants of it) is likely to show up in other kinds of long video generation and world models, so it would be good to see if we can support it natively in Diffusers.
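
To make the "rolling" part concrete, here is a minimal sketch of the data structure, assuming the cache keeps keys and values for a fixed number of the most recent chunks and evicts the oldest chunk once that budget is exceeded. `RollingKVCache`, `max_chunks`, `append` and `context` are hypothetical names for illustration, not the Diffusers or Self Forcing API, and plain Python lists stand in for the tensors a real implementation would hold.

```python
from collections import deque
from typing import Deque, List, Tuple

# Hypothetical names throughout; this is not the Diffusers or Self Forcing API.
Chunk = List[List[float]]  # one chunk: a list of per-token vectors


class RollingKVCache:
    """Keeps keys/values for at most `max_chunks` most recent frame chunks."""

    def __init__(self, max_chunks: int) -> None:
        if max_chunks < 1:
            raise ValueError("max_chunks must be at least 1")
        # deque(maxlen=...) drops the oldest entry automatically on append.
        self._chunks: Deque[Tuple[Chunk, Chunk]] = deque(maxlen=max_chunks)

    def append(self, keys: Chunk, values: Chunk) -> None:
        """Store the keys/values of a finished chunk."""
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same number of tokens")
        self._chunks.append((keys, values))

    def context(self) -> Tuple[Chunk, Chunk]:
        """Return cached keys and values concatenated along the token axis."""
        keys: Chunk = []
        values: Chunk = []
        for k, v in self._chunks:
            keys.extend(k)
            values.extend(v)
        return keys, values

    def __len__(self) -> int:
        return len(self._chunks)


cache = RollingKVCache(max_chunks=2)
for chunk_id in range(4):
    # Two tokens per chunk; the vectors just encode the chunk id for illustration.
    k = [[float(chunk_id)], [float(chunk_id)]]
    v = [[10.0 * chunk_id], [10.0 * chunk_id]]
    cache.append(k, v)

ctx_keys, ctx_values = cache.context()
print(len(cache), ctx_keys)  # 2 [[2.0], [2.0], [3.0], [3.0]]
```

After four appends with a budget of two chunks, only chunks 2 and 3 remain, so memory stays bounded no matter how long the video gets. A real version would likely keep one such cache per attention layer and may evict at token rather than chunk granularity.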

Tasks

  • Implement the rolling KV cache used in Self Forcing with Diffusers' cache hooks design.
  • Add a Modular Block to Wan Modular Pipelines that uses this rolling KV cache to perform autoregressive inference.
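
As a rough picture of the inference loop the second task describes, the sketch below denoises chunks one after another while each chunk attends to a bounded window of previously finished chunks. It is a toy under stated assumptions: `attend` and `generate` are hypothetical helpers, tokens serve directly as queries, keys and values (no learned projections), "denoising" is just repeated attention averaging, and the finished chunk is cached as-is. Which tensors get cached, and at which timestep, is a design decision of the real method that this sketch does not try to reproduce.

```python
import math
from collections import deque
from typing import Deque, List, Tuple

Vector = List[float]
Chunk = List[Vector]


def attend(queries: Chunk, keys: Chunk, values: Chunk) -> Chunk:
    """Plain softmax attention, written with lists to stay dependency-free."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out: Chunk = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))
        ])
    return out


def generate(
    noisy_chunks: List[Chunk], num_steps: int = 3, max_cached_chunks: int = 2
) -> Tuple[List[Chunk], List[int]]:
    """Denoise chunks in order; each chunk sees only the cached recent past."""
    cache: Deque[Chunk] = deque(maxlen=max_cached_chunks)
    outputs: List[Chunk] = []
    context_sizes: List[int] = []
    for chunk in noisy_chunks:
        # Past context is fixed while the current chunk is being denoised.
        past = [tok for cached in cache for tok in cached]
        context_sizes.append(len(past))
        current = chunk
        for _ in range(num_steps):
            kv = past + current  # attend to cached context plus the chunk itself
            current = attend(current, kv, kv)
        cache.append(current)  # oldest chunk falls out once the window is full
        outputs.append(current)
    return outputs, context_sizes


# Four chunks of two 2-dimensional tokens each.
noisy = [[[float(i), 1.0], [1.0, float(i)]] for i in range(4)]
outputs, context_sizes = generate(noisy)
print(context_sizes)  # [0, 2, 4, 4]
```

The printed context sizes show the window filling up and then staying constant: the first chunk has no past, the second sees one cached chunk, and every later chunk sees exactly two.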
