I'm passing a moderately large context (~2k tokens) to WebLLM running in a service worker in Chrome (v142).
On both NVIDIA and MLX, generation can block the calling page's GPU rendering (e.g., a hardware-accelerated canvas) for several seconds, although 2D compositing such as scrolling keeps working. I presume this is some kind of underlying WebGPU bug (and one Chromium should ultimately fix), but I also wonder whether the WebLLM code could be structured in a way that avoids it?