How to capture and decode an MJPEG live stream "directly" on an RTX 3080 Ti GPU instead of the CPU?

Hello,

So here is my PC:
Windows 10.
GPU: RTX 3080 TI
CUDA 12.6
cuDNN 9.6
NVIDIA Video Codec SDK 12.2.72
Programming language: C++17, MSVC (Visual Studio 17 2022)

Before I ask my questions: please don't give the easy answer of "just use OpenCV." My image-processing requirements are simple and can be handled with proper coding; latency also matters here.

Anyhow, I have an MJPEG live stream from a webcam.
It works perfectly when I test it with ffplay/ffmpeg from the CLI.

Here is the live input MJPEG webcam stream details:
Stream #0:0: Video: mjpeg (Baseline) (MJPG / 0x47504A4D), yuvj422p(pc, bt470bg/bt709/unknown)

My question is: how can I decode this stream directly on the GPU rather than on the CPU? The reason is obvious: reduce latency and avoid unnecessary CPU utilization.

I assume the CPU would still need to capture the compressed frames and pass them to the GPU? Is that possible?
Then the GPU decodes them and passes the result back to the CPU to store to a file? Is that possible too?
I know the GPU can also do DMA, but that seems more involved, so I'll venture into it later once I've figured out the fundamentals.

I want to offload the CPU as much as possible. I'll have about a dozen cameras all running in parallel and need real-time decoding.

I also checked the NVIDIA Video Codec SDK 12.2.72 sample (NvDecoder.cpp and its header), and neither says anything useful about MJPEG; the MJPEG entries are simply listed with empty parameter values, which makes no sense to me.
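For reference, cuviddec.h does at least list a cudaVideoCodec_JPEG enum, so the NVDEC path can in principle be queried before going further. Below is a rough, untested sketch of such a capability check, assuming the Video Codec SDK headers and linking against cuda.lib / nvcuvid.lib; the field names are taken from CUVIDDECODECAPS as I understand them:

```cpp
// Query whether this GPU's NVDEC engine can decode (M)JPEG at all.
// Assumes CUDA driver API + Video Codec SDK headers (cuda.h, nvcuvid.h).
#include <cuda.h>
#include <nvcuvid.h>
#include <cstdio>

int main() {
    cuInit(0);
    CUdevice dev = 0;
    CUcontext ctx = nullptr;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);          // cuvidGetDecoderCaps needs a current context

    CUVIDDECODECAPS caps{};
    caps.eCodecType      = cudaVideoCodec_JPEG;        // MJPEG is per-frame JPEG
    caps.eChromaFormat   = cudaVideoChromaFormat_422;  // matches the yuvj422p input
    caps.nBitDepthMinus8 = 0;

    if (cuvidGetDecoderCaps(&caps) == CUDA_SUCCESS) {
        std::printf("JPEG 4:2:2 supported: %d, max %ux%u\n",
                    caps.bIsSupported, caps.nMaxWidth, caps.nMaxHeight);
    }

    cuCtxDestroy(ctx);
    return 0;
}
```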

Any advice would be much appreciated. Thanks.

Edit:

Alright, after doing some thorough research on this rarely covered and complex topic of 100% GPU JPEG decoding, I've answered my own question.

It seems nvJPEG decodes MJPEG in a hybrid fashion (CPU + GPU), not entirely on the GPU. But why?

After more research: the reason is the Huffman entropy coding used in JPEG. It is inherently serial (you can't know where one symbol ends until you've decoded the previous one), so it doesn't map well to GPU computation, and that stage is handled on the CPU because it's computationally more efficient there.
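In practice the hybrid path still keeps nearly all of the heavy work (IDCT, chroma upsampling, color conversion) on the GPU; the CPU only touches the entropy-coded bits. Here is roughly what a per-frame decode looks like with nvJPEG's GPU_HYBRID backend, as a minimal sketch with error handling stripped; the output format choice and buffer handling are my own assumptions:

```cpp
// Minimal nvJPEG decode of a single MJPEG frame (one JPEG bitstream).
// Assumes the CUDA Toolkit with nvjpeg.h, linked against nvjpeg and cudart.
#include <nvjpeg.h>
#include <cuda_runtime.h>
#include <vector>

bool decodeFrame(const unsigned char* jpegData, size_t jpegSize,
                 std::vector<unsigned char>& hostRgb, int& width, int& height)
{
    nvjpegHandle_t handle;
    nvjpegJpegState_t state;
    // GPU_HYBRID: Huffman entropy decode on the CPU, IDCT + upsample + color on the GPU.
    nvjpegCreateEx(NVJPEG_BACKEND_GPU_HYBRID, nullptr, nullptr, 0, &handle);
    nvjpegJpegStateCreate(handle, &state);

    // Parse the header on the host to learn the geometry (no pixel decode yet).
    int nComp = 0, widths[NVJPEG_MAX_COMPONENT], heights[NVJPEG_MAX_COMPONENT];
    nvjpegChromaSubsampling_t subsampling;
    nvjpegGetImageInfo(handle, jpegData, jpegSize, &nComp, &subsampling, widths, heights);
    width = widths[0];
    height = heights[0];

    // One interleaved RGB surface in device memory.
    nvjpegImage_t out{};
    out.pitch[0] = static_cast<size_t>(width) * 3;
    cudaMalloc(reinterpret_cast<void**>(&out.channel[0]), out.pitch[0] * height);

    // Decode: the bitstream goes host -> device internally, pixels land in device memory.
    nvjpegStatus_t st = nvjpegDecode(handle, state, jpegData, jpegSize,
                                     NVJPEG_OUTPUT_RGBI, &out, /*stream=*/0);

    // Copy back only because I want to write the frame to a file; skip this
    // if the frame stays on the GPU for further processing.
    hostRgb.resize(out.pitch[0] * height);
    cudaMemcpy(hostRgb.data(), out.channel[0], hostRgb.size(), cudaMemcpyDeviceToHost);

    cudaFree(out.channel[0]);
    nvjpegJpegStateDestroy(state);
    nvjpegDestroy(handle);
    return st == NVJPEG_STATUS_SUCCESS;
}
```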

However, there does seem to be a paper showing that Huffman decoding can be done in CUDA:

Anyhow, I have no choice but to use what's currently available, namely the nvJPEG hybrid path. Someday I'll learn the Huffman algorithm, read the paper, and build my own 100% CUDA JPEG decoder with the help of advanced AI.
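One more note for the "dozen cameras in parallel" part of my original question: nvJPEG also exposes a batched API, so instead of a dozen independent decode calls you can hand it one frame per camera and let it spread the CPU Huffman stage across threads while the GPU stages run back to back. A rough sketch under the same caveats as above; the batch size, CPU thread count, and buffer setup are assumptions:

```cpp
// Batched nvJPEG decode: one JPEG frame per camera, decoded in a single call.
// Assumes handle/state were created with nvjpegCreateEx as in the sketch above
// and that each dst[i] already points at a device RGB buffer sized for camera i.
#include <nvjpeg.h>
#include <cuda_runtime.h>
#include <vector>

void decodeAllCameras(nvjpegHandle_t handle, nvjpegJpegState_t state,
                      const std::vector<const unsigned char*>& bitstreams,
                      const std::vector<size_t>& lengths,
                      std::vector<nvjpegImage_t>& dst,
                      cudaStream_t stream)
{
    const int batchSize  = static_cast<int>(bitstreams.size()); // e.g. 12 cameras
    const int cpuThreads = 4;  // threads for the host-side Huffman stage (tune this)

    nvjpegDecodeBatchedInitialize(handle, state, batchSize, cpuThreads,
                                  NVJPEG_OUTPUT_RGBI);
    nvjpegDecodeBatched(handle, state, bitstreams.data(), lengths.data(),
                        dst.data(), stream);
    // Decoded frames are now in device memory; synchronize the stream
    // before reading them back on the host.
}
```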