Zhiyuan You (尤志远)

I am a third-year ~~second-year~~ ~~first-year~~ Ph.D. student at the Multimedia Laboratory (MMLab), The Chinese University of Hong Kong (CUHK). I am supervised by Prof. Tianfan Xue and Prof. Chao Dong. I also work closely with Dr. Jinjin Gu.

Prior to that, I received my M.Eng. from Shanghai Jiao Tong University (SJTU) in 2023, supervised by Prof. Xinyi Le and Prof. Yu Zheng, and my B.Eng. from the same university in 2020.

My current research interest mainly lies in low-level vision based on foundation models.

CV / Google Scholar / GitHub
Email: zhiyuanyou [at] foxmail [dot] com

News

[2025.11]    One paper to appear in TIP.
[2025.11]    End of my internship at Adobe — heartfelt thanks to all the teams.
[2025.09]    Two papers (One Datasets & Benchmarks Track) to appear in NeurIPS 2025.
[2025.08]    One paper to appear in TOG, SIGGRAPH Asia 2025 (Journal Track).
[2025.07]    We won the Championship 🏆 in the VQualA 2025 DIQA Challenge (ICCV 2025 Workshop).
[2025.03]    One paper to appear in TPAMI.
[2025.02]    Two papers to appear in CVPR 2025 (One Highlight). See you in Nashville.
[2025.01]    One paper to appear in ICLR 2025.
[2024.12]    One paper to appear in AAAI 2025 (Oral).
[2024.09]    One paper to appear in NeurIPS 2024 (Spotlight).
[2024.08]    One paper to appear in TMLR.
[2024.07]    One paper to appear in ECCV 2024. See you in Milan.
[2023.08]    I became a Ph.D. student at MMLab in The Chinese University of Hong Kong.
[2023.03]    I graduated and received my Master's degree from Shanghai Jiao Tong University.
[2022.09]    One paper to appear in NeurIPS 2022 (Spotlight).
[2022.09]    One paper to appear in ICONIP 2022 (Oral).
[2022.08]    One paper to appear in WACV 2023 (Early Accept).
[2021.12]    One paper to appear in NN.
[2020.07]    I graduated and received my Bachelor's degree from Shanghai Jiao Tong University.

Research

*: Equal Contribution, †: Corresponding Author

MLLM Applications in Low-level Vision / Generative Image Processing / Anomaly Detection / Misc / All

	DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment
	Xin Cai, Zhiyuan You, Zhoutong Zhang, Tianfan Xue
	under review, 2025
	paper
	We introduce Detail-Aligned VAE (DA-VAE), a method that increases the compression ratio of a pretrained VAE while requiring only lightweight adaptation for the pretrained diffusion backbone.

	PhotoFramer: Multi-modal Image Composition Instruction
	Zhiyuan You, Ke Wang, He Zhang, Xin Cai, Jinjin Gu, Tianfan Xue, Chao Dong, Zhoutong Zhang
	arXiv, 2025
	paper / project page
	We introduce PhotoFramer, a multi-modal composition instruction framework, to provide composition guidance. Given a poorly composed image, PhotoFramer first describes how to improve the composition in natural language and then generates a well-composed example image.

	RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
	Xuming He, Zhiyuan You, Junchao Gong, Couhua Liu, Xiaoyu Yue, Peiqin Zhuang, Wenlong Zhang†, Lei Bai†
	Neural Information Processing Systems (NeurIPS), 2025
	paper / code / data
	We introduce an MLLM-based weather forecast quality analysis method, RadarQA, integrating key physical attributes with detailed assessment reports.

	ReinAD: Towards Real-world Industrial Anomaly Detection with a Comprehensive Contrastive Dataset
	Xu Wang, Jingyuan Zhuo, Zhiyuan You, Zhiyu Tan, Yikuan Yu, Siyu Wang, Xinyi Le†
	Neural Information Processing Systems (NeurIPS), 2025, Datasets & Benchmarks Track
	paper / code / data
	We introduce the ReinAD dataset, a comprehensive, contrast-based, fine-grained, unaligned, and large-scale dataset towards real-world industrial anomaly detection.

	Harnessing Diffusion-Yielded Score Priors for Image Restoration
	Xinqi Lin, Fanghua Yu, Jinfan Hu, Zhiyuan You, Wu Shi, Jimmy S. Ren, Jinjin Gu†, Chao Dong†
	ACM Transactions on Graphics (TOG), ACM SIGGRAPH Asia, 2025, Journal Track
	paper / project page / code / online app / CCTV news reports (央视新闻直播间报道)
	We introduce HYPIR, a simple yet powerful paradigm: fine-tuning a pre-trained diffusion model with adversarial (GAN) loss, with no diffusion sampling and no extra adapters, achieving an unprecedented balance of speed, fidelity, and quality.

	DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment
	Junjie Gao, Runze Liu, Yingzhe Peng, Shujian Yang, Jin Zhang, Kai Yang†, Zhiyuan You†
	IEEE International Conference on Computer Vision Workshop (ICCVW), 2025
	paper / code
	Our DeQA-Doc wins the Championship 🏆 in the VQualA 2025 DIQA (Document Image Quality Assessment) Challenge, ICCV 2025 Workshop.

	Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution
	Zhiyuan You, Xin Cai, Jinjin Gu, Tianfan Xue†, Chao Dong†
	IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
	paper / project page / code / data
	We introduce DeQA-Score, a distribution-based depicted image quality assessment model for score regression.

	UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion
	Zixuan Chen, Yujin Wang, Xin Cai, Zhiyuan You, Zheming Lu, Fan Zhang, Shi Guo, Tianfan Xue†
	IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Highlight, Best Demo Honorable Mention
	paper / project page / github page / code / demo (HuggingFace / OpenXLab)
	We propose UltraFusion, the first exposure fusion technique that can merge inputs with exposure differences of 9 stops.

	Interpreting Low-level Vision Models with Causal Effect Maps
	Jinfan Hu, Jinjin Gu, Shiyao Yu, Fanghua Yu, Zheyuan Li, Zhiyuan You, Chaochao Lu, Chao Dong†
	IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
	paper / code
	We propose Causal Effect Map (CEM), a model-agnostic and task-agnostic interpreting method for low-level vision models.

	An Intelligent Agentic System for Complex Image Restoration Problems
	Kaiwen Zhu, Jinjin Gu, Zhiyuan You, Yu Qiao, Chao Dong†
	International Conference on Learning Representations (ICLR), 2025
	paper / project page / code
	We introduce AgenticIR, an LLM-based agentic system that utilizes various tools for complex image restoration problems.

	SAIL: Sample-Centric In-Context Learning for Document Information Extraction
	Jinyu Zhang, Zhiyuan You, Jize Wang, Xinyi Le†
	Association for the Advancement of Artificial Intelligence (AAAI), 2025, Oral
	paper / code
	We introduce SAIL, a sample-centric approach selecting tailored in-context examples for training-free document information extraction.

	Enhancing Descriptive Image Quality Assessment with A Large-scale Multi-modal Dataset
	Zhiyuan You, Jinjin Gu, Xin Cai, Zheyuan Li, Kaiwen Zhu, Chao Dong†, Tianfan Xue†
	IEEE Transactions on Image Processing (TIP), 2025
	paper / project page / code / data
	We introduce DepictQA-Wild, also named Enhanced DepictQA (EDQA), a multi-functional in-the-wild descriptive image quality assessment model.

	PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging
	Xin Cai, Zhiyuan You, Hailong Zhang, Wentao Liu, Jinwei Gu, Tianfan Xue†
	Neural Information Processing Systems (NeurIPS), 2024, Spotlight
	paper / project page / code
	We introduce PhoCoLens, a novel two-stage approach for consistent and photorealistic lensless image reconstruction.

	Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
	Zhiyuan You, Zheyuan Li, Jinjin Gu, Zhenfei Yin, Tianfan Xue†, Chao Dong†
	European Conference on Computer Vision (ECCV), 2024
	paper / project page / code / data
	We introduce DepictQA, leveraging multi-modal large language models, allowing for detailed, language-based, and human-like evaluation of image quality.

	MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based Collaborative Learning
	Jie Liu, Yinmin Zhang, Chuming Li, Zhiyuan You, Zhanhui Zhou, Chao Yang, Yaodong Yang†, Yu Liu, Wanli Ouyang
	Transactions on Machine Learning Research (TMLR), 2024
	paper
	We release MaskMA, a masked pretraining framework for multi-agent decision-making.

	Few-shot Object Counting with Similarity-Aware Feature Enhancement
	Zhiyuan You, Kai Yang, Wenhan Luo, Xin Lu, Lei Cui, Xinyi Le†
	Winter Conference on Applications of Computer Vision (WACV), 2023, Early Accept
	paper / code / video
	We propose a novel SAFECount block, equipped with a similarity comparison module and a feature enhancement module for few-shot object counting.

	A Unified Model for Multi-class Anomaly Detection
	Zhiyuan You, Lei Cui, Yujun Shen, Kai Yang, Xin Lu, Yu Zheng, Xinyi Le†
	Neural Information Processing Systems (NeurIPS), 2022, Spotlight
	paper / code
	We present UniAD that accomplishes anomaly detection for multiple classes with a unified framework.

	ADTR: Anomaly Detection Transformer with Feature Reconstruction
	Zhiyuan You, Kai Yang, Wenhan Luo, Lei Cui, Yu Zheng, Xinyi Le†
	International Conference on Neural Information Processing (ICONIP), 2022, Oral
	paper
	We propose ADTR to apply a transformer to reconstruct pre-trained features for anomaly detection, and propose novel losses to extend ADTR to the anomaly-available case (both image-level and pixel-level labeled).

	An accurate star identification approach based on spectral graph matching for attitude measurement of spacecraft
	Zhiyuan You, Junzheng Li, Hongcheng Zhang, Bo Yang, Xinyi Le†
	Complex & Intelligent Systems (CAIS), 2022
	paper / code
	My undergraduate thesis and my first first-author paper. We propose a novel star identification approach based on spectral graph matching.

	UTRAD: Anomaly detection and localization with u-transformer
	Liyang Chen, Zhiyuan You, Nian Zhang, Juntong Xi, Xinyi Le†
	Neural Networks (NN), 2022
	paper / code
	We introduce UTRAD, a U-TRansformer based Anomaly Detection framework.

Experience

	Research Scientist Intern @ Adobe
	Jul. 2025 - Nov. 2025
	Mentors: Dr. Ke Wang, Dr. Zhoutong Zhang, and Dr. He Zhang

	Ph.D. Student at Multimedia Laboratory (MMLab) @ The Chinese University of Hong Kong
	Aug. 2023 - Current
	Advisors: Prof. Tianfan Xue and Prof. Chao Dong

	M.Eng. with Honor @ Shanghai Jiao Tong University
	Sep. 2020 - Mar. 2023
	GPA: 3.76 / 4.0
	Advisors: Prof. Xinyi Le and Prof. Yu Zheng

	B.Eng. with Honor @ Shanghai Jiao Tong University
	Sep. 2016 - Jun. 2020
	GPA: 89.36 / 100, Ranking: 5 / 148
	Advisor: Prof. Xinyi Le

Services

• Journal Reviewer

    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    International Journal of Computer Vision (IJCV)
    IEEE Transactions on Image Processing (TIP)
    IEEE Transactions on Multimedia (TMM)
    IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
    IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCS)
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
    IEEE Transactions on Knowledge and Data Engineering (TKDE)
    Pattern Recognition (PR)
    Knowledge-Based Systems (KBS)
    Engineering Applications of Artificial Intelligence (EAAI)
    Neural Networks (NN)
    Journal of Intelligent Manufacturing
    IEEE Journal of Selected Topics in Signal Processing
    Neurocomputing
    Measurement

• Conference Reviewer

    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, 2025, 2026
    International Conference on Computer Vision (ICCV), 2025
    Annual Conference on Neural Information Processing Systems (NeurIPS), 2024, 2025
    International Conference on Learning Representations (ICLR), 2025, 2026
    International Conference on Machine Learning (ICML), 2025
    Association for the Advancement of Artificial Intelligence (AAAI), 2026
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2025, 2026
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2026

Awards

    [2023.03]    SJTU Excellent Master Dissertation
    [2023.03]    SJTU Outstanding Graduate (Postgraduate)
    [2022.09]    National Scholarship
    [2020.06]    SJTU Outstanding Graduate (Undergraduate)

Template from JonBarron