Lei Zhang

Chair Professor of Computer Vision and Image Analysis

Fellow of IEEE
Department of Computing
The Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong

Office: PQ816
Email: cslzhang at comp.polyu dot edu.hk

I am also with OPPO Research Institute.

Education

3/1998~10/2001: PhD, Dept. of Automatic Control, Northwestern Polytechnical University, Xi'an, China.

9/1995~3/1998: M.Sc, Dept. of Automatic Control, Northwestern Polytechnical University, Xi'an, China.

9/1991~7/1995: B.Sc, Dept. of Aeronautical Engineering, Shenyang Inst. of Aeronautical Engineering, Shenyang, China.


Work Experience

7/2017~present: Chair Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.

7/2015~6/2017: Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.

9/2010~6/2015: Associate Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.

1/2006~8/2010: Assistant Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.

1/2003~1/2006: Postdoctoral Fellow, Dept. of Electrical and Computer Engineering, McMaster University, Canada.

1/2001~1/2003: Research Assistant/Associate, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.


Visual Computing Lab (our mission):

Y learning and beyond: for future visual enhancement and understanding.

 

My Google Scholar Citation Profile:

http://scholar.google.com/citations?user=tAK5l1IAAAAJ



Papers & Code


News

1.    Several positions for PhD students jointly trained with OPPO Research Institute are available. The research topics include Image/Video Restoration/Enhancement, Image/Video Generation, LLM/VLM, Mobile MLLM, etc. Please send me your CV if you are interested.

2.    Several Postdoctoral Fellow or Research Associate positions in Image/Video Generation and Restoration, LLM/VLM, and Visual Understanding are available. Please send me your CV if you are interested.

3.    Research Intern positions in Image/Video Enhancement, Image/Video Quality Assessment, Image/Video Generation, Unified Models, Mobile MLLM, etc., are available at OPPO Research Institute. Please send me your CV if you are interested.

Newly accepted

1.      S. Wang, S. Liu, Y. Kuang, X. Wei, Y. Liu, Z. Li, Y. Man, G. Chen, A. Tao, J. Kautz, G. Liu, L. Zhang, Z. Yu, "LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding," in ECCV 2026. (paper) (code) (Fast and Accurate Object Grounding: A New Paradigm!)

2.      X. Wei, J. Zhang, Z. Wang, H. Wei, Z. Guo, L. Zhang, "TIIF-Bench: How Does Your T2I Model Follow Your Instructions?" in ECCV 2026. (paper) (code) (To accurately evaluate T2I models' real performance!)

3.      L. Sun, R. Wu, Z. Zhang, R. Li, Y. Sun, S. Liu, L. Zhang, "Self-transcendence: Is External Feature Guidance Indispensable for Accelerating Diffusion Transformer Training?" in ECCV 2026. (paper) (code) (Do we really need pre-trained external feature representations to accelerate DiT training?)

4.      Z. Wang, K. Wang, L. Zhang, "PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models," in ECCV 2026. (paper) (code) (Is the generated video physically plausible and why?)

5.      W. Zhu, Y. Zhang, L. Xu, X. Jin, W. Zeng, L. Zhang, "Dual Distribution Estimation for Noisy Test-Time Adaptation in Vision-Language Models," in ECCV 2026. (paper) (code)

6.      Z. Ma, W. Hu, W. Zhao, P. Wang, Y. Shan, L. Zhang, "OREO: Fidelity Alignment in 3D Generation via On-the-fly Rendering-Editing Optimization," in ECCV 2026. (paper) (code)

7.      Y. Meng, C. Wu, X. Liu, C. Guo, Z. Liang, L. Lei, J. Liang, H. Zeng, C. Li, L. Zhang, "FlowPainter: Inpainting Optical Flow via Confidence-Guided Completion," in ECCV 2026. (paper) (code)

8.      L. Qu, Y. Liu, S. Zhou, J. Liang, H. Zeng, L. Zhang, J. Yang, "There and Back Again: A Flexible-Frame Transformer for Multi-Exposure Fusion," in ECCV 2026. (paper) (code)

9.      Y. Liu, L. Qu, J. Liang, S. Zhou, H. Zeng, Y. Peng, H. Lin, L. Zhang, J. Yang, "ExpoMotion: A Large-Scale Benchmark and A Householder Projection Network for Multi-Exposure Fusion," in ECCV 2026. (paper) (code)

10.  J. Lou, K. Chen, W. You, H. Zeng, L. Zhang, S. Gu, "Perceiving Better Moments: Cover Frame Reselection and Enhancement for Live Photos with the Live2K Dataset," in ECCV 2026. (paper) (code)

11.  Y. Wu, C. Xie, R. Li, L. Chen, Q. Yi, L. Zhang, "CoCoEdit: Content-Consistent Image Editing via Region Regularized Reinforcement Learning," in ICML 2026. (paper) (code) (Edit the image as you instruct without changing the background details!)

12.  T. Wu, R. Li, L. Zhang, K. Ma, "Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis," in ICML 2026. (paper) (code) (Completely address the loss of diversity in DMD distillation!)

13.  G. Li, K. Cen, B. Zhao, Y. Xin, S. Luo, G. Zhai, L. Zhang, X. Liu, "LayerT2V: A Unified Multi-Layer Video Generation Framework," in ICML 2026. (paper) (code) (Generating videos with editable layers!)

Preprint

1.      C. Xie, Y. Wu, Q. Yi, L. Zhang, "Text-Vision Co-Instructed Image Editing," preprint. (paper) (code) (Textual and visual prompts together make your editing more accurate!)

2.      P. Wang, S. Wang, L. Chen, Z. Ma, G. Zhang, L. Zhang, "DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images," preprint. (paper) (code) (SOTA results on perspective and panoramic images using a single model!)

3.      G. Qin, J. Zhang, C. He, J. Liang, T. Wu, Y. Jin, L. Zhang, "Tool-IQA: Augmenting Image Quality Assessment with Simple Tools," preprint. (paper) (code) (Magnifier and Gamma corrector are all tools you need to enhance your IQA model!)

4.      X. Kong, J. Zhao, L. Sun, R. Wu, L. Zhang, "GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration," preprint. (paper) (code) (Can multimodal foundation models be the solution for generalizable real-world image restoration?)

5.      R. Li, T. Yang, F. Ai, T. Wu, S. Wen, B. Peng, L. Zhang, "Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation," preprint. (paper) (code) (Video generation at 29.5 FPS (832x480) on a single H100 GPU without quantization or model compression!)

6.      Y. Guo, Z. Zhang, P. Wang, X. Liang, Z. Ma, L. Zhang, "Memorize When Needed: Decoupled Memory Control for Spatially Consistent Long-Horizon Video Generation," preprint. (paper) (code) (Efficient training for spatially consistent long-horizon video generation!)

7.      W. Li, Z. Qi, Z. Zhao, K. Zhang, L. Zhang, "Weighted Reverse Convolution for Feature Upsampling," preprint. (paper) (code) (Making the features of vision foundation models stronger!)

8.      Z. Zheng, C. He, S. Wang, Y. Li, M. Cheng, L. Zhang, "DEL: Digit Entropy Loss for Numerical Learning of Large Language Models," preprint. (paper) (code) (A simple yet effective loss to improve the numerical learning capability of LLMs!)

9.      H. Wang, C. Shen, L. Zhang, Z. Cheng, "ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity," preprint. (paper) (code) (A highly effective algorithm to detect AI-generated videos!)

10.   J. Zhang, C. Xiao, A. Wu, X. Zhang, L. Zhang, "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm," preprint. (paper) (code) (Can we train large-scale LLMs using GPUs with low memory?)

11.   Z. Wang, X. Wei, B. Li, Z. Guo, J. Zhang, H. Wei, K. Wang, L. Zhang, "VideoVerse: Does Your T2V Generator Have World Model Capability to Synthesize Videos?" preprint. (paper) (code) (To evaluate how strong your T2V model is!)

12.   X. Kong, R. Wu, S. Liu, L. Sun, L. Zhang, "NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution," preprint. (paper) (code) (An efficient and robust AR model for real-world super-resolution!)