Lei Zhang's Homepage (HK-PolyU)

Lei Zhang

Chair Professor of Computer Vision and Image Analysis

Fellow of IEEE
Department of Computing
The Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong

Office: PQ816
Email: cslzhang at comp.polyu dot edu.hk

I am also with OPPO Research Institute.

Education

3/1998~10/2001	PhD	Dept. of Automatic Control, Northwestern Polytechnical University, Xi'an, China.
9/1995~3/1998	M.Sc	Dept. of Automatic Control, Northwestern Polytechnical University, Xi'an, China.
9/1991~7/1995	B.Sc	Dept. of Aeronautical Engineering, Shenyang Inst. of Aeronautical Engineering, Shenyang, China.

Work Experience

7/2017~present	Chair Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.
7/2015~6/2017	Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.
9/2010~6/2015	Associate Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.
1/2006~8/2010	Assistant Professor, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.
1/2003~1/2006	Postdoctoral Fellow, Dept. of Electrical and Computer Engineering, McMaster University, Canada.
1/2001~1/2003	Research Assistant/Associate, Dept. of Computing, Hong Kong Polytechnic University, Hong Kong.

Visual Computing Lab (our mission):

Y learning and beyond: for future visual enhancement and understanding.

My Google Scholar Citation Profile:

http://scholar.google.com/citations?user=tAK5l1IAAAAJ


Papers&Codes

News

1. Several PhD student positions, jointly trained with OPPO Research Institute, are available. The research topics include Image/Video Restoration/Enhancement, Image/Video Generation, LLM/VLM, Mobile MLLM, etc. Please send me your CV if you are interested.

2. Several Postdoctoral Fellow or Research Associate positions on Image/Video Generation and Restoration, LLM/VLM, and Visual Understanding are available. Please send me your CV if you are interested.

3. Research Intern positions on Image/Video Enhancement, Image/Video Quality Assessment, Image/Video Generation, Unified Models, Mobile MLLM, etc., are available at OPPO Research Institute. Please send me your CV if you are interested.

Newly accepted

1. L. Sun, R. Wu, J. Liang, Z. Zhang, H. Yong, L. Zhang, "Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution," in IEEE Trans. on Image Processing, 2026. (paper) (code)

2. G. Zhang, C. He, L. Chen, L. Zhang, "BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection," in AAAI 2026. (paper) (code) (New SOTA in 3D object detection!)

3. L. Chen, R. Li, G. Zhang, P. Wang, L. Zhang, "Fast Multi-view Consistent 3D Editing with Video Priors," in AAAI 2026. (paper) (code) (Fast and consistent 3D editing!)

4. X. Liang, Z. Ma, L. Sun, Y. Guo, L. Zhang, "AlignCVC: Aligning Cross-View Consistency for Single-Image-to-3D Generation," in AAAI 2026. (paper) (code) (A new pipeline for single-image-to-3D generation!)

5. Z. Zhang, R. Wu, L. Sun, L. Zhang, "GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation," in NeurIPS 2025. (paper) (code) (Break the limitation of fixed-grid tokenization!)

6. R. Wu, L. Sun, Z. Zhang, S. Wang, T. Wu, Q. Yi, S. Li, L. Zhang, "DP²O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution," in NeurIPS 2025. (paper) (code) (How can DPO help Real-ISR?)

7. Y. Sun, L. Sun, S. Liu, R. Wu, Z. Zhang, L. Zhang, "One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution," in NeurIPS 2025. (paper) (code) (Address the dilemma of VSR in one-step diffusion via dual LoRA learning!)

8. S. Liu, J. Ma, L. Sun, X. Kong, L. Zhang, "InstructRestore: Region-Customized Image Restoration with Human Instructions," in NeurIPS 2025. (paper) (code) (Restore the image as you wish!)

9. C. Xie, M. Li, S. Li, Y. Wu, Q. Yi, L. Zhang, "DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing," in NeurIPS 2025 (Spotlight). (paper) (code) (High quality editing with accurate background preservation!)

10. T. Wu, J. Zou, J. Liang, L. Zhang, K. Ma, "VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank," in NeurIPS 2025 (Spotlight). (paper) (code) (A strong no-reference quality assessment model with reasoning!)

11. B. Dong, M. Ni, Z. Huang, G. Yang, W. Zuo, L. Zhang, "MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM," in NeurIPS 2025. (paper) (code) (How much hallucination in MLLM reasoning?)

12. W. Lin, X. Wei, R. An, T. Ren, T. Chen, R. Zhang, Z. Guo, W. Zhang, L. Zhang, H. Li, "Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos," in NeurIPS 2025. (paper) (code) (A perceive anything model with large-scale dataset!)

13. Z. Zhao, C. Xiao, H. Lin, Q. Xie, L. Zhang, D. Meng, "Polyline Path Masked Attention for Vision Transformer," in NeurIPS 2025 (Spotlight). (paper) (code)

14. L. Qu, Z. Liu, S. Zhou, Y. Luo, J. Liang, H. Zeng, L. Zhang, J. Yang, "BurstDeflicker: A Benchmark Dataset for Flicker Removal in Dynamic Scenes," in NeurIPS 2025 (Datasets and Benchmarks Track). (paper) (data&code)

Preprint

1. X. Wei, K. Cen, H. Wei, Z. Guo, B. Li, Z. Wang, J. Zhang, L. Zhang, "MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition," preprint. (paper) (code) (An elaborately constructed dataset and a strong baseline model for multi-image composition!)

2. X. Liang, Z. Ma, L. Sun, Y. Guo, L. Zhang, "Photo3D: Advancing Photorealistic 3D Generation through Structure‑Aligned Detail Enhancement," preprint. (paper) (code) (To make 3D generation results more realistic!)

3. Z. Wang, K. Wang, L. Zhang, "PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models," preprint. (paper) (code) (Is the generated video physically plausible and why?)

4. Z. Wang, X. Wei, B. Li, Z. Guo, J. Zhang, H. Wei, K. Wang, L. Zhang, "VideoVerse: How Far is Your T2V Generator from a World Model?" preprint. (paper) (code) (To evaluate how strong your T2V model is!)

5. X. Kong, R. Wu, S. Liu, L. Sun, L. Zhang, "NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution," preprint. (paper) (code) (An efficient and robust AR model for real-world super-resolution!)

6. X. Wei, J. Zhang, Z. Wang, H. Wei, Z. Guo, L. Zhang, "TIIF-Bench: How Does Your T2I Model Follow Your Instructions?" preprint. (paper) (code) (To accurately evaluate T2I models' real performance!)

7. W. Zhu, Y. Zhang, X. Jin, W. Zeng, L. Zhang, "ANTS: Shaping the Adaptive Negative Textual Space by MLLM for OOD Detection," preprint. (paper) (code) (Can MLLM help OOD detection?)

8. R. Cui, L. Zhang, "UNICE: Training A Universal Image Contrast Enhancer," preprint. (paper) (code) (A unified model for various image contrast enhancement tasks!)

9. S. Wang, G. Chen, D. Huang, Z. Li, M. Li, G. Li, J.M. Alvarez, L. Zhang, Z. Yu, "VideoITG: Improving Multimodal Video Understanding with Instructed Temporal Grounding," preprint. (paper) (code) (A plug-and-play approach to improve video understanding tasks!)

10. T. Yang, R. Li, Y. Shi, Y. Zhang, Q. Dong, H. Cheng, W. Feng, S. Wen, B. Peng, L. Zhang, "Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks," preprint. (paper) (code) (One model, many tasks!)