Zhengyao Fang - Personal Homepage

About

I am a master’s student at the School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, expected to graduate in March 2027. I am advised by Prof. Wenjie Pei. My research interests lie in multimodal large models and image generation, editing, and understanding.

From May 2025 to March 2026, I worked at Tencent WeChat AI on image generation and evaluation. Since March 2026, I have been with the Tencent Hunyuan team, focusing on GUI-related research. If you are interested in my research, feel free to get in touch!

Research

My current research interests include:

GUI
OCR and scene text understanding
Image generation and editing
Vision–language models (VLMs / VLLMs)

Publications

2026

Zhengyao Fang, et al. “Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity” CVPR 2026 (Highlight).
ArXiv · Code
Zhengyao Fang, et al. “Prune Redundancy, Preserve Essence: Vision Token Compression in VLMs via Synergistic Importance-Diversity” ICLR 2026.
ArXiv · Code

2025

Zhengyao Fang, et al. “Recognition-Synergistic Scene Text Editing” CVPR 2025.
ArXiv · Code

2024

Jingjing Wu, Zhengyao Fang, et al. “WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting” ECCV 2024.
ArXiv · Code

Contact

Email: zhengyaonineve@outlook.com
GitHub: @ZhengyaoFang
Google Scholar: Google Scholar