I am a master's student at Wuhan University of Technology. I'm passionate about computer vision research, particularly video understanding. My work spans video question answering, video-text retrieval, and video captioning. I also explore large language models, prompt engineering, operator development, knowledge graphs, and question-answering systems. My goal is to develop an affordable, secure, and trustworthy general-purpose multimodal video model for everyone.
Mingru Huang, Pengfei Duan, Yifang Zhang, Huimin Chen, Jiawang Peng, Shengwu Xiong
2025 Twenty-first International Conference on Intelligent Computing (ICIC 2025)
We introduce a Memory Enhanced Visual-Speech Aggregation model for dense video captioning, inspired by cognitive informatics research on human memory recall. The model enriches visual representations by merging them with relevant text features, which are retrieved from a memory bank through multimodal retrieval over transcribed speech and visual inputs.
Jiawang Peng, Pengfei Duan, Mingru Huang, Shengwu Xiong
2025 Twenty-first International Conference on Intelligent Computing (ICIC 2025)
We reformulate label prediction as a progressive refinement process starting from an initial random guess, and propose LDiT (Label Diffusion Transformer) for pseudo-label noise adaptation. By modeling label uncertainty through a diffusion process, LDiT enables more robust learning under noisy supervision. In addition, to effectively capture the long-range dependencies in textual data, we adopt a Transformer-based latent denoising architecture with self-attention mechanisms.
Huimin Chen, Pengfei Duan, Mingru Huang, Jingyi Guo, Shengwu Xiong
2024 Twentieth International Conference on Intelligent Computing (ICIC 2024)
We propose a new factorized spatio-temporal self-attention paradigm to address inaccurate event descriptions caused by insufficient modeling of temporal relationships between video frames, and we apply it to dense video captioning.
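As a rough illustration of what "factorized" means in the entry above, the sketch below splits attention over a video into a temporal step (each patch position attends across frames) followed by a spatial step (each frame attends across its own patches). This is not the paper's implementation: the temporal-then-spatial order, the single head, the identity projections (Q = K = V), the missing residual connections, and the names `self_attention` and `factorized_attention` are all simplifying assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections, so queries, keys, and values
    are all the raw token vectors.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


def factorized_attention(video):
    """Apply temporal attention, then spatial attention.

    video[t][n] is the feature vector of patch n in frame t.
    """
    num_frames, num_patches = len(video), len(video[0])

    # Temporal step: each patch position attends across all frames.
    temporal = [[None] * num_patches for _ in range(num_frames)]
    for n in range(num_patches):
        track = self_attention([video[t][n] for t in range(num_frames)])
        for t in range(num_frames):
            temporal[t][n] = track[t]

    # Spatial step: each frame attends across its own patches.
    return [self_attention(frame) for frame in temporal]


# Toy input: 2 frames, 3 patches per frame, 2-dimensional features.
video = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]],
]
output = factorized_attention(video)
print(len(output), len(output[0]), len(output[0][0]))
```

The output keeps the input's frames × patches × features layout. For T frames with N patches each, joint attention over all tokens scores (T·N)² token pairs, while this factorized form scores N·T² + T·N² pairs.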
I'm always open to discussing research collaborations, new projects, or opportunities. Feel free to reach out!
© 2024 Mingru Huang. All rights reserved.
Template inspired by Keunhong Park