Junwen Xiong


Hi, I am Junwen Xiong (熊俊文), a first-year Ph.D. student in the Department of Computer Science at Northwestern Polytechnical University, advised by Prof. Peng Zhang.

I'm broadly interested in multimodal learning (images, audio, video, etc.). My recent research focuses on audio-visual speech separation and sound source localization.

Email  |  Google Scholar  |  GitHub

News
  • [Mar. 2024] One paper, DiffSal, was accepted to CVPR'24.
  • [Feb. 2023] One paper on audio-visual saliency prediction was accepted to CVPR'23.
  • [Aug. 2022] One paper on multi-modal correlation learning was accepted to TMM'22.
Publications
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction

Junwen Xiong, Peng Zhang, Tao You, Chuanyue Li, Wei Huang, Yufei Zha
CVPR, 2024 [paper] [webpage]
Generalized audio-visual saliency prediction framework
CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective

Junwen Xiong, Ganglai Wang, Peng Zhang, Wei Huang, Yufei Zha, Guangtao Zhai
CVPR, 2023 [paper] [webpage]
Audio-visual consistency perception matters
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement

Junwen Xiong, Yu Zhou, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha
TMM, 2022 [paper] [webpage]
Unified correlation learning framework to solve two audio-visual tasks
Preprint
UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection

Junwen Xiong, Peng Zhang, Chuanyue Li, Wei Huang, Yufei Zha, Tao You
[paper]
Is it possible to build a unified saliency model that generalizes to both video saliency prediction and video salient object detection? Sure!
FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction

Ganglai Wang, Peng Zhang, Junwen Xiong, Feihan Yang, Wei Huang, Yufei Zha
[paper]
Incorporating three modalities to detect talking face video manipulation
Audio-visual speech separation based on joint feature representation with cross-modal attention

Junwen Xiong, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha, Yanning Zhang
arXiv preprint, 2022 [paper]
Novel fusion methods for audio, video and optical flow modalities
Service

Journal Reviewing: Image and Vision Computing.
Conference Program Committees: ECCV 2024.