B Zhao, M Gong, X Li - IEEE Transactions on Neural Networks …, 2021 - ieeexplore.ieee.org
Audio and vision are the two main modalities in video data. Multimodal learning, especially
audiovisual learning, has drawn considerable attention recently, as it can boost the …