Y Xin, D Yang, Y Zou - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
In text-audio retrieval (TAR) tasks, due to the heterogeneity of contents between text and
audio, the semantic information contained in the text is only similar to certain frames within …