Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

S Chowdhury, S Nag, S Dasgupta, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Leveraging Large Language Models' remarkable proficiency in text-based tasks, recent
works on Multi-modal LLMs (MLLMs) extend them to other modalities like vision and audio …

MoBA: Mixture of Bi-directional Adapter for Multi-modal Sarcasm Detection

Y Xie, Z Zhu, X Chen, Z Chen, Z Huang - ACM Multimedia 2024, 2024 - openreview.net
In the field of multi-modal learning, model parameters are typically large, necessitating the
use of parameter-efficient fine-tuning (PEFT) techniques. These methods have been pivotal …

InMu-Net: Advancing Multi-modal Intent Detection via Information Bottleneck and Multi-sensory Processing

Z Zhu, X Cheng, Z Chen, Y Chen, Y Zhang… - ACM Multimedia …, 2024 - openreview.net
Multi-modal intent detection (MID) aims to comprehend users' intentions through diverse
modalities, which has received widespread attention in dialogue systems. Despite the …

Multimodal seed data augmentation for low-resource audio Latin Cuengh language

L Jiang, J Li, J Zhang, Y Shen - 2024 - researchsquare.com
The Latin Cuengh language, a low-resource dialect prevalent in select ethnic
minority regions of China, presents unique challenges for intelligent research and …