Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects

S Zhang, Y Yang, C Chen, X Zhang, Q Leng… - Expert Systems with …, 2023 - Elsevier
Emotion recognition has recently attracted extensive interest due to its significant
applications to human-computer interaction. The expression of human emotion depends on …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Advances in deep learning-oriented multimodal emotion recognition research

赵小明, 杨轶娇, 张石清 - … of Frontiers of Computer Science & …, 2022 - search.ebscohost.com
Multimodal emotion recognition refers to identifying a person's emotional state from different modalities associated with human emotional expression, such as speech, vision, and text. This research is of significant importance in human-computer interaction, artificial intelligence, affective computing, and related fields, and has attracted considerable attention from researchers …

Negative object presence evaluation (nope) to measure object hallucination in vision-language models

H Lovenia, W Dai, S Cahyawijaya, Z Ji… - arXiv preprint arXiv …, 2023 - arxiv.org
Object hallucination poses a significant challenge in vision-language (VL) models, often
leading to the generation of nonsensical or unfaithful responses with non-existent objects …

Multimodal emotion detection via attention-based fusion of extracted facial and speech features

D Mamieva, AB Abdusalomov, A Kutlimuratov… - Sensors, 2023 - mdpi.com
Methods for detecting emotions that employ many modalities at the same time have been
found to be more accurate and resilient than those that rely on a single sense. This is due to …

Vision guided generative pre-trained language models for multimodal abstractive summarization

T Yu, W Dai, Z Liu, P Fung - arXiv preprint arXiv:2109.02401, 2021 - arxiv.org
Multimodal abstractive summarization (MAS) models that summarize videos (vision
modality) and their corresponding transcripts (text modality) are able to extract the essential …

M-SENA: An integrated platform for multimodal sentiment analysis

H Mao, Z Yuan, H Xu, W Yu, Y Liu, K Gao - arXiv preprint arXiv …, 2022 - arxiv.org
M-SENA is an open-sourced platform for Multimodal Sentiment Analysis. It aims to facilitate
advanced research by providing flexible toolkits, reliable benchmarks, and intuitive …

A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations

W Zheng, J Yu, R Xia, S Wang - … of the 61st Annual Meeting of the …, 2023 - aclanthology.org
Multimodal Emotion Recognition in Multiparty Conversations (MERMC) has
recently attracted considerable attention. Due to the complexity of visual scenes in multi …

Enhancing speech emotion recognition using dual feature extraction encoders

I Pulatov, R Oteniyazov, F Makhmudov, YI Cho - Sensors, 2023 - mdpi.com
Understanding and identifying emotional cues in human speech is a crucial aspect of
human–computer communication. The application of computer technology in dissecting and …