Integrating text and image: Determining multimodal document intent in instagram posts

SM Park, YG Kim - IEEE access, 2022 - ieeexplore.ieee.org

Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z that online and offline selves are not different …

Cited by 1442 Related articles All 6 versions

The hateful memes challenge: Detecting hate speech in multimodal memes

D Kiela, H Firooz, A Mohan… - Advances in neural …, 2020 - proceedings.neurips.cc

This work proposes a new challenge set for multimodal classification, focusing on detecting
hate speech in multimodal memes. It is constructed such that unimodal models struggle and …

Cited by 507 Related articles All 6 versions

Visual language integration: A survey and open challenges

SM Park, YG Kim - Computer Science Review, 2023 - Elsevier

With the recent development of deep learning technology comes the wide use of artificial
intelligence (AI) models in various domains. AI shows good performance for definite …

Cited by 7 Related articles All 2 versions

Social media-based analysis of cultural ecosystem services and heritage tourism in a coastal region of Mexico

A Ghermandi, V Camacho-Valdez… - Tourism Management, 2020 - Elsevier

Understanding spatial patterns of visitation and benefits accrued to different types of natural
and cultural heritage tourists may have important implications for the sustainable …

Cited by 131 Related articles All 5 versions

The hateful memes challenge: Competition report

D Kiela, H Firooz, A Mohan… - NeurIPS 2020 …, 2021 - proceedings.mlr.press

Machine learning and artificial intelligence play an ever more crucial role in
mitigating important societal problems, such as the prevalence of hate speech. We describe …

Cited by 61 Related articles All 4 versions

Fusing pre-trained language models with multimodal prompts through reinforcement learning

Y Yu, J Chung, H Yun, J Hessel… - Proceedings of the …, 2023 - openaccess.thecvf.com

Language models are capable of commonsense reasoning: while domain-specific
models can learn from explicit knowledge (e.g. commonsense graphs [6], ethical norms [25]) …

Cited by 11 Related articles All 4 versions

Met-meme: A multimodal meme dataset rich in metaphors

B Xu, T Li, J Zheng, M Naseriparsa, Z Zhao… - Proceedings of the 45th …, 2022 - dl.acm.org

Memes have become the popular means of communication for Internet users worldwide.
Understanding the Internet meme is one of the most tricky challenges in natural language …

Cited by 27 Related articles All 2 versions

Multimodal knowledge alignment with reinforcement learning

Y Yu, J Chung, H Yun, J Hessel, JS Park, X Lu… - arXiv preprint arXiv …, 2022 - arxiv.org

Large language models readily adapt to novel settings, even without task-specific training
data. Can their zero-shot capacity be extended to multimodal inputs? In this work, we …

Cited by 26 Related articles All 3 versions

MultiMET: A multimodal dataset for metaphor understanding

D Zhang, M Zhang, H Zhang, L Yang… - Proceedings of the 59th …, 2021 - aclanthology.org

Metaphor involves not only a linguistic phenomenon, but also a cognitive phenomenon
structuring human thought, which makes understanding it challenging. As a means of …

Cited by 33 Related articles All 3 versions

Clue: Cross-modal coherence modeling for caption generation

M Alikhani, P Sharma, S Li, R Soricut… - arXiv preprint arXiv …, 2020 - arxiv.org

We use coherence relations inspired by computational models of discourse to study the
information needs and goals of image captioning. Using an annotation protocol specifically …

Cited by 52 Related articles All 5 versions
