Agriculture remains the basis of the country's economy, providing most citizens with their main source of livelihood, including food, employment, income and foreign exchange as …
Decisions in agriculture are increasingly data-driven. However, valuable agricultural knowledge is often locked away in free-text reports, manuals and journal articles …
J Carter, A Goldstein, J Roy - 2024 - preprints.org
This paper introduces an innovative learning framework where linguistic representations are inherently grounded in visual perceptions, circumventing the need for predefined categorical …
The integration of multimodal data in retrieval applications, such as text with accompanying images on platforms like Wikipedia, has emerged as a critical area of research. The …
Online hate speech has increasingly adopted a multimodal format, prominently featuring memes that combine visual imagery and textual content …
Despite the considerable advancements achieved with machine learning techniques in identifying hate speech, numerous technical obstacles persist that hinder these models from …
Multimodal Conversation is a sophisticated vision-language task where an AI agent must engage in meaningful dialogues grounded in visual content. This requires a deep …
C Mitchell, L Hsu, D Fernández - 2025 - preprints.org
In recent years, Transformer architectures have been extensively applied to image captioning, achieving remarkable performance. The spatial and positional relationships …
The automated segmentation of brain gliomas from multimodal MRI scans is pivotal in both clinical trials and everyday medical practice. Manual segmentation, however, presents …