Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - arXiv preprint arXiv:2403.02469, 2024 - arxiv.org
Medical vision-language models (VLMs) combine computer vision and natural language
processing to analyze visual and textual medical data. Our paper reviews recent …

Masked vision and language pre-training with unimodal and multimodal contrastive losses for medical visual question answering

P Li, G Liu, J He, Z Zhao, S Zhong - International Conference on Medical …, 2023 - Springer
Medical visual question answering (VQA) is a challenging task that requires answering
clinical questions of a given medical image, by taking consider of both visual and language …

Overview of the ImageCLEF 2022: Multimedia retrieval in medical, social media and nature applications

B Ionescu, H Müller, R Péteri, J Rückert… - … Conference of the Cross …, 2022 - Springer
This paper presents an overview of the ImageCLEF 2022 lab that was organized as part of
the Conference and Labs of the Evaluation Forum–CLEF Labs 2022. ImageCLEF is an …

Self-supervised vision-language pretraining for medial visual question answering

P Li, G Liu, L Tan, J Liao… - 2023 IEEE 20th …, 2023 - ieeexplore.ieee.org
Medical image visual question answering (VQA) is a task to answer clinical questions, given
a radiographic image, which is a challenging problem that requires a model to integrate both …

Overview of ImageCLEFmedical 2023–caption prediction and concept detection

J Rückert, A Ben Abacha… - Working Notes of the …, 2023 - arodes.hes-so.ch
The 2023 ImageCLEFmedical GANs task is the first edition of this task, examining
the existing hypothesis that GANs (Generative Adversarial Networks) are generating …

ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset

J Rückert, L Bloch, R Brüngel, A Idrissi-Yaghir… - Scientific Data, 2024 - nature.com
Automated medical image analysis systems often require large amounts of training data with
high quality labels, which are difficult and time consuming to generate. This paper …

AUEB NLP Group at ImageCLEFmedical Caption 2022

F Charalampakos, G Zachariadis… - … Working Notes, CEUR …, 2022 - ceur-ws.org
We present the methods AUEB's NLP Group used to participate in the annual
ImageCLEFmedical Caption Task. The task comprises of the Concept Detection and the …

ImageCLEF 2023 highlight: multimedia retrieval in medical, social media and content recommendation applications

B Ionescu, H Müller, AM Drăgulinescu… - … on Information Retrieval, 2023 - Springer
In this paper, we provide an overview of the upcoming ImageCLEF campaign. ImageCLEF is
part of the CLEF Conference and Labs of the Evaluation Forum since 2003. ImageCLEF, the …

SciOL and MuLMS-Img: Introducing A Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain

T Tarsi, H Adel, JH Metzen, D Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
In scientific publications, a substantial part of the information is expressed via figures
containing images and diagrams. Hence, the retrieval of relevant figures given a natural …

NeuralDynamicsLab at ImageCLEFmedical 2022.

G Moschovis, E Fransén - CLEF (Working Notes), 2022 - researchgate.net
Diagnostic Captioning is described as the automatic text generation from a collection of
X-RAY images and it can assist inexperienced doctors and radiologists to reduce clinical …