Multimodal healthcare AI: identifying and designing clinically relevant vision-language applications for radiology

N Yildirim, H Richardson, MT Wetscherek… - Proceedings of the CHI …, 2024 - dl.acm.org
Recent advances in AI combine large language models (LLMs) with vision encoders that
bring forward unprecedented technical capabilities to leverage for a wide range of …

Opportunities and challenges in the application of large artificial intelligence models in radiology

L Pan, Z Zhao, Y Lu, K Tang, L Fu, Q Liang, S Peng - Meta-Radiology, 2024 - Elsevier
Influenced by ChatGPT, artificial intelligence (AI) large models have witnessed a global
upsurge in large model research and development. As people enjoy the convenience by this …

Advancing multimodal medical capabilities of Gemini

L Yang, S Xu, A Sellergren, T Kohlberger… - arXiv preprint arXiv …, 2024 - arxiv.org
Many clinical tasks require an understanding of specialized data, such as medical images
and genomics, which is not typically found in general-purpose large multimodal models …

Zero-shot ECG classification with multimodal learning and test-time clinical knowledge enhancement

C Liu, Z Wan, C Ouyang, A Shah, W Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
Electrocardiograms (ECGs) are non-invasive diagnostic tools crucial for detecting cardiac
arrhythmic diseases in clinical practice. While ECG Self-supervised Learning (eSSL) …

LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation

Z Wang, X Luo, X Jiang, D Li, L Qiu - arXiv preprint arXiv:2404.00998, 2024 - arxiv.org
Evaluating generated radiology reports is crucial for the development of radiology AI, but
existing metrics fail to reflect the task's clinical requirements. This study proposes a novel …

Dia-LLaMA: Towards Large Language Model-driven CT Report Generation

Z Chen, L Luo, Y Bie, H Chen - arXiv preprint arXiv:2403.16386, 2024 - arxiv.org
Medical report generation has achieved remarkable advancements yet has still been faced
with several challenges. First, the inherent imbalance in the distribution of normal and …

MAIRA-2: Grounded Radiology Report Generation

S Bannur, K Bouzid, DC Castro, A Schwaighofer… - arXiv preprint arXiv …, 2024 - arxiv.org
Radiology reporting is a complex task that requires detailed image understanding,
integration of multiple inputs, including comparison with prior imaging, and precise …

DeViDe: Faceted medical knowledge for improved medical vision-language pre-training

H Luo, Z Zhou, C Royer, A Sekuboyina… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-language pre-training for chest X-rays has made significant strides, primarily by
utilizing paired radiographs and radiology reports. However, existing approaches often face …

Merlin: A Vision Language Foundation Model for 3D Computed Tomography

L Blankemeier, JP Cohen, A Kumar… - arXiv preprint arXiv …, 2024 - arxiv.org
Over 85 million computed tomography (CT) scans are performed annually in the US, of
which approximately one quarter focus on the abdomen. Given the current radiologist …

FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores

A Huang, O Banerjee, K Wu, EP Reis… - arXiv preprint arXiv …, 2024 - arxiv.org
The current gold standard for evaluating generated chest x-ray (CXR) reports is through
radiologist annotations. However, this process can be extremely time-consuming and costly …