CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training

K You, J Gu, J Ham, B Park, J Kim, EK Hong, W Baek, B Roh
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023, Springer
Abstract
A large-scale image-text pair dataset has greatly contributed to the development of vision-language pre-training (VLP) models, which enable zero-shot or few-shot classification without costly annotation. In the medical domain, however, the scarcity of data remains a significant obstacle to developing a powerful VLP model. In this paper, we tackle the lack of image-text data in chest X-ray by expanding image-label pairs into image-text pairs via general prompts and by utilizing multiple images and multiple sections of a radiologic report. We also design two contrastive losses, named ICL and TCL, for learning study-level characteristics of medical images and reports, respectively. Our model outperforms state-of-the-art models trained under the same conditions. Moreover, the enlarged dataset improves the discriminative power of our pre-trained model for classification, at the cost of a marginal drop in retrieval performance. Code is available at https://github.com/kakaobrain/cxr-clip.
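The abstract's first idea, expanding image-label pairs into image-text pairs via general prompts, can be sketched as follows. This is a minimal illustration of prompt-based label-to-text expansion, not the paper's actual templates: the template wording, function name, and negative-finding fallback are all hypothetical.

```python
import random

# Hypothetical prompt templates; the paper's actual prompts may differ.
PROMPT_TEMPLATES = [
    "chest x-ray showing {}",
    "radiograph with findings of {}",
    "there is evidence of {}",
]

def labels_to_text(labels, rng=random):
    """Expand a list of class labels into one synthetic report sentence,
    so an image-label pair can be used as an image-text pair for VLP."""
    if not labels:
        # Fallback sentence for studies with no positive findings (assumed).
        return "no acute cardiopulmonary abnormality"
    template = rng.choice(PROMPT_TEMPLATES)
    return template.format(", ".join(labels))
```

Each labeled image then contributes a (image, synthetic sentence) pair to the contrastive pre-training corpus, alongside real report sections.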