M Kim, M Kim, J Bae, S Choi, S Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
Hallucinations in vision-language models pose a significant challenge to their reliability,
particularly in the generation of long captions. Current methods fall short of accurately …