L Zhang, Y Zhang, X Zhao, Z Zou - Image and Vision Computing, 2021 - Elsevier
Image captioning is the task of generating natural-language captions for images.
Training typically consists of two phases: first minimizing the cross-entropy (XE) loss, and …