T Deußer, C Zhao, W Krämer, D Leonhard… - arXiv preprint arXiv …, 2023 - arxiv.org
During the pre-training step of natural language models, the main objective is to learn a
general representation of the pre-training dataset, usually requiring large amounts of textual …