M Li, L Zhang, M Zhu, Z Huang, G Yu, J Fan… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper studies the problem of pre-training for small models, which is essential for many
mobile devices. Current state-of-the-art methods on this problem transfer the …