J Liu, P Tang, W Wang, Y Ren, X Hou, PA Heng… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of large-scale Mixture of Experts (MoE) models has marked a significant
advancement in artificial intelligence, offering enhanced model capacity and computational …