C Lu, J Zhang, Y Chu, Z Chen, J Zhou, F Wu… - arXiv e …, 2022 - ui.adsabs.harvard.edu
In the past few years, transformer-based pre-trained language models have achieved
astounding success in both industry and academia. However, the large model size and high …