K Han, Y Wang, H Chen, X Chen, J Guo, Z Liu… - arXiv e …, 2020 - ui.adsabs.harvard.edu
The Transformer, first applied in natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …