G Ke, D He, TY Liu - arXiv preprint arXiv:2006.15595, 2020 - arxiv.org
In this work, we investigate the positional encoding methods used in language pre-training
(e.g., BERT) and identify several problems in the existing formulations. First, we show that in …
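For context on the method the snippet names: in BERT-style pre-training, a learned absolute position embedding is added directly to each token embedding before the Transformer layers. A minimal sketch of that formulation (the table sizes and toy inputs are illustrative assumptions, not from the paper):

```python
import numpy as np

# Illustrative sketch of BERT's absolute positional encoding:
# a learned per-position vector is summed with the word embedding.
rng = np.random.default_rng(0)
vocab_size, max_len, d_model = 100, 16, 8

token_emb = rng.normal(size=(vocab_size, d_model))  # word embedding table
pos_emb = rng.normal(size=(max_len, d_model))       # learned position table

token_ids = np.array([5, 42, 7])                    # a toy input sequence
positions = np.arange(len(token_ids))               # positions 0, 1, 2

# Input to the first Transformer layer: word + position embeddings.
x = token_emb[token_ids] + pos_emb[positions]
print(x.shape)
```

Because the two embeddings are simply summed, word identity and position information are entangled in a single vector, which is one of the design points the paper examines.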