A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …