J Shi, L He, Y Wang, T He, B Wu, M Hou - arXiv preprint arXiv:2407.16958, 2024 - arxiv.org
Recent studies have shown that relative position encoding performs well in selective state
space model scanning algorithms, and the architecture that balances SSM and Attention …