Y Ji, C Fang, S Ma, H Shao, Z Wang - arXiv preprint arXiv:2407.12070, 2024 - arxiv.org
Transformer models have revolutionized AI tasks, but their large size hinders real-world
deployment on resource-constrained and latency-critical edge devices. While binarized …