SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification. X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ... Proceedings of the 29th ACM International Conference on Architectural …, 2023 | 88* | 2023 |