Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ... See https://vicuna.lmsys.org (accessed 14 April 2023) 2 (3), 6, 2023 | 1374* | 2023 |
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ... Advances in Neural Information Processing Systems 36, 2024 | 1236* | 2024 |
Efficient memory management for large language model serving with PagedAttention W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ... Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023 | 393 | 2023 |
cvc5: A versatile and industrial-strength SMT solver H Barbosa, C Barrett, M Brain, G Kremer, H Lachnitt, M Mann, ... International Conference on Tools and Algorithms for the Construction and …, 2022 | 342 | 2022 |
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ... International Conference on Machine Learning, 2023 | 167 | 2023 |
H2O: Heavy-hitter oracle for efficient generative inference of large language models Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ... Advances in Neural Information Processing Systems 36, 2024 | 81 | 2024 |
How Long Can Context Length of Open-Source LLMs truly Promise? D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ... NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023 | 73* | 2023 |
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ... 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 68 | 2023 |
LMSYS-Chat-1M: A large-scale real-world LLM conversation dataset L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ... arXiv preprint arXiv:2309.11998, 2023 | 39* | 2023 |
Chatbot Arena: An open platform for evaluating LLMs by human preference WL Chiang, L Zheng, Y Sheng, AN Angelopoulos, T Li, D Li, H Zhang, ... arXiv preprint arXiv:2403.04132, 2024 | 38 | 2024 |
Subspace embedding and linear regression with Orlicz norm A Andoni, C Lin, Y Sheng, P Zhong, R Zhong International Conference on Machine Learning, 224-233, 2018 | 38 | 2018 |
Distribution-free junta testing X Chen, Z Liu, RA Servedio, Y Sheng, J Xie STOC 2018, 2018 | 31* | 2018 |
SLoRA: Scalable Serving of Thousands of LoRA Adapters Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ... Proceedings of Machine Learning and Systems 6, 296-311, 2024 | 28* | 2024 |
Efficiently programming large language models using SGLang L Zheng, L Yin, Z Xie, J Huang, C Sun, CH Yu, S Cao, C Kozyrakis, ... arXiv preprint arXiv:2312.07104, 2023 | 20 | 2023 |
Towards Optimal Caching and Model Selection for Large Model Inference B Zhu, Y Sheng, L Zheng, C Barrett, M Jordan, J Jiao Advances in Neural Information Processing Systems 36, 2024 | 11* | 2024 |
Politeness for the theory of algebraic datatypes Y Sheng, Y Zohar, C Ringeissen, J Lange, P Fontaine, C Barrett International Joint Conference on Automated Reasoning, 238-255, 2020 | 10* | 2020 |
On the approximation of Nash equilibria in sparse win-lose games Z Liu, Y Sheng Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 9 | 2018 |
Clover: Closed-Loop Verifiable Code Generation C Sun, Y Sheng, O Padon, C Barrett arXiv preprint arXiv:2310.17807, 2023 | 8 | 2023 |
Fairness in serving large language models Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo, JE Gonzalez, I Stoica arXiv preprint arXiv:2401.00588, 2023 | 7 | 2023 |
Reasoning about vectors using an SMT theory of sequences Y Sheng, A Nötzli, A Reynolds, Y Zohar, D Dill, W Grieskamp, J Park, ... International Joint Conference on Automated Reasoning, 125-143, 2022 | 7* | 2022 |