InferLine: latency-aware provisioning and scaling for prediction serving pipelines D Crankshaw, GE Sela, X Mo, C Zumar, I Stoica, J Gonzalez, A Tumanov Proceedings of the 11th ACM Symposium on Cloud Computing, 477-491, 2020 | 120 | 2020 |
Towards scalable dataframe systems D Petersohn, S Macke, D Xin, W Ma, D Lee, X Mo, JE Gonzalez, ... arXiv preprint arXiv:2001.00888, 2020 | 112 | 2020 |
Dynamic space-time scheduling for GPU inference P Jain, X Mo, A Jain, H Subbaraj, RS Durrani, A Tumanov, J Gonzalez, ... arXiv preprint arXiv:1901.00041, 1-8, 2018 | 86 | 2018 |
InferLine: ML inference pipeline composition framework D Crankshaw, GE Sela, C Zumar, X Mo, JE Gonzalez, I Stoica, ... arXiv preprint arXiv:1812.01776, 2018 | 38 | 2018 |
The OoO VLIW JIT compiler for GPU inference P Jain, X Mo, A Jain, A Tumanov, JE Gonzalez, I Stoica arXiv preprint arXiv:1901.10008, 2019 | 17 | 2019 |
Optimizing LLM queries in relational workloads S Liu, A Biswal, A Cheng, X Mo, S Cao, JE Gonzalez, I Stoica, M Zaharia arXiv preprint arXiv:2403.05821, 2024 | 7 | 2024 |
Context-aware streaming perception in dynamic environments GE Sela, I Gog, J Wong, KK Agrawal, X Mo, S Kalra, P Schafhalter, ... European Conference on Computer Vision, 621-638, 2022 | 7 | 2022 |
Pay attention to convolution filters: towards fast and accurate fine-grained transfer learning X Mo, R Cheng, T Fang arXiv preprint arXiv:1906.04950, 2019 | 5 | 2019 |
Cloudcast: High-Throughput, Cost-Aware Overlay Multicast in the Cloud S Wooders, S Liu, P Jain, X Mo, JE Gonzalez, V Liu, I Stoica 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024 | 3 | 2024 |
InferLine: ML prediction pipeline provisioning and management for tight latency objectives D Crankshaw, GE Sela, C Zumar, X Mo, JE Gonzalez, I Stoica, ... arXiv preprint arXiv:1812.01776, 2018 | 2 | 2018 |
RALF: Accuracy-Aware Scheduling for Feature Store Maintenance S Wooders, X Mo, A Narang, K Lin, I Stoica, JM Hellerstein, N Crooks, ... Proceedings of the VLDB Endowment 17 (3), 563-576, 2023 | 1 | 2023 |
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput X Liu, C Daniel, L Hu, W Kwon, Z Li, X Mo, A Cheung, Z Deng, I Stoica, ... arXiv preprint arXiv:2406.14066, 2024 | | 2024 |