One Queue Is All You Need: Resolving Head-of-Line Blocking in Large Language Model Serving A Patke, D Reddy, S Jha, H Qiu, C Pinto, S Cui, C Narayanaswami, ... arXiv preprint arXiv:2407.00047, 2024 | | 2024 |
FLASH: Fast model adaptation in ML-centric cloud platforms H Qiu, W Mao, A Patke, S Cui, C Wang, H Franke, Z Kalbarczyk, T Basar, ... Proceedings of Machine Learning and Systems 6, 524-544, 2024 | 4 | 2024 |
Queue Management for Large Language Model Serving A Patke, D Reddy, S Jha, C Pinto, H Qiu, S Cui, C Narayanaswami, ... International Conference on Architectural Support for Programming Languages …, 2024 | | 2024 |
Efficient interactive LLM serving with proxy model-based sequence length prediction H Qiu, W Mao, A Patke, S Cui, S Jha, C Wang, H Franke, ZT Kalbarczyk, ... arXiv preprint arXiv:2404.08509, 2024 | 7 | 2024 |
Power-aware Deep Learning Model Serving with μ-Serve H Qiu, W Mao, A Patke, S Cui, S Jha, C Wang, H Franke, Z Kalbarczyk, ... 2024 USENIX Annual Technical Conference (USENIX ATC 24), 75-93, 2024 | 2 | 2024 |
Determining optimal data access for deep learning applications on a cluster S Venugopal, A Patke, I Gkoufas, C Pinto, P Koutsovasilis US Patent App. 17/305,735, 2023 | | 2023 |
SIMPPO: A scalable and incremental online learning framework for serverless resource management H Qiu, W Mao, A Patke, C Wang, H Franke, ZT Kalbarczyk, T Başar, ... Proceedings of the 13th Symposium on Cloud Computing, 306-322, 2022 | 17 | 2022 |
Evaluating hardware memory disaggregation under delay and contention A Patke, H Qiu, S Jha, S Venugopal, M Gazzetti, C Pinto, Z Kalbarczyk, ... 2022 IEEE International Parallel and Distributed Processing Symposium …, 2022 | 4 | 2022 |
Reinforcement learning for resource management in multi-tenant serverless platforms H Qiu, W Mao, A Patke, C Wang, H Franke, ZT Kalbarczyk, T Başar, ... Proceedings of the 2nd European Workshop on Machine Learning and Systems, 20-28, 2022 | 23 | 2022 |
Is function-as-a-service a good fit for latency-critical services? H Qiu, S Jha, SS Banerjee, A Patke, C Wang, F Hubertus, ZT Kalbarczyk, ... Proceedings of the Seventh International Workshop on Serverless Computing …, 2021 | 13 | 2021 |
Delay sensitivity-driven congestion mitigation for HPC systems A Patke, S Jha, H Qiu, J Brandt, A Gentile, J Greenseid, Z Kalbarczyk, ... Proceedings of the ACM International Conference on Supercomputing, 342-353, 2021 | 7* | 2021 |
Measuring congestion in high-performance datacenter interconnects S Jha, A Patke, J Brandt, A Gentile, B Lim, M Showerman, G Bauer, ... 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2020 | 28 | 2020 |
A study of network congestion in two supercomputing high-speed interconnects S Jha, A Patke, J Brandt, A Gentile, M Showerman, E Roman, ... 2019 IEEE Symposium on High-Performance Interconnects (HOTI), 45-48, 2019 | 12 | 2019 |
Antidepressant activity of Simvastatin in behavioral models of depression in rats A Patke, R Tripathi, VG Patke, D Sonawane, N Rege Int J Res Med Sci 3, 1666-1671, 2015 | 6 | 2015 |