Nitin Kedia
Research Fellow, Microsoft Research India
Verified email at microsoft.com
Title
Cited by
Year
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
A Agrawal, N Kedia, A Panwar, J Mohan, N Kwatra, BS Gulavani, ...
arXiv preprint arXiv:2403.02310, 2024
Cited by: 34; Year: 2024
Vidur: A Large-Scale Simulation Framework For LLM Inference
A Agrawal, N Kedia, J Mohan, A Panwar, N Kwatra, B Gulavani, ...
Proceedings of Machine Learning and Systems 6, 351-366, 2024
Cited by: 7; Year: 2024
Metron: Holistic Performance Evaluation Framework for LLM Inference Systems
A Agrawal, A Agarwal, N Kedia, J Mohan, S Kundu, N Kwatra, R Ramjee, ...
arXiv preprint arXiv:2407.07000, 2024
Cited by: 3; Year: 2024