Clio: A hardware-software co-designed disaggregated memory system

Z Guo, Y Shan, X Luo, Y Huang, Y Zhang - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Memory disaggregation has attracted great attention recently because of its benefits in
efficient memory utilization and ease of management. So far, memory disaggregation …

Cornflakes: Zero-copy serialization for microsecond-scale networking

D Raghavan, S Ravi, G Yuan, P Thaker… - Proceedings of the 29th …, 2023 - dl.acm.org
Data serialization is critical for many datacenter applications, but the memory copies
required to move application data into packets are costly. Recent zero-copy APIs expose …

Electrode: Accelerating Distributed Protocols with eBPF

Y Zhou, Z Wang, S Dharanipragada, M Yu - 20th USENIX Symposium …, 2023 - usenix.org
Implementing distributed protocols under a standard Linux kernel networking stack enjoys
the benefits of load-aware CPU scaling, high compatibility, and robust security and isolation …

An introduction to the Compute Express Link (CXL) interconnect

DD Sharma, R Blankenship, DS Berger - arXiv preprint arXiv:2306.11227, 2023 - arxiv.org
The Compute Express Link (CXL) is an open industry-standard interconnect between
processors and devices such as accelerators, memory buffers, smart network interfaces …

Paella: Low-latency model serving with software-defined GPU scheduling

KKW Ng, HM Demoulin, V Liu - Proceedings of the 29th Symposium on …, 2023 - dl.acm.org
Model serving systems play a critical role in multiplexing machine learning inference jobs
across shared GPU infrastructure. These systems have traditionally sat at a high level of …

Achieving microsecond-scale tail latency efficiently with approximate optimal scheduling

R Iyer, M Unal, M Kogias, G Candea - Proceedings of the 29th …, 2023 - dl.acm.org
Datacenter applications expect microsecond-scale service times and tightly bound tail
latency, with future workloads expected to be even more demanding. To address this …

Making kernel bypass practical for the cloud with Junction

J Fried, GI Chaudhry, E Saurez, E Choukse… - … USENIX Symposium on …, 2024 - usenix.org
Kernel bypass systems have demonstrated order of magnitude improvements in throughput
and tail latency for network-intensive applications relative to traditional operating systems …

Towards μs tail latency and terabit Ethernet: disaggregating the host network stack

Q Cai, M Vuppalapati, J Hwang, C Kozyrakis… - Proceedings of the …, 2022 - dl.acm.org
Dedicated, tightly integrated, and static packet processing pipelines in today's most widely
deployed network stacks preclude them from fully exploiting capabilities of modern …

Exploiting Cloud Object Storage for High-Performance Analytics

D Durner, V Leis, T Neumann - Proceedings of the VLDB Endowment, 2023 - dl.acm.org
Elasticity of compute and storage is crucial for analytical cloud database systems. All cloud
vendors provide disaggregated object stores, which can be used as storage backend for …

Remote procedure call as a managed system service

J Chen, Y Wu, S Lin, Y Xu, X Kong… - … USENIX Symposium on …, 2023 - usenix.org
Remote Procedure Call (RPC) is a widely used abstraction for cloud computing. The
programmer specifies type information for each remote procedure, and a compiler generates …