AIFM: High-Performance, Application-Integrated far memory

Z Ruan, M Schwarzkopf, MK Aguilera… - 14th USENIX Symposium …, 2020 - usenix.org
Memory is the most contended and least elastic resource in datacenter servers today.
Applications can use only local memory—which may be scarce—even though memory …

Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices

Z Jia, E Witchel - Proceedings of the 26th ACM International Conference …, 2021 - dl.acm.org
The microservice architecture is a popular software engineering approach for building
flexible, large-scale online services. Serverless functions, or function as a service (FaaS) …

Clio: A hardware-software co-designed disaggregated memory system

Z Guo, Y Shan, X Luo, Y Huang, Y Zhang - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Memory disaggregation has attracted great attention recently because of its benefits in
efficient memory utilization and ease of management. So far, memory disaggregation …

Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads

A Ousterhout, J Fried, J Behrens, A Belay… - … USENIX Symposium on …, 2019 - usenix.org
Datacenter applications demand microsecond-scale tail latencies and high request rates
from operating systems, and most applications handle loads that have high variance over …

The Demikernel datapath OS architecture for microsecond-scale datacenter systems

I Zhang, A Raybuck, P Patel, K Olynyk… - Proceedings of the …, 2021 - dl.acm.org
Datacenter systems and I/O devices now run at single-digit microsecond latencies, requiring
ns-scale operating systems. Traditional kernel-based operating systems impose an …

Snap: A microkernel approach to host networking

M Marty, M de Kruijf, J Adriaens, C Alfeld… - Proceedings of the 27th …, 2019 - dl.acm.org
This paper presents our design and experience with a microkernel-inspired approach to
host networking called Snap. Snap is a userspace networking system that supports Google's …

ATP: In-network aggregation for multi-tenant learning

CL Lao, Y Le, K Mahajan, Y Chen, W Wu… - … USENIX Symposium on …, 2021 - usenix.org
Distributed deep neural network training (DT) systems are widely deployed in clusters where
the network is shared across multiple tenants, i.e., multiple DT jobs. Each DT job computes …

Flatstore: An efficient log-structured key-value storage engine for persistent memory

Y Chen, Y Lu, F Yang, Q Wang, Y Wang… - Proceedings of the Twenty …, 2020 - dl.acm.org
Emerging hardware like persistent memory (PM) and high-speed NICs are promising to
build efficient key-value stores. However, we observe that the small-sized access pattern in …

SRNIC: A scalable architecture for RDMA NICs

Z Wang, L Luo, Q Ning, C Zeng, W Li, X Wan… - … USENIX Symposium on …, 2023 - usenix.org
RDMA is expected to be highly scalable: to perform well in large-scale data center networks
where packet losses are inevitable (i.e., high network scalability), and to support a large …

Octopus+: An RDMA-Enabled Distributed Persistent Memory File System

B Zhu, Y Chen, Q Wang, Y Lu, J Shu - ACM Transactions on Storage …, 2021 - dl.acm.org
Non-volatile memory and remote direct memory access (RDMA) provide extremely high
performance in storage and network hardware. However, existing distributed file systems …