Serverless computing: state-of-the-art, challenges and opportunities

Y Li, Y Lin, Y Wang, K Ye, C Xu - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Serverless computing is growing in popularity by virtue of its lightweight nature and simplicity of
management. It achieves these merits by reducing the granularity of the computing unit to …

A unified architecture for accelerating distributed DNN training in heterogeneous GPU/CPU clusters

Y Jiang, Y Zhu, C Lan, B Yi, Y Cui, C Guo - 14th USENIX Symposium on …, 2020 - usenix.org
Data center clusters that run DNN training jobs are inherently heterogeneous. They have
GPUs and CPUs for computation and network bandwidth for distributed training. However …

Datacenter RPCs can be general and fast

A Kalia, M Kaminsky, D Andersen - 16th USENIX Symposium on …, 2019 - usenix.org
It is commonly believed that datacenter networking software must sacrifice generality to
attain high performance. The popularity of specialized distributed systems designed …

The Demikernel datapath OS architecture for microsecond-scale datacenter systems

I Zhang, A Raybuck, P Patel, K Olynyk… - Proceedings of the …, 2021 - dl.acm.org
Datacenter systems and I/O devices now run at single-digit microsecond latencies, requiring
ns-scale operating systems. Traditional kernel-based operating systems impose an …

FlatStore: An efficient log-structured key-value storage engine for persistent memory

Y Chen, Y Lu, F Yang, Q Wang, Y Wang… - Proceedings of the Twenty …, 2020 - dl.acm.org
Emerging hardware like persistent memory (PM) and high-speed NICs are promising to
build efficient key-value stores. However, we observe that the small-sized access pattern in …

Offloading distributed applications onto SmartNICs using iPipe

M Liu, T Cui, H Schuh, A Krishnamurthy… - Proceedings of the …, 2019 - dl.acm.org
Emerging Multicore SoC SmartNICs, enclosing rich computing resources (e.g., a multicore
processor, onboard DRAM, accelerators, programmable DMA engines), hold the potential to …

Disaggregating persistent memory and controlling them remotely: An exploration of passive disaggregated Key-Value stores

SY Tsai, Y Shan, Y Zhang - 2020 USENIX Annual Technical Conference …, 2020 - usenix.org
Many datacenters and clouds manage storage systems separately from computing services
for better manageability and resource utilization. These existing disaggregated storage …

Snap: A microkernel approach to host networking

M Marty, M de Kruijf, J Adriaens, C Alfeld… - Proceedings of the 27th …, 2019 - dl.acm.org
This paper presents our design and experience with a microkernel-inspired approach to
host networking called Snap. Snap is a userspace networking system that supports Google's …

Carbink: Fault-Tolerant Far Memory

Y Zhou, HMG Wassel, S Liu, J Gao, J Mickens… - … USENIX Symposium on …, 2022 - usenix.org
Far memory systems allow an application to transparently access local memory as well as
memory belonging to remote machines. Fault tolerance is a critical property of any practical …

Octopus+: An RDMA-Enabled Distributed Persistent Memory File System

B Zhu, Y Chen, Q Wang, Y Lu, J Shu - ACM Transactions on Storage …, 2021 - dl.acm.org
Non-volatile memory and remote direct memory access (RDMA) provide extremely high
performance in storage and network hardware. However, existing distributed file systems …