The programmable data plane: Abstractions, architectures, algorithms, and applications

O Michel, R Bifulco, G Retvari, S Schmid - ACM Computing Surveys …, 2021 - dl.acm.org
Programmable data plane technologies enable the systematic reconfiguration of the low-level
processing steps applied to network packets and are key drivers toward realizing the …

Direct access, High-Performance memory disaggregation with DirectCXL

D Gouk, S Lee, M Kwon, M Jung - 2022 USENIX Annual Technical …, 2022 - usenix.org
New cache coherent interconnects such as CXL have recently attracted great attention
thanks to their excellent hardware heterogeneity management and resource disaggregation …

A unified architecture for accelerating distributed DNN training in heterogeneous GPU/CPU clusters

Y Jiang, Y Zhu, C Lan, B Yi, Y Cui, C Guo - 14th USENIX Symposium on …, 2020 - usenix.org
Data center clusters that run DNN training jobs are inherently heterogeneous. They have
GPUs and CPUs for computation and network bandwidth for distributed training. However …

AIFM: High-Performance, Application-Integrated far memory

Z Ruan, M Schwarzkopf, MK Aguilera… - 14th USENIX Symposium …, 2020 - usenix.org
Memory is the most contended and least elastic resource in datacenter servers today.
Applications can use only local memory—which may be scarce—even though memory …

NetCache: Balancing key-value stores with fast in-network caching

X Jin, X Li, H Zhang, R Soulé, J Lee, N Foster… - Proceedings of the 26th …, 2017 - dl.acm.org
We present NetCache, a new key-value store architecture that leverages the power and
flexibility of new-generation programmable switches to handle queries on hot items and …

Clio: A hardware-software co-designed disaggregated memory system

Z Guo, Y Shan, X Luo, Y Huang, Y Zhang - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Memory disaggregation has attracted great attention recently because of its benefits in
efficient memory utilization and ease of management. So far, memory disaggregation …

Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads

A Ousterhout, J Fried, J Behrens, A Belay… - … USENIX Symposium on …, 2019 - usenix.org
Datacenter applications demand microsecond-scale tail latencies and high request rates
from operating systems, and most applications handle loads that have high variance over …

Efficient memory disaggregation with Infiniswap

J Gu, Y Lee, Y Zhang, M Chowdhury… - 14th USENIX Symposium …, 2017 - usenix.org
Memory-intensive applications suffer large performance loss when their working sets do not
fully fit in memory. Yet, they cannot leverage otherwise unused remote memory when paging …

Datacenter RPCs can be general and fast

A Kalia, M Kaminsky, D Andersen - 16th USENIX Symposium on …, 2019 - usenix.org
It is commonly believed that datacenter networking software must sacrifice generality to
attain high performance. The popularity of specialized distributed systems designed …

Offloading distributed applications onto SmartNICs using iPipe

M Liu, T Cui, H Schuh, A Krishnamurthy… - Proceedings of the …, 2019 - dl.acm.org
Emerging Multicore SoC SmartNICs, enclosing rich computing resources (e.g., a multicore
processor, onboard DRAM, accelerators, programmable DMA engines), hold the potential to …