The nanoPU: A nanosecond network stack for datacenters

S Ibanez, A Mallery, S Arslan, T Jepsen… - … on Operating Systems …, 2021 - usenix.org
We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive
class of datacenter applications: those that utilize many small Remote Procedure Calls …

Efficient scheduling policies for Microsecond-Scale tasks

S McClure, A Ousterhout, S Shenker… - … USENIX Symposium on …, 2022 - usenix.org
Datacenter operators today strive to support microsecond-latency applications while also
using their limited CPU resources as efficiently as possible. To achieve this, several recent …

Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs

N Lazarev, S Xiang, N Adit, Z Zhang… - Proceedings of the 26th …, 2021 - dl.acm.org
The ongoing shift of cloud services from monolithic designs to microservices creates high
demand for efficient and high performance datacenter networking stacks, optimized for fine …

Microsecond consensus for microsecond applications

MK Aguilera, N Ben-David, R Guerraoui… - … USENIX Symposium on …, 2020 - usenix.org
We consider the problem of making apps fault-tolerant through replication, when apps
operate at the microsecond scale, as in finance, embedded computing, and microservices …

RAMBDA: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications

Y Yuan, J Huang, Y Sun, T Wang… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Responding to the "datacenter tax" and "killer microseconds" problems for memory-intensive
datacenter applications, diverse solutions including Smart NIC-based ones have been …

RackSched: A Microsecond-Scale scheduler for Rack-Scale computers

H Zhu, K Kaffes, Z Chen, Z Liu, C Kozyrakis… - … USENIX Symposium on …, 2020 - usenix.org
Low-latency online services have strict Service Level Objectives (SLOs) that require
datacenter systems to support high throughput at microsecond-scale tail latency. Dataplane …

Syrup: User-defined scheduling across the stack

K Kaffes, JT Humphries, D Mazières… - Proceedings of the ACM …, 2021 - dl.acm.org
Suboptimal scheduling decisions in operating systems, networking stacks, and application
runtimes are often responsible for poor application performance, including higher latency …

RSS++: load and state-aware receive side scaling

T Barbette, GP Katsikas, GQ Maguire Jr… - Proceedings of the 15th …, 2019 - dl.acm.org
While the current literature typically focuses on load-balancing among multiple servers, in
this paper, we demonstrate the importance of load-balancing within a single machine …

Cerebros: Evading the RPC tax in datacenters

A Pourhabibi, M Sutherland, A Daglis… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
The emerging paradigm of microservices decomposes online services into fine-grained
software modules frequently communicating over the datacenter network, often using …

Achieving microsecond-scale tail latency efficiently with approximate optimal scheduling

R Iyer, M Unal, M Kogias, G Candea - Proceedings of the 29th …, 2023 - dl.acm.org
Datacenter applications expect microsecond-scale service times and tightly bound tail
latency, with future workloads expected to be even more demanding. To address this …