Shinjuku: Preemptive Scheduling for μsecond-scale Tail Latency

K Kaffes, T Chong, JT Humphries, A Belay… - … USENIX Symposium on …, 2019 - usenix.org
The recently proposed dataplanes for microsecond scale applications, such as IX and
ZygOS, use non-preemptive policies to schedule requests to cores. For the many real-world …

Centralized core-granular scheduling for serverless functions

K Kaffes, NJ Yadwadkar, C Kozyrakis - … of the ACM symposium on cloud …, 2019 - dl.acm.org
In recent years, many applications have started using serverless computing platforms
primarily due to the ease of deployment and cost efficiency they offer. However, the existing …

Scalable persistent memory file system with Kernel-Userspace collaboration

Y Chen, Y Lu, B Zhu, AC Arpaci-Dusseau… - … USENIX Conference on …, 2021 - usenix.org
We introduce Kuco, a novel direct-access file system architecture whose main goal is
scalability. Kuco utilizes three key techniques–collaborative indexing, two-level locking, and …

Advanced synchronization techniques for task-based runtime systems

D Álvarez, K Sala, M Maroñas, A Roca… - Proceedings of the 26th …, 2021 - dl.acm.org
Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow
execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient …

TriCache: a user-transparent block cache enabling high-performance out-of-core processing with in-memory programs

G Feng, H Cao, X Zhu, B Yu, Y Wang, Z Ma… - ACM Transactions on …, 2023 - dl.acm.org
Out-of-core systems rely on high-performance cache sub-systems to reduce the number of
I/O operations. Although the page cache in modern operating systems enables transparent …

Wormhole: A fast ordered index for in-memory data management

X Wu, F Ni, S Jiang - Proceedings of the Fourteenth EuroSys Conference …, 2019 - dl.acm.org
In-memory data management systems, such as key-value stores, have become an essential
infrastructure in today's big-data processing and cloud computing. They rely on efficient …

Lock–unlock: Is that all? A pragmatic analysis of locking in software systems

R Guerraoui, H Guiroux, R Lachaize, V Quéma… - ACM Transactions on …, 2019 - dl.acm.org
A plethora of optimized mutex lock algorithms have been designed over the past 25 years to
mitigate performance bottlenecks related to critical sections and locks. Unfortunately, there is …

MV-RLU: Scaling read-log-update with multi-versioning

J Kim, A Mathew, S Kashyap… - Proceedings of the …, 2019 - dl.acm.org
This paper presents multi-version read-log-update (MV-RLU), an extension of the read-log-
update (RLU) synchronization mechanism. While RLU has many merits including an …

Continuum: A platform for cost-aware, low-latency continual learning

H Tian, M Yu, W Wang - Proceedings of the ACM Symposium on Cloud …, 2018 - dl.acm.org
Many machine learning applications operate in dynamic environments that change over
time, in which models must be continually updated to capture the recent trend in data …

DRAMHiT: A Hash Table Architected for the Speed of DRAM

V Narayanan, D Detweiler, T Huang… - Proceedings of the …, 2023 - dl.acm.org
Despite decades of innovation, existing hash tables fail to achieve peak performance on
modern hardware. Built around a relatively simple computation, i.e., a hash function, which in …