Caladan: Mitigating interference at microsecond timescales

J Fried, Z Ruan, A Ousterhout, A Belay - 14th USENIX Symposium on …, 2020 - usenix.org
The conventional wisdom is that CPU resources such as cores, caches, and memory
bandwidth must be partitioned to achieve performance isolation between tasks. Both the …

Few-to-many: Incremental parallelism for reducing tail latency in interactive services

ME Haque, YH Eom, Y He, S Elnikety, R Bianchini… - ACM SIGPLAN …, 2015 - dl.acm.org
Interactive services, such as Web search, recommendations, games, and finance, must
respond quickly to satisfy customers. Achieving this goal requires optimizing tail (e.g., 99th+ …

Exploiting heterogeneity for tail latency and energy efficiency

ME Haque, Y He, S Elnikety, TD Nguyen… - Proceedings of the 50th …, 2017 - dl.acm.org
Interactive service providers have strict requirements on high-percentile (tail) latency to meet
user expectations. If providers meet tail latency targets with less energy, they increase …

CuttleSys: Data-driven resource management for interactive services on reconfigurable multicores

N Kulkarni, G Gonzalez-Pumariega… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Multi-tenancy for latency-critical applications leads to resource interference and
unpredictable performance. Core reconfiguration opens up more opportunities for …

Delayed-Dynamic-Selective (DDS) prediction for reducing extreme tail latency in web search

S Kim, Y He, S Hwang, S Elnikety, S Choi - Proceedings of the Eighth …, 2015 - dl.acm.org
A commercial web search engine shards its index among many servers, and therefore the
response time of a search query is dominated by the slowest server that processes the …

Work stealing for interactive services to meet target latency

J Li, K Agrawal, S Elnikety, Y He, ITA Lee, C Lu… - Proceedings of the 21st …, 2016 - dl.acm.org
Interactive web services increasingly drive critical business workloads such as search,
advertising, games, shopping, and finance. Whereas optimizing parallel programs and …

Elfen Scheduling: Fine-Grain Principled Borrowing from Latency-Critical Workloads Using Simultaneous Multithreading

X Yang, SM Blackburn, KS McKinley - 2016 USENIX Annual Technical …, 2016 - usenix.org
Web services from search to games to stock trading impose strict Service Level Objectives
(SLOs) on tail latency. Meeting these objectives is challenging because the computational …

Q-zilla: A scheduling framework and core microarchitecture for tail-tolerant microservices

A Mirhosseini, BL West, GW Blake… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Managing tail latency is a primary challenge in designing large-scale Internet services.
Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are …

CASH: Supporting IaaS customers with a sub-core configurable architecture

Y Zhou, H Hoffmann, D Wentzlaff - ACM SIGARCH Computer …, 2016 - dl.acm.org
Infrastructure as a Service (IaaS) Clouds have grown increasingly important. Recent
architecture designs support IaaS providers through fine-grain configurability, allowing …

The TURBO diaries: Application-controlled frequency scaling explained

JT Wamhoff, S Diestelhorst, C Fetzer, P Marlier… - 2014 USENIX Annual …, 2014 - usenix.org
Most multi-core architectures nowadays support dynamic voltage and frequency scaling
(DVFS) to adapt their speed to the system's load and save energy. Some recent …