Llama: A heterogeneous & serverless framework for auto-tuning video analytics pipelines

F Romero, M Zhao, NJ Yadwadkar… - Proceedings of the ACM …, 2021 - dl.acm.org
The proliferation of camera-enabled devices and large video repositories has led to a
diverse set of video analytics applications. These applications rely on video pipelines …

Jiffy: Elastic far-memory for stateful serverless analytics

A Khandelwal, Y Tang, R Agarwal, A Akella… - Proceedings of the …, 2022 - dl.acm.org
Stateful serverless analytics can be enabled using a remote memory system for inter-task
communication, and for storing and exchanging intermediate data. However, existing …

Characterizing and synthesizing task dependencies of data-parallel jobs in Alibaba Cloud

H Tian, Y Zheng, W Wang - Proceedings of the ACM Symposium on …, 2019 - dl.acm.org
Cluster schedulers routinely face data-parallel jobs with complex task dependencies
expressed as DAGs (directed acyclic graphs). Understanding DAG structures and runtime …

Better Together: Jointly Optimizing ML Collective Scheduling and Execution Planning using SYNDICATE

K Mahajan, CH Chu, S Sridharan, A Akella - 20th USENIX Symposium …, 2023 - usenix.org
Emerging ML training deployments are trending towards larger models, and hybrid-parallel
training that is not just dominated by compute-intensive all-reduce for gradient aggregation …

Adaptive HTAP through elastic resource scheduling

A Raza, P Chrysogelos, AC Anadiotis… - Proceedings of the 2020 …, 2020 - dl.acm.org
Modern Hybrid Transactional/Analytical Processing (HTAP) systems use an integrated data
processing engine that performs analytics on fresh data, which are ingested from a …

WASP: Wide-area adaptive stream processing

A Jonathan, A Chandra, J Weissman - Proceedings of the 21st …, 2020 - dl.acm.org
Adaptability is critical for stream processing systems to ensure stable, low-latency, and
high-throughput processing of long-running queries. Such adaptability is particularly challenging …

Sol: Fast distributed computation over slow networks

F Lai, J You, X Zhu, HV Madhyastha… - … USENIX Symposium on …, 2020 - usenix.org
The popularity of big data and AI has led to many optimizations at different layers of
distributed computation stacks. Despite–or perhaps, because of–its role as the narrow waist …

Coded elastic computing

Y Yang, M Interlandi, P Grover, S Kar… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Cloud providers have recently introduced new offerings whereby spare computing
resources are accessible at discounts compared to on-demand computing. Exploiting such …

Accelerating deep learning inference via learned caches

A Balasubramanian, A Kumar, Y Liu, H Cao… - arXiv preprint arXiv …, 2021 - arxiv.org
Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains
owing to their high accuracy in solving real-world problems. However, this high accuracy …

Unlocking unallocated cloud capacity for long, uninterruptible workloads

A Agarwal, S Noghabi, Í Goiri, S Seshan… - 20th USENIX Symposium …, 2023 - usenix.org
Cloud providers auction off unallocated resources at a low cost to avoid keeping hardware
idle. One such mechanism is Harvest VMs (HVMs). These VMs grow and shrink as the …