Clite: Efficient and QoS-aware co-location of multiple latency-critical jobs for warehouse scale computers

T Patel, D Tiwari - 2020 IEEE International Symposium on High …, 2020 - ieeexplore.ieee.org
Large-scale data centers run latency-critical jobs with quality-of-service (QoS) requirements,
and throughput-oriented background jobs, which need to achieve high performance …

Make the most out of last level cache in Intel processors

A Farshin, A Roozbeh, GQ Maguire Jr… - Proceedings of the …, 2019 - dl.acm.org
In modern (Intel) processors, Last Level Cache (LLC) is divided into multiple slices and an
undocumented hashing algorithm (aka Complex Addressing) maps different parts of …

Miso: exploiting multi-instance GPU capability on multi-tenant GPU clusters

B Li, T Patel, S Samsi, V Gadepally… - Proceedings of the 13th …, 2022 - dl.acm.org
GPU technology has been improving at an expedited pace in terms of size and performance,
empowering HPC and AI/ML researchers to advance the scientific discovery process …

Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks

A Farshin, A Roozbeh, GQ Maguire Jr… - 2020 USENIX Annual …, 2020 - usenix.org
Memory access is the major bottleneck in realizing multi-hundred-gigabit networks with
commodity hardware, hence it is essential to make good use of cache memory that is a …

CoPart: Coordinated partitioning of last-level cache and memory bandwidth for fairness-aware workload consolidation on commodity servers

J Park, S Park, W Baek - … of the Fourteenth EuroSys Conference 2019, 2019 - dl.acm.org
Workload consolidation is a widely-used technique to maximize server resource utilization in
cloud and datacenter computing. Recent commodity CPUs support last-level cache (LLC) …

On the opportunities of green computing: A survey

Y Zhou, X Lin, X Zhang, M Wang, G Jiang, H Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) has achieved significant advancements in technology and research
with the development over several decades, and is widely used in many areas including …

Scavenger: A black-box batch workload resource manager for improving utilization in cloud environments

SA Javadi, A Suresh, M Wajahat, A Gandhi - Proceedings of the ACM …, 2019 - dl.acm.org
Resource under-utilization is common in cloud data centers. Prior works have proposed
improving utilization by running provider workloads in the background, colocated with tenant …

Quarantine: Mitigating transient execution attacks with physical domain isolation

M Hertogh, M Wiesinger, S Österlund… - Proceedings of the 26th …, 2023 - dl.acm.org
Since the Spectre and Meltdown disclosure in 2018, the list of new transient execution
vulnerabilities that abuse the shared nature of microarchitectural resources on CPU cores …

Servermore: Opportunistic execution of serverless functions in the cloud

A Suresh, A Gandhi - Proceedings of the ACM symposium on cloud …, 2021 - dl.acm.org
Serverless computing allows customers to submit their jobs to the cloud for execution, with
the resource provisioning being taken care of by the cloud provider. Serverless functions are …

Satori: efficient and fair resource partitioning by sacrificing short-term benefits for long-term gains

RB Roy, T Patel, D Tiwari - 2021 ACM/IEEE 48th Annual …, 2021 - ieeexplore.ieee.org
Multi-core architectures have enabled data centers to increasingly co-locate multiple jobs to
improve resource utilization and lower the operational cost. Unfortunately, naively co …