Nimble page management for tiered memory systems

Z Yan, D Lustig, D Nellans… - Proceedings of the Twenty …, 2019 - dl.acm.org
Software-controlled heterogeneous memory systems have the potential to increase the
performance and cost efficiency of computing systems. However, they can only deliver on …

SoftSKU: Optimizing server architectures for microservice diversity @scale

A Sriraman, A Dhanotia, TF Wenisch - Proceedings of the 46th …, 2019 - dl.acm.org
The variety and complexity of microservices in warehouse-scale data centers has grown
precipitously over the last few years to support a growing user base and an evolving product …

Mosaic: a GPU memory manager with application-transparent support for multiple page sizes

R Ausavarungnirun, J Landgraf, V Miller… - Proceedings of the 50th …, 2017 - dl.acm.org
Contemporary discrete GPUs support rich memory management features such as virtual
memory and demand paging. These features simplify GPU programming by providing a …

Efficient address translation for architectures with multiple page sizes

G Cox, A Bhattacharjee - ACM SIGPLAN Notices, 2017 - dl.acm.org
Processors and operating systems (OSes) support multiple memory page sizes. Superpages
increase Translation Lookaside Buffer (TLB) hits, while small pages provide fine-grained …

A survey of techniques for architecting TLBs

S Mittal - Concurrency and computation: practice and …, 2017 - Wiley Online Library
Translation lookaside buffer (TLB) caches virtual to physical address translation information
and is used in systems ranging from embedded devices to high‐end servers. Because TLB …

Large pages and lightweight memory management in virtualized environments: Can you have it both ways?

B Pham, J Veselý, GH Loh… - Proceedings of the 48th …, 2015 - dl.acm.org
Large pages have long been used to mitigate address translation overheads on big-memory
systems, particularly in virtualized environments where TLB miss overheads are severe. We …

Learning-based memory allocation for C++ server workloads

M Maas, DG Andersen, M Isard… - Proceedings of the …, 2020 - dl.acm.org
Modern C++ servers have memory footprints that vary widely over time, causing persistent
heap fragmentation of up to 2x from long-lived objects allocated during peak memory usage …

Prefetched address translation

A Margaritov, D Ustiugov, E Bugnion… - Proceedings of the 52nd …, 2019 - dl.acm.org
With explosive growth in dataset sizes and increasing machine memory capacities,
per-application memory footprints are commonly reaching into hundreds of GBs. Such huge …

A framework for memory oversubscription management in graphics processing units

C Li, R Ausavarungnirun, CJ Rossbach… - Proceedings of the …, 2019 - dl.acm.org
Modern discrete GPUs support unified memory and demand paging. Automatic
management of data movement between CPU memory and GPU memory dramatically …

MASK: Redesigning the GPU memory hierarchy to support multi-application concurrency

R Ausavarungnirun, V Miller, J Landgraf… - ACM SIGPLAN …, 2018 - dl.acm.org
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …