Mask: Redesigning the GPU memory hierarchy to support multi-application concurrency

R Ausavarungnirun, V Miller, J Landgraf… - ACM SIGPLAN …, 2018 - dl.acm.org
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …

Dynamic resource management for efficient utilization of multitasking GPUs

JJK Park, Y Park, S Mahlke - Proceedings of the twenty-second …, 2017 - dl.acm.org
As graphics processing units (GPUs) are broadly adopted, running multiple applications on
a GPU at the same time is beginning to attract wide attention. Recent proposals on …

G-NET: Effective GPU Sharing in NFV Systems

K Zhang, B He, J Hu, Z Wang, B Hua, J Meng… - … USENIX Symposium on …, 2018 - usenix.org
Network Function Virtualization (NFV) virtualizes software network functions to offer flexibility
in their design, management and deployment. Although GPUs have demonstrated their …

Constructing and characterizing covert channels on GPGPUs

H Naghibijouybari, KN Khasawneh… - Proceedings of the 50th …, 2017 - dl.acm.org
General Purpose Graphics Processing Units (GPGPUs) are present in most modern
computing platforms. They are also increasingly integrated as a computational resource on …

Convolutional neural network with element-wise filters to extract hierarchical topological features for brain networks

X Xing, J Ji, Y Yao - 2018 IEEE international conference on …, 2018 - ieeexplore.ieee.org
Human brain network analysis based on machine learning has been paid much attention in
the field of neuroimaging, where the application of convolutional neural network (CNN) is …

Quality of service support for fine-grained sharing on GPUs

Z Wang, J Yang, R Melhem, B Childers… - Proceedings of the 44th …, 2017 - dl.acm.org
GPUs have been widely adopted in data centers to provide acceleration services to many
applications. Sharing a GPU is increasingly important for better processing throughput and …

KRISP: Enabling kernel-wise right-sizing for spatial partitioned GPU inference servers

M Chow, A Jahanshahi, D Wong - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Machine learning (ML) inference workloads present significantly different challenges than
ML training workloads. Typically, inference workloads are shorter running and under-utilize …

Network-on-chip microarchitecture-based covert channel in GPUs

J Ahn, J Kim, H Kasan, L Delshadtehrani… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
As GPUs are becoming widely deployed in the cloud infrastructure to support different
application domains, the security concerns of GPUs are becoming increasingly important. In …

Virtual thread: Maximizing thread-level parallelism beyond GPU scheduling limit

MK Yoon, K Kim, S Lee, WW Ro… - ACM SIGARCH Computer …, 2016 - dl.acm.org
Modern GPUs require tens of thousands of concurrent threads to fully utilize the massive
amount of processing resources. However, thread concurrency in GPUs can be diminished …

Morpheus: Extending the last level cache capacity in GPU systems using idle GPU core resources

S Darabi, M Sadrosadati, N Akbarzadeh… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel
applications. In many GPU applications, GPU memory bandwidth bottlenecks performance …