ROMANet: Fine-grained reuse-driven off-chip memory access management and data organization for deep neural network accelerators

RVW Putra, MA Hanif… - IEEE Transactions on Very …, 2021 - ieeexplore.ieee.org
… the ROMANet methodology that aims at reducing the number of memory accesses, … network
using a design space exploration, based on the knowledge of the available on-chip memory

[BOOK][B] Network Algorithmics: an interdisciplinary approach to designing fast networked devices

G Varghese, J Xu - 2022 - books.google.com
… However, the book goes further and distills a fundamental method of crafting solutions to
new network bottlenecks that we call network algorithmics. This provides the reader tools to …

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer
… Although many researchers have presented various PIM methods, only a few PIM systems
… For instance, a circuit-level PIM based on SRAM array stores network weights in SRAM cells, …

Mapping techniques in multicore processors: current and future trends

M Gupta, L Bhargava, S Indu - The Journal of Supercomputing, 2021 - Springer
… which interact via a communication network. The diversity in cores … multicore processors
has been focussed on this methodology… Various thermal management methods—voltage and …

Subway: Minimizing data transfer during out-of-GPU-memory graph processing

AHN Sabet, Z Zhao, R Gupta - … of the Fifteenth European Conference on …, 2020 - dl.acm.org
… In summary, both the existing partitioning and unified memory-based methods load the
graph based on coarse-grained activeness tracking, fundamentally limiting their performance …

A multi-neural network acceleration architecture

E Baek, D Kwon, J Kim - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org
methods: memory block prefetching and compute block merging for the best resource load
matching, and memory block eviction for the minimum on-chip memory … scheduling method. …

Efficient memory management for large language model serving with PagedAttention

W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng… - Proceedings of the 29th …, 2023 - dl.acm.org
… how this design facilitates effective memory management for various decoding methods (§4.4) …
Superneurons: Dynamic GPU memory management for training deep neural networks. In …

Memory pooling with CXL

D Gouk, M Kwon, H Bae, S Lee, M Jung - IEEE Micro, 2023 - ieeexplore.ieee.org
… Prototype Implementation Figure 5a illustrates our design of a CXL network topology to
disaggregate memory resources, and the corresponding implementation in a real system is …

TPP: Transparent page placement for CXL-enabled tiered-memory

HA Maruf, H Wang, A Dhanotia, J Weiner… - Proceedings of the 28th …, 2023 - dl.acm.org
memory hierarchies and leads to stranded compute, network… The methodology to identify
the ideal fraction of such working … transparent memory management for memory bandwidth-…

Cache coherence protocols in distributed systems

H Shukur, S Zeebaree, R Zebari, O Ahmed… - Journal of Applied …, 2020 - jastt.org
memory devices is to maintain the cache coherently. In this paper, we presented a number
of methods … and which type of protocol used for network topology and provided the type of …