Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference

C Wolters, X Yang, U Schlichtmann… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have recently transformed natural language processing,
enabling machines to generate human-like text and engage in meaningful conversations …

SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems

K Gogineni, SS Dayapule, J Gómez-Luna… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward
signals from experience datasets. However, RL training often faces memory limitations …

Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System

S Rhyner, H Luo, J Gómez-Luna, M Sadrosadati… - arXiv preprint arXiv …, 2024 - arxiv.org
Machine Learning (ML) training on large-scale datasets is a very expensive and time-
consuming workload. Processor-centric architectures (e.g., CPU, GPU) commonly used for …

High-Performance Process-in-Memory Architectures Design and Security Analysis

Z Wang - 2024 - deepblue.lib.umich.edu
The performance of processor-centric von Neumann architectures is greatly hindered by
data movement between memory and processor, especially when encountering data …