Y Wu, Z Wang, WD Lu - arXiv preprint arXiv:2310.09385, 2023 - arxiv.org
Decoder-only Transformer models such as GPT have demonstrated superior performance in text generation by autoregressively predicting the next token. However, the performance of …
C Giannoula, P Yang, IF Vega, J Yang, YX Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Graph Neural Networks (GNNs) are emerging ML models for analyzing graph-structured data. GNN execution involves both compute-intensive and memory …
Data movement between memory and processors is a major bottleneck in modern computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this …
Computing on encrypted data is a promising approach to reduce data security and privacy risks, with homomorphic encryption serving as a facilitator in achieving this goal. In this work …
Machine learning (ML) algorithms [1]–[6] have become ubiquitous in many fields of science and technology due to their ability to learn from and improve with experience with minimal …
O Mutlu - arXiv preprint arXiv:2305.20000, 2023 - arxiv.org
Memory-centric computing aims to enable computation capability in and near all places where data is generated and stored. As such, it can greatly reduce the large negative …
Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward signals from experience datasets. However, RL training often faces memory limitations …
Machine Learning (ML) training on large-scale datasets is a very expensive and time-consuming workload. Processor-centric architectures (e.g., CPU, GPU) commonly used for …
Many companies rely on APIs of managed AI models such as OpenAI's GPT-4 to create AI-enabled experiences in their products. Along with the benefits of ease of use and shortened …