DOSA: Differentiable Model-Based One-Loop Search for DNN Accelerators

C Hong, Q Huang, G Dinh, M Subedar… - Proceedings of the 56th …, 2023 - dl.acm.org
In the hardware design space exploration process, it is critical to optimize both hardware
parameters and algorithm-to-hardware mappings. Previous work has largely approached …

The Dawn of AI-Native EDA: Promises and Challenges of Large Circuit Models

L Chen, Y Chen, Z Chu, W Fang, TY Ho… - arXiv preprint arXiv …, 2024 - arxiv.org
Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged
as formidable tools, yet they typically augment rather than redefine existing methodologies …

Workload-Aware Hardware Accelerator Mining for Distributed Deep Learning Training

M Adnan, A Phanishayee, J Kulkarni, PJ Nair… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present a novel technique to search for hardware architectures of
accelerators optimized for end-to-end training of deep neural networks (DNNs). Our …

Integrated Hardware Architecture and Device Placement Search

I Wang, J Tarnawski, A Phanishayee… - arXiv preprint arXiv …, 2024 - arxiv.org
Distributed execution of deep learning training involves a dynamic interplay between
hardware accelerator architecture and device placement strategy. This is the first work to …

Theseus: Towards High-Efficiency Wafer-Scale Chip Design Space Exploration for Large Language Models

J Zhu, C Xue, Y Chen, Z Wang, G Sun - arXiv preprint arXiv:2407.02079, 2024 - arxiv.org
The emergence of the large language model (LLM) poses an exponential growth of
demand for computation throughput, memory capacity, and communication bandwidth. Such …

TensorMap: A Deep RL-Based Tensor Mapping Framework for Spatial Accelerators

F Wang, M Shen, Y Lu, N Xiao - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The mapping of tensor computation is a complex and important process for spatial
accelerators. Today's mapping works depend on hand-tuned kernel libraries or search …

Sample-Efficient Mapspace Optimization for DNN Accelerators with Bayesian Learning

G Dinh, IKJ Valsala, H Luo, C Hong, Y Cho… - … and System Support …, 2023 - openreview.net
Achieving high performance for machine learning domain-specific accelerators requires the
careful choice of a mapping from an algorithm to an accelerator. Most algorithms for finding …

[PDF] Optimization-Based Mappers and Lower Bounds for Tensor Problems

G Dinh - 2023 - digitalassets.lib.berkeley.edu
Tensor operations play a central role in many applications, such as machine learning, signal
processing, and dense linear algebra. These tensor operations are growing significantly …

[PDF] Mind the Gap: Attainable Data Movement and Operational Intensity Bounds for Tensor Algorithms

Q Huang, PA Tsai, JS Emer, A Parashar - people.csail.mit.edu
The architectural design-space exploration (or DSE) process—whether manual or
automated—benefits greatly from knowing the limits of the metrics of interest in advance …

[PDF] Incorporating Prior Knowledge to Efficiently Design Deep Learning Accelerators

D Chiou, M Erez, A Parashar - cs.utexas.edu
The path to a PhD is mired with taxing twists and turns, and, more than the research, it was
the people around me that kept me going. I owe deep thanks to many, but in this brief …