CoSA: Scheduling by constrained optimization for spatial accelerators

N Samardzic, A Feldmann, A Krastev… - Proceedings of the 49th …, 2022 - dl.acm.org

Fully Homomorphic Encryption (FHE) enables offloading computation to untrusted servers
with cryptographic privacy. Despite its attractive security, FHE is not yet widely adopted due …

Cited by 164 · Related articles · All 9 versions

[PDF] arxiv.org

Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

Cited by 90 · Related articles · All 4 versions

[PDF] acm.org

AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction

S Zheng, R Chen, A Wei, Y Jin, Q Han, L Lu… - Proceedings of the 49th …, 2022 - dl.acm.org

Hardware specialization is a promising trend to sustain performance growth. Spatial
hardware accelerators that employ specialized and hierarchical computation and memory …

Cited by 59 · Related articles · All 3 versions

[PDF] arxiv.org

Sparseloop: An analytical approach to sparse tensor accelerator modeling

YN Wu, PA Tsai, A Parashar, V Sze… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

In recent years, many accelerators have been proposed to efficiently process sparse tensor
algebra applications (e.g., sparse neural networks). However, these proposals are single …

Cited by 60 · Related articles · All 10 versions

[PDF] acm.org

TileFlow: A framework for modeling fusion dataflow via tree-based analysis

S Zheng, S Chen, S Gao, L Jia, G Sun… - Proceedings of the 56th …, 2023 - dl.acm.org

With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …

Cited by 14 · Related articles · All 5 versions

[PDF] acm.org

Inter-layer scheduling space definition and exploration for tiled accelerators

J Cai, Y Wei, Z Wu, S Peng, K Ma - Proceedings of the 50th Annual …, 2023 - dl.acm.org

With the continuous expansion of the DNN accelerator scale, inter-layer scheduling, which
studies the allocation of computing resources to each layer and the computing order of all …

Cited by 25 · Related articles · All 2 versions

[PDF] google.com

Chimera: An analytical optimizing framework for effective compute-intensive operators fusion

S Zheng, S Chen, P Song, R Chen, X Li… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Machine learning models with various tensor operators are becoming ubiquitous in recent
years. There are two types of operators in machine learning: compute-intensive operators …

Cited by 24 · Related articles · All 4 versions

[PDF] acm.org

DOSA: Differentiable model-based one-loop search for DNN accelerators

C Hong, Q Huang, G Dinh, M Subedar… - Proceedings of the 56th …, 2023 - dl.acm.org

In the hardware design space exploration process, it is critical to optimize both hardware
parameters and algorithm-to-hardware mappings. Previous work has largely approached …

Cited by 15 · Related articles · All 5 versions

[PDF] arxiv.org

MAGMA: An optimization framework for mapping multiple DNNs on multiple accelerator cores

SC Kao, T Krishna - 2022 IEEE International Symposium on …, 2022 - ieeexplore.ieee.org

As Deep Learning continues to drive a variety of applications in edge and cloud data
centers, there is a growing trend towards building large accelerators with several sub …

Cited by 49 · Related articles · All 5 versions

[PDF] arxiv.org

TeAAL: A declarative framework for modeling sparse tensor accelerators

N Nayak, TO Odemuyiwa, S Ugare, C Fletcher… - Proceedings of the 56th …, 2023 - dl.acm.org

Over the past few years, the explosion in sparse tensor algebra workloads has led to a
corresponding rise in domain-specific accelerators to service them. Due to the irregularity …

Cited by 16 · Related articles · All 8 versions
