A reconfigurable fabric for accelerating large-scale datacenter services

A Shawahna, SM Sait, A El-Maleh - IEEE Access, 2018 - ieeexplore.ieee.org

Due to recent advances in digital technologies, and availability of credible data, an area of
artificial intelligence, deep learning, has emerged and has demonstrated its ability and …

Cited by 498 · Related articles · All 9 versions

The future of FPGA acceleration in datacenters and the cloud

C Bobda, JM Mbongue, P Chow, M Ewais… - ACM Transactions on …, 2022 - dl.acm.org

In this article, we survey existing academic and commercial efforts to provide Field-
Programmable Gate Array (FPGA) acceleration in datacenters and the cloud. The goal is a …

Cited by 90 · Related articles · All 6 versions

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems

Y Gan, Y Zhang, D Cheng, A Shetty, P Rathi… - Proceedings of the …, 2019 - dl.acm.org

Cloud services have recently started undergoing a major shift from monolithic applications,
to graphs of hundreds or thousands of loosely-coupled microservices. Microservices …

Cited by 622 · Related articles · All 9 versions

A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org

Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

Cited by 658 · Related articles · All 12 versions

Azure accelerated networking: SmartNICs in the public cloud

D Firestone, A Putnam, S Mundkur, D Chiou… - … USENIX Symposium on …, 2018 - usenix.org

Modern cloud architectures rely on each server running its own networking stack to
implement policies such as tunneling for virtual networks, security, and load balancing …

Cited by 683 · Related articles · All 5 versions

LegoOS: A disseminated, distributed OS for hardware resource disaggregation

Y Shan, Y Huang, Y Chen, Y Zhang - 13th USENIX Symposium on …, 2018 - usenix.org

The monolithic server model where a server is the unit of deployment, operation, and failure
is meeting its limits in the face of several recent hardware and application trends. To improve …

Cited by 394 · Related articles · All 22 versions

ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars

A Shafiee, A Nag, N Muralimanohar… - ACM SIGARCH …, 2016 - dl.acm.org

A number of recent efforts have attempted to design accelerators for popular machine
learning algorithms, such as those involving convolutional and deep neural networks (CNNs …

Cited by 2064 · Related articles · All 14 versions

VTR 8: High-performance CAD and customizable FPGA architecture modelling

KE Murray, O Petelin, S Zhong, JM Wang… - ACM Transactions on …, 2020 - dl.acm.org

Developing Field-programmable Gate Array (FPGA) architectures is challenging due to the
competing requirements of various application domains and changing manufacturing …

Cited by 243 · Related articles · All 8 versions

Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions

N Vasilache, O Zinenko, T Theodoridis, P Goyal… - arXiv preprint arXiv …, 2018 - arxiv.org

Deep learning models with convolutional and recurrent networks are now ubiquitous and
analyze massive amounts of audio, image, video, text and graph data, with applications in …

Cited by 475 · Related articles · All 6 versions

NVIDIA Tensor Core programmability, performance & precision

S Markidis, SW Der Chien, E Laure… - 2018 IEEE …, 2018 - ieeexplore.ieee.org

The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called Tensor Core
that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The …

Cited by 448 · Related articles · All 8 versions
