T Sipola, J Alatalo, T Kokkonen… - 2022 31st Conference …, 2022 - ieeexplore.ieee.org
The modern trend of moving artificial intelligence computation closer to the origin of the data has increased the demand for new hardware and software suitable for such …
YE Wang, GY Wei, D Brooks - arXiv preprint arXiv:1907.10701, 2019 - arxiv.org
Training deep learning models is compute-intensive, and there is an industry-wide trend towards hardware specialization to improve performance. To systematically benchmark …
Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in …
Z Jia, O Padon, J Thomas, T Warszawski… - Proceedings of the 27th …, 2019 - dl.acm.org
Existing deep neural network (DNN) frameworks optimize the computation graph of a DNN by applying graph transformations manually designed by human experts. This approach …
R Wang, P Chaudhari… - Proceedings of the …, 2023 - National Acad Sciences
Despite the great promise that machine learning has offered in many fields of medicine, it has also raised concerns about potential biases and poor generalization across genders …
This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed …
We present a new algorithm to automatically schedule Halide programs for high-performance image processing and deep learning. We significantly improve upon the …
N Rotem, J Fix, S Abdulrasool, G Catron… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly …
Transformers are one of the most important machine learning workloads today. Training one is a very compute-intensive task, often taking days or weeks, and significant attention has …