Large Language Models (LLMs) such as GPTs and LLaMA have ushered in a revolution in machine intelligence, owing to their exceptional capabilities in a wide range of machine …
Real-time depth estimation is critical for the increasingly popular augmented reality and virtual reality applications on mobile devices. Yet existing solutions are insufficient as they …
On-device Deep Neural Network (DNN) inference consumes significant computing resources and development effort. To alleviate that, we propose LUT-NN, the first system to …
H Park, FX Lin - Proceedings of the Eighteenth European Conference …, 2023 - dl.acm.org
For mobile devices, it is compelling to run sensitive GPU computation within a TrustZone trusted execution environment (TEE). To minimize GPU software deployed in TEE, the …
J Zhang, C Zhou, H Yang, D Zhang… - IEEE Consumer …, 2024 - ieeexplore.ieee.org
The rapid advancement of edge artificial intelligence (AI) can be attributed to the widespread use of edge consumer devices and the enhancement in System-on-Chip (SoC) capabilities …
R Liu, Y Leng, S Tian, S Hu, CF Chen… - Proceedings of the 22nd …, 2024 - dl.acm.org
Recent advancements in exploring machine learning models' dynamic spatial sparsity have demonstrated great potential for superior efficiency and adaptability without compromising …
F Jia, S Jiang, T Cao, W Cui, T Xia, X Cao, Y Li… - Proceedings of the …, 2024 - dl.acm.org
The Web is increasingly becoming the primary platform for delivering AI services to edge devices, making in-browser deep learning (DL) inference more prominent. Nevertheless, the …
This work is motivated by recent developments in Deep Neural Networks, particularly the Transformer architectures underlying applications such as ChatGPT, and the need for …
X Qin, Y Li, F Lin, W Li - Journal of Systems Architecture, 2025 - Elsevier
This paper introduces NLTSP, a deep learning-based cost model designed to optimize tensor program performance in deep learning compilers. NLTSP, short for Nested Loop Tree …