A survey of techniques for optimizing deep learning on GPUs

S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to
its unique features, the GPU continues to remain the most widely used accelerator for DL …

ZeroGen: Efficient zero-shot learning via dataset generation

J Ye, J Gao, Q Li, H Xu, J Feng, Z Wu, T Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
There is a growing interest in dataset generation recently due to the superior generative
capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and …

A survey on green deep learning

J Xu, W Zhou, Z Fu, H Zhou, L Li - arXiv preprint arXiv:2111.05193, 2021 - arxiv.org
In recent years, larger and deeper models are springing up and continuously pushing
state-of-the-art (SOTA) results across various fields like natural language processing (NLP) and …

Checkmate: Breaking the memory wall with optimal tensor rematerialization

P Jain, A Jain, A Nrusimha, A Gholami… - Proceedings of …, 2020 - proceedings.mlsys.org
Modern neural networks are increasingly bottlenecked by the limited capacity of on-device
GPU memory. Prior work explores dropping activations as a strategy to scale to larger neural …

Reduce, reuse, recycle: Green information retrieval research

H Scells, S Zhuang, G Zuccon - … of the 45th International ACM SIGIR …, 2022 - dl.acm.org
Recent advances in Information Retrieval utilise energy-intensive hardware to produce
state-of-the-art results. In areas of research highly related to Information Retrieval, such as Natural …

Scrooge: A cost-effective deep learning inference system

Y Hu, R Ghosh, R Govindan - Proceedings of the ACM Symposium on …, 2021 - dl.acm.org
Advances in deep learning (DL) have prompted the development of cloud-hosted DL-based
media applications that process video and audio streams in real-time. Such applications …

RIM: Offloading inference to the edge

Y Hu, W Pang, X Liu, R Ghosh, B Ko, WH Lee… - Proceedings of the …, 2021 - dl.acm.org
Video cameras are among the most ubiquitous sensors in the Internet-of-Things. Video and
audio applications, such as cross-camera activity detection, avatar extraction or language …

A green(er) world for AI

D Zhao, NC Frey, J McDonald… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
As research and practice in artificial intelligence (AI) grow in leaps and bounds, the
resources necessary to sustain and support their operations also grow at an increasing …

Irina: Accelerating DNN inference with efficient online scheduling

X Wu, H Xu, Y Wang - Proceedings of the 4th Asia-Pacific Workshop on …, 2020 - dl.acm.org
DNN inference is becoming prevalent for many real-world applications. Current machine
learning frameworks usually schedule inference tasks with the goal of optimizing throughput …

GreenSeq: Automatic Design of Green Networks for Sequential Recommendation Systems

Y Ren, X Yang, X Lu, L Li, J Zhou, J Gu… - Proceedings of the 46th …, 2023 - dl.acm.org
Transformer-based models have achieved tremendous success in sequential
recommendation (SR), but they suffer from consuming excessive computational resources …