Pre-training code representation with semantic flow graph for effective bug localization

Y Du, Z Yu - Proceedings of the 31st ACM Joint European Software …, 2023 - dl.acm.org
Inspired by the great success of pre-training in natural language processing, pre-trained
models for programming languages have been widely used to promote code intelligence in …

Device placement using Laplacian PCA and graph attention networks

M Han, Y Zeng, H Shu, L Yue, J Zhang… - The Computer …, 2024 - academic.oup.com
The exponential growth in data and parameters in modern neural networks has created the
need to distribute these models across multiple devices for efficient training, resulting in the …

Murmuration: On-the-fly DNN Adaptation for SLO-Aware Distributed Inference in Dynamic Edge Environments

J Lin, M Li, SQ Zhang, A Leon-Garcia - Proceedings of the 53rd …, 2024 - dl.acm.org
The proliferation of Virtual and Augmented Reality (VR/AR) and Internet of Things (IoT)
applications is driving demand for efficient Deep Neural Network (DNN) inference at the …

A novel device placement approach based on position-aware subgraph neural networks

M Han, Y Zeng, J Zhang, Y Ren, M Xue, M Zhou - Neurocomputing, 2024 - Elsevier
Coping with the growing demand for data and parameters in today's complex neural network
(NN) models typically involves distributing them across multiple devices …

Automating Cloud Deployment for Real-Time Online Foundation Model Inference

Y Li, Z Li, Z Han, Q Zhang, X Ma - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
Deep neural network (DNN) foundation models currently exhibit high prediction accuracy
and strong adaptability across a broad range of tasks at remarkably large model scales. They …

Moirai: Towards Optimal Placement for Distributed Inference on Heterogeneous Devices

B Zhang, H Zhu, F Gao, Z Yang, SX Wang - arXiv preprint arXiv …, 2023 - arxiv.org
The escalating size of Deep Neural Networks (DNNs) has spurred a growing research
interest in hosting and serving DNN models across multiple devices. A number of studies …

Mercury: Fast and Optimal Device Placement for Large Deep Learning Models

H Xu, P Zhou, H Xie, Y Liao - Proceedings of the 52nd International Conference on …, 2023 - dl.acm.org
Rapidly expanding neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to …

A structure-aware framework for learning device placements on computation graphs

S Duan, H Ping, N Kanakaris, X Xiao, P Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing approaches for device placement ignore the topological features of computation
graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they …

Aware: Adaptive Distributed Training with Computation, Communication and Position Awareness for Deep Learning Model

Y Zeng, G Yi, Y Yin, J Wu, M Xue… - 2022 IEEE 24th Int …, 2022 - ieeexplore.ieee.org
The accuracy of neural networks can usually be improved by enlarging the dataset and
adding layers or operators to the network, owing to its strong composability. However, it …

Celeritas: Fast Optimizer for Large Dataflow Graphs

H Xu, Y Liao, H Xie, P Zhou - arXiv preprint arXiv:2208.00184, 2022 - arxiv.org
Rapidly growing neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to guarantee …