SPRIGHT: Extracting the server from serverless computing! High-performance eBPF-based event-driven, shared-memory processing

S Qi, L Monis, Z Zeng, I Wang… - Proceedings of the ACM …, 2022 - dl.acm.org
Serverless computing promises an efficient, low-cost compute capability in cloud
environments. However, existing solutions, epitomized by open-source platforms such as …

Evaluating the Role of Machine Learning in Defense Applications and Industry

EJ Alcántara Suárez, V Monzon Baeza - Machine Learning and …, 2023 - mdpi.com
Machine learning (ML) has become a critical technology in the defense sector, enabling the
development of advanced systems for threat detection, decision making, and autonomous …

A survey of GPU multitasking methods supported by hardware architecture

C Zhao, W Gao, F Nie, H Zhou - IEEE Transactions on Parallel …, 2021 - ieeexplore.ieee.org
The ability to support multitasking becomes more and more important in the development of
graphic processing unit (GPU). GPU multitasking methods are classified into three types …

Server load and network-aware adaptive deep learning inference offloading for edge platforms

J Ahn, Y Lee, J Ahn, JG Ko - Internet of Things, 2023 - Elsevier
This work presents DIAMOND, a deep neural network computation offloading scheme
consisting of a lightweight client-to-server latency profiling component combined with a …

Fine-grained accelerator partitioning for Machine Learning and Scientific Computing in Function as a Service Platform

A Dhakal, P Raith, L Ward, RP Hong Enriquez… - Proceedings of the SC' …, 2023 - dl.acm.org
Function-as-a-service (FaaS) is a promising execution environment for high-performance
computing (HPC) and machine learning (ML) applications as it offers developers a simple …

Edge computing technology for real-time video stream analytics

杨铮, 贺骁武, 吴家行, 王需, 赵毅 - 中国科学(信息科学), 2022 - researchgate.net
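Abstract: Real-time video stream analytics is of great value in scenarios such as intelligent surveillance, smart cities, and autonomous driving. However, its high computational load,
large bandwidth demand, and strict latency requirements make real-time video stream analytics difficult to deploy through the traditional cloud computing paradigm …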

SPRIGHT: High-Performance eBPF-Based Event-Driven, Shared-Memory Processing for Serverless Computing

S Qi, L Monis, Z Zeng, IC Wang… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org
Serverless computing promises an efficient, low-cost compute capability in cloud
environments. However, existing solutions, epitomized by open-source platforms such as …

Slice-Tune: A system for high performance DNN autotuning

A Dhakal, KK Ramakrishnan, SG Kulkarni… - Proceedings of the 23rd …, 2022 - dl.acm.org
Autotuning DNN models prior to their deployment is an essential but time-consuming task.
Using expensive (and power-hungry) GPU and TPU accelerators efficiently is also key …

Towards Optimal Preemptive GPU Time-Sharing for Edge Model Serving

Z Xia, Y Hao, J Duan, C Wang, J Jiang - Proceedings of the 9th …, 2023 - dl.acm.org
With GPUs increasingly shared by DNN models at the edge, a crucial tradeoff arises
between high GPU utilization and the ability of fast preemption when a high-priority request …