Efficient Multivariate Time Series Anomaly Detection Through Transfer Learning for Large-Scale Software Systems

Y Sun, M Liang, S Zhang, Z Che, Z Luo, D Li… - ACM Transactions on …, 2024 - dl.acm.org
Timely anomaly detection of multivariate time series (MTS) is of vital importance for
managing large-scale software systems. However, many deep learning-based MTS …

Multivariate Time Series Anomaly Detection based on Pre-trained Models with Dual-Attention Mechanism

Y Sun, Y Guo, M Liang, X Wen, J Kuang… - 2024 IEEE 35th …, 2024 - ieeexplore.ieee.org
In major tech companies, monitoring server performance data with anomaly detection
algorithms is crucial for assessing operational status. Existing models often require separate …

fKPISelect: Fault-Injection Based Automated KPI Selection for Practical Multivariate Anomaly Detection

X Zhang, Y Zhao, C Liu, L Wang, X Yang… - 2023 IEEE 34th …, 2023 - ieeexplore.ieee.org
IT services are now popularly hosted in cloud systems. In order to enhance the availability of
cloud services, an emerging approach for detecting failures of cloud components is to …

ENOVA: Autoscaling towards Cost-effective and Stable Serverless LLM Serving

T Huang, P Chen, K Gong, J Hawk, Z Bright… - arXiv preprint arXiv …, 2024 - arxiv.org
With the increasing popularity of large language model (LLM) backend systems, it is
common and necessary to deploy stable serverless serving of LLM on multi-GPU clusters …

On the Practicability of Deep Learning based Anomaly Detection for Modern Online Software Systems: A Pre-Train-and-Align Framework

Z He, P Chen, Z Zheng - ACM Transactions on Software Engineering and … - dl.acm.org
Operation and maintenance are critical activities in the whole life cycle of modern online
software systems, and anomaly detection is a crucial step of these activities. Recent studies …