Machine learning for anomaly detection: A systematic review

AB Nassif, MA Talib, Q Nasir, FM Dakalbab - IEEE Access, 2021 - ieeexplore.ieee.org
Anomaly detection has been used for decades to identify and extract anomalous
components from data. Many techniques have been used to detect anomalies. One of the …

Sage: practical and scalable ML-driven performance debugging in microservices

Y Gan, M Liang, S Dev, D Lo, C Delimitrou - Proceedings of the 26th …, 2021 - dl.acm.org
Cloud applications are increasingly shifting from large monolithic services to complex
graphs of loosely-coupled microservices. Despite the advantages of modularity and …

MicroRCA: Root cause localization of performance issues in microservices

L Wu, J Tordsson, E Elmroth… - NOMS 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Software architecture is undergoing a transition from monolithic architectures to
microservices to achieve resilience, agility and scalability in software development …

A survey of AIOps methods for failure management

P Notaro, J Cardoso, M Gerndt - ACM Transactions on Intelligent …, 2021 - dl.acm.org
Modern society is increasingly moving toward complex and distributed computing systems.
The increase in scale and complexity of these systems challenges O&M teams that perform …

Microscope: Pinpoint performance issues with causal graphs in micro-service environments

JJ Lin, P Chen, Z Zheng - … , ICSOC 2018, Hangzhou, China, November 12 …, 2018 - Springer
Driven by the emerging business models (e.g., digital sales) and IT technologies (e.g., DevOps
and Cloud computing), the architecture of software is shifting from monolithic to microservice …

ImDiffusion: Imputed diffusion models for multivariate time series anomaly detection

Y Chen, C Zhang, M Ma, Y Liu, R Ding, B Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Anomaly detection in multivariate time series data is of paramount importance for ensuring
the efficient operation of large-scale systems across diverse domains. However, accurately …

Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach

F Bayram, BS Ahmed - ACM Computing Surveys, 2024 - dl.acm.org
Artificial intelligence (AI), and especially its sub-field of Machine Learning (ML), are
impacting the daily lives of everyone with their ubiquitous applications. In recent years, AI …

Unsupervised sequential outlier detection with deep architectures

W Lu, Y Cheng, C Xiao, S Chang… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Unsupervised outlier detection is a vital task and has high impact on a wide variety of
application domains, such as image analysis and video surveillance. It also gains long …

Robust and accurate performance anomaly detection and prediction for cloud applications: a novel ensemble learning-based framework

R Xin, H Liu, P Chen, Z Zhao - Journal of Cloud Computing, 2023 - Springer
Effectively detecting run-time performance anomalies is crucial for clouds to identify
abnormal performance behavior and forestall future incidents. To be used for real-world …

Diagnosing performance variations in HPC applications using machine learning

O Tuncer, E Ates, Y Zhang, A Turk, J Brandt… - … Conference, ISC High …, 2017 - Springer
With the growing complexity and scale of high performance computing (HPC) systems,
application performance variation has become a significant challenge in efficient and …