Performance anomaly detection and bottleneck identification

O Ibidunmoye, F Hernández-Rodriguez… - ACM Computing Surveys …, 2015 - dl.acm.org
In order to meet stringent performance requirements, system administrators must effectively
detect undesirable performance behaviours, identify potential root causes, and take …

Adaptive anomaly identification by exploring metric subspace in cloud computing infrastructures

Q Guan, S Fu - 2013 IEEE 32nd International Symposium on …, 2013 - ieeexplore.ieee.org
Cloud computing has become increasingly popular by obviating the need for users to own
and maintain complex computing infrastructures. However, due to their inherent complexity …

Toward automated anomaly identification in large-scale systems

Z Lan, Z Zheng, Y Li - IEEE Transactions on Parallel and …, 2009 - ieeexplore.ieee.org
When a system fails to function properly, health-related data are collected for
troubleshooting. However, it is challenging to effectively identify anomalies from the …

Failure analysis of distributed scientific workflows executing in the cloud

T Samak, D Gunter, M Goode… - … on network and …, 2012 - ieeexplore.ieee.org
This work presents models characterizing failures observed during the execution of large
scientific applications on Amazon EC2. Scientific workflows are used as the underlying …

Online fault and anomaly detection for large-scale scientific workflows

T Samak, D Gunter, M Goode… - … Conference on High …, 2011 - ieeexplore.ieee.org
Scientific workflows are an enabler of complex scientific analyses. Large-scale scientific
workflows are executed on complex parallel and distributed resources, where many things …

OnTimeDetect: Dynamic network anomaly notification in perfSONAR deployments

P Calyam, J Pu, W Mandrawa… - … on Modeling, Analysis …, 2010 - ieeexplore.ieee.org
To monitor and diagnose bottlenecks on network paths used for large-scale data transfers,
there is an increasing trend to deploy measurement frameworks such as perfSONAR. These …

Fourier transformation autoencoders for anomaly detection

D Lappas, V Argyriou, D Makris - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Anomaly detection is a challenging problem, mainly due to the lack of a sufficient set of
abnormal samples that represents every possible anomaly. Therefore unsupervised …

SyncChecker: detecting synchronization errors between MPI applications and libraries

Z Chen, X Li, JY Chen, H Zhong… - 2012 IEEE 26th …, 2012 - ieeexplore.ieee.org
While improving the performance, nonblocking communication is prone to synchronization
errors between MPI applications and the underlying MPI libraries. Such synchronization …

AI‐Driven Performance Management in Data‐Intensive Applications

A Alnafessah, G Russo Russo… - … Management in the …, 2021 - Wiley Online Library
Data‐intensive applications have attracted considerable attention in recent years. Business
organizations are increasingly becoming data‐driven and therefore look for novel ways to …

MicroNet: Operation Aware Root Cause Identification of Microservice System Anomalies

J Yang, Y Guo, Y Chen, Y Zhao - IEEE Transactions on Network …, 2024 - ieeexplore.ieee.org
Microservice architecture has been widely adopted in large-scale applications. However, it
also brings new challenges to ensuring reliable performance and maintenance due to the …