Anomaly detection and failure root cause analysis in (micro) service-based cloud applications: A survey

J Soldani, A Brogi - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
The proliferation of services and service interactions within microservices and cloud-native
applications makes it harder to detect failures and to identify their possible root causes …

AI for IT operations (AIOps) on cloud platforms: Reviews, opportunities and challenges

Q Cheng, D Sahoo, A Saha, W Yang, C Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big
data generated by IT Operations processes, particularly in cloud infrastructures, to provide …

MicroRCA: Root cause localization of performance issues in microservices

L Wu, J Tordsson, E Elmroth… - NOMS 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Software architecture is undergoing a transition from monolithic architectures to
microservices to achieve resilience, agility and scalability in software development …

Root cause analysis of failures in microservices through causal discovery

A Ikram, S Chakraborty, S Mitra… - Advances in …, 2022 - proceedings.neurips.cc
Most cloud applications use a large number of smaller sub-components (called
microservices) that interact with each other in the form of a complex graph to provide the …

Microscope: Pinpoint performance issues with causal graphs in micro-service environments

JJ Lin, P Chen, Z Zheng - … , ICSOC 2018, Hangzhou, China, November 12 …, 2018 - Springer
Driven by the emerging business models (e.g., digital sales) and IT technologies (e.g., DevOps
and Cloud computing), the architecture of software is shifting from monolithic to microservice …

Localizing failure root causes in a microservice through causality inference

Y Meng, S Zhang, Y Sun, R Zhang, Z Hu… - 2020 IEEE/ACM 28th …, 2020 - ieeexplore.ieee.org
An increasing number of Internet applications are applying microservice architecture due to
its flexibility and clear logic. The stability of microservice is thus vitally important for these …

Eadro: An end-to-end troubleshooting framework for microservices on multi-source data

C Lee, T Yang, Z Chen, Y Su… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
The complexity and dynamism of microservices pose significant challenges to system
reliability, and thereby, automated troubleshooting is crucial. Effective root cause localization …

MicroRank: End-to-end latency issue localization with extended spectrum analysis in microservice environments

G Yu, P Chen, H Chen, Z Guan, Z Huang… - Proceedings of the Web …, 2021 - dl.acm.org
With the advantages of flexible scalability and fast delivery, microservice has become a
popular software architecture in the modern IT industry. However, the explosion in the …

Practical root cause localization for microservice systems via trace analysis

Z Li, J Chen, R Jiao, N Zhao, Z Wang… - 2021 IEEE/ACM 29th …, 2021 - ieeexplore.ieee.org
Microservice architecture is applied by an increasing number of systems because of its
benefits on delivery, scalability, and autonomy. It is essential but challenging to localize root …

AutoMAP: Diagnose your microservice-based web applications automatically

M Ma, J Xu, Y Wang, P Chen, Z Zhang… - Proceedings of The Web …, 2020 - dl.acm.org
The high complexity and dynamics of the microservice architecture make its application
diagnosis extremely challenging. Static troubleshooting approaches may fail to obtain …