Cloud services are omnipresent, and critical cloud service failure is a fact of life. To retain customers and prevent revenue loss, it is important to provide high reliability …
The task of root cause analysis (RCA) is to identify the root causes of system faults/failures by analyzing system monitoring data. Efficient RCA can greatly accelerate system failure …
Y Li, Y Lu, J Wang, Q Qi, J Wang… - … on Software Analysis …, 2023 - ieeexplore.ieee.org
Due to the complexity of microservice architecture, it is difficult to detect and localize microservice anomalies efficiently and achieve the target of high system …
M Hardt, W Orchard, P Blöbaum… - arXiv preprint arXiv …, 2023 - arxiv.org
Identifying root causes for unexpected or undesirable behavior in complex systems is a prevalent challenge. This issue becomes especially crucial in modern cloud applications …
A Trilla, O Yiboe, N Mijatovic, J Vitrià - arXiv preprint arXiv:2407.20700, 2024 - arxiv.org
This paper describes the development of a causal diagnosis approach for troubleshooting an industrial environment on the basis of the technical language expressed in Return on …
Recent work conceptualized root cause analysis (RCA) of anomalies via quantitative contribution analysis using causal counterfactuals in structural causal models (SCMs). The …
J Yang, Y Guo, Y Chen, Y Zhao - Applied Sciences, 2023 - mdpi.com
Microservice architecture has been widely adopted by large-scale applications. Due to the huge amount of data and complex microservice dependencies, it also poses new challenges …
Runtime failures and performance degradation are commonplace in modern cloud systems. For cloud providers, automatically determining the root cause of incidents is paramount to …
In a distributed information system, the scale of operational components such as applications, operating systems, databases, servers, and networks is immense, with intricate …