Modern distributed systems that operate concurrently generate interleaved logs. Identifiers (IDs) are always associated with active instances or entities in order to track them in logs …
Timely localization of the root causes of gray failure is essential for maintaining the stability of the server OS. Previous intrusive gray failure localization methods usually require …
L Zhang, Y Shi - Applied Soft Computing, 2023 - Elsevier
Despite the rapid advance of unsupervised reconstruction models in online service fault diagnosis, existing methods still lead to frequent false positive or false negative alarms …
Automated fault localization in large-scale cloud-based applications is challenging because it involves mining multivariate time series data from large volumes of operational monitoring …
Root Cause Analysis (RCA) is a crucial aspect of incident management in large-scale cloud services. While numerous studies have been proposed, existing surveys typically focus on …