SETSUDŌ: Perturbation-based testing framework for scalable distributed systems

V Heorhiadi, S Rajagopalan, H Jamjoom… - 2016 IEEE 36th …, 2016 - ieeexplore.ieee.org

Modern Internet applications are being disaggregated into a microservice-based
architecture, with services being updated and deployed hundreds of times a day. The …

Cited by 182 · Related articles · All 8 versions

[PDF] usenix.org

SAMC: Semantic-Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems

T Leesatapornwongsa, M Hao, P Joshi… - … USENIX Symposium on …, 2014 - usenix.org

The last five years have seen a rise of implementation-level distributed system model
checkers (dmck) for verifying the reliability of real distributed systems. Existing dmcks …

Cited by 177 · Related articles · All 9 versions

An empirical study on crash recovery bugs in large-scale distributed systems

Y Gao, W Dou, F Qin, C Gao, D Wang, J Wei… - Proceedings of the …, 2018 - dl.acm.org

In large-scale distributed systems, node crashes are inevitable, and can happen at any time.
As such, distributed systems are usually designed to be resilient to these node crashes via …

Cited by 59 · Related articles · All 4 versions

[PDF] acm.org

FlyMC: Highly scalable testing of complex interleavings in distributed systems

JF Lukman, H Ke, CA Stuardo, RO Suminto… - Proceedings of the …, 2019 - dl.acm.org

We present a fast and scalable testing approach for datacenter/cloud systems such as
Cassandra, Hadoop, Spark, and ZooKeeper. The uniqueness of our approach is in its ability …

Cited by 56 · Related articles · All 4 versions

[PDF] acm.org

Service-level fault injection testing

CS Meiklejohn, A Estrada, Y Song, H Miller… - Proceedings of the …, 2021 - dl.acm.org

Companies today increasingly rely on microservice architectures to deliver service for their
large-scale mobile or web applications. However, not all developers working on these …

Cited by 23 · Related articles · All 4 versions

[PDF] otago.ac.nz

CrashTuner: Detecting crash-recovery bugs in cloud systems via meta-info analysis

J Lu, C Liu, L Li, X Feng, F Tan, J Yang… - Proceedings of the 27th …, 2019 - dl.acm.org

Crash-recovery bugs (bugs in crash-recovery-related mechanisms) are among the most
severe bugs in cloud systems and can easily cause system failures. It is notoriously difficult …

Cited by 33 · Related articles · All 7 versions

[PDF] acm.org

A study of failure recovery and logging of high-performance parallel file systems

R Han, OR Gatla, M Zheng, J Cao, D Zhang… - ACM Transactions on …, 2022 - dl.acm.org

Large-scale parallel file systems (PFSs) play an essential role in high-performance
computing (HPC). However, despite their importance, their reliability is much less studied or …

Cited by 17 · Related articles · All 6 versions

[PDF] acm.org

FCatch: Automatically detecting time-of-fault bugs in cloud systems

H Liu, X Wang, G Li, S Lu, F Ye, C Tian - ACM SIGPLAN Notices, 2018 - dl.acm.org

It is crucial for distributed systems to achieve high availability. Unfortunately, this is
challenging given the common component failures (i.e., faults). Developers often cannot …

Cited by 43 · Related articles · All 4 versions

[PDF] lujie.ac.cn

CloudRaid: hunting concurrency bugs in the cloud via log-mining

J Lu, F Li, L Li, X Feng - Proceedings of the 2018 26th ACM joint meeting …, 2018 - dl.acm.org

Cloud systems suffer from distributed concurrency bugs, which are notoriously difficult to
detect and often lead to data loss and service outage. This paper presents CloudRaid, a …

Cited by 34 · Related articles · All 2 versions

[PDF] cnrs.fr

Switching Gaussian process dynamic models for simultaneous composite motion tracking and recognition

J Chen, M Kim, Y Wang, Q Ji - 2009 IEEE Conference on …, 2009 - ieeexplore.ieee.org

Traditional dynamical systems used for motion tracking cannot effectively handle high
dimensionality of the motion states and composite dynamics. In this paper, to address both …

Cited by 76 · Related articles · All 12 versions
