MODIST: Transparent model checking of unmodified distributed systems

W Cui, X Ge, B Kasikci, B Niu, U Sharma… - … USENIX Symposium on …, 2018 - usenix.org

Debugging software failures in deployed systems is important because they impact real
users and customers. However, debugging such failures is notoriously hard in practice …

Cited by 102 · Related articles · All 18 versions

[PDF] usenix.org

An analysis of network-partitioning failures in cloud systems

A Alquraan, H Takruri, M Alfatafta… - 13th USENIX Symposium …, 2018 - usenix.org

We present a comprehensive study of 136 system failures attributed to network-partitioning
faults from 25 widely used distributed systems. We found that the majority of the failures led …

Cited by 88 · Related articles · All 15 versions

[PDF] tcse.cn

An empirical study on crash recovery bugs in large-scale distributed systems

Y Gao, W Dou, F Qin, C Gao, D Wang, J Wei… - Proceedings of the …, 2018 - dl.acm.org

In large-scale distributed systems, node crashes are inevitable, and can happen at any time.
As such, distributed systems are usually designed to be resilient to these node crashes via …

Cited by 52 · Related articles · All 4 versions

[PDF] illinois.edu

Survivability: design, formal modeling, and validation of cloud storage systems using Maude

R Bobba, J Grov, I Gupta, S Liu… - Assured cloud …, 2018 - books.google.com

To deal with large amounts of data while offering high availability, throughput, and low
latency, cloud computing systems rely on distributed, partitioned, and replicated data stores …

Cited by 43 · Related articles · All 6 versions

[PDF] acm.org

Inferring and asserting distributed system invariants

S Grant, H Cech, I Beschastnikh - Proceedings of the 40th International …, 2018 - dl.acm.org

Distributed systems are difficult to debug and understand. A key reason for this is distributed
state, which is not easily accessible and must be pieced together from the states of the …

Cited by 41 · Related articles · All 9 versions

[PDF] acm.org

FCatch: Automatically detecting time-of-fault bugs in cloud systems

H Liu, X Wang, G Li, S Lu, F Ye, C Tian - ACM SIGPLAN Notices, 2018 - dl.acm.org

It is crucial for distributed systems to achieve high availability. Unfortunately, this is
challenging given the common component failures (i.e., faults). Developers often cannot …

Cited by 41 · Related articles · All 4 versions

[PDF] lujie.ac.cn

CloudRaid: hunting concurrency bugs in the cloud via log-mining

J Lu, F Li, L Li, X Feng - Proceedings of the 2018 26th ACM joint meeting …, 2018 - dl.acm.org

Cloud systems suffer from distributed concurrency bugs, which are notoriously difficult to
detect and often lead to data loss and service outage. This paper presents CloudRaid, a …

Cited by 32 · Related articles · All 2 versions

[PDF] acm.org

Compositional programming and testing of dynamic distributed systems

A Desai, A Phanishayee, S Qadeer… - Proceedings of the ACM …, 2018 - dl.acm.org

A real-world distributed system is rarely implemented as a standalone monolithic system.
Instead, it is composed of multiple independent interacting components that together ensure …

Cited by 34 · Related articles · All 9 versions

[PDF] springer.com

Partial order aware concurrency sampling

X Yuan, J Yang, R Gu - … : 30th International Conference, CAV 2018, Held …, 2018 - Springer

We present POS, a concurrency testing approach that samples the partial order of
concurrent programs. POS uses a novel priority-based scheduling algorithm that …

Cited by 21 · Related articles · All 11 versions

[PDF] github.io

Combining model checking and testing

P Godefroid, K Sen - Handbook of Model Checking, 2018 - Springer

Model checking and testing have a lot in common. Over the last two decades,
significant progress has been made on how to broaden the scope of model checking from …

Cited by 29 · Related articles · All 7 versions
