Load balancing and server consolidation in cloud computing environments: a meta-study

M Ala'Anzy, M Othman - IEEE Access, 2019 - ieeexplore.ieee.org
The data-center is considered the heart of cloud computing. Recently, the growing demand
for cloud computing services has caused a growing load on data centers. In terms of system …

Software-hardware embedded system reliability modeling with failure dependency and masked data

Z Zheng, J Yang, J Huang - Computers & Industrial Engineering, 2023 - Elsevier
Traditional system reliability models often ignore failure dependency between subsystems
and the existence of system failure masked data, thereby they can't accurately reflect the …

A study of service reliability and availability for distributed systems

YS Dai, M Xie, KL Poh, GQ Liu - Reliability Engineering & System Safety, 2003 - Elsevier
Distributed systems are usually designed and developed to provide certain important
services such as in computing and communication systems. In this paper, a general model is …

Survey of combined hardware–software reliability prediction approaches from architectural and system failure viewpoint

S Sinha, NK Goyal, R Mall - International Journal of System Assurance …, 2019 - Springer
Apart from hardware and software-specific failures, failures arising from hardware–software
interaction cause notorious system failures. Researchers have reported two types of …

A model for availability analysis of distributed software/hardware systems

CD Lai, M Xie, KL Poh, YS Dai, P Yang - Information and software …, 2002 - Elsevier
System availability is a major performance concern in distributed systems design and
analysis. A typical kind of application on distributed systems has a homogeneously …

Reliability modeling of hardware and software interactions, and its applications

X Teng, H Pham, DR Jeske - IEEE Transactions on Reliability, 2006 - ieeexplore.ieee.org
We classify system failures into three categories: hardware failures, software failures, and
hardware-software interaction failures. We develop a unified reliability model that accounts …

Meaningful availability

T Hauer, P Hoffmann, J Lunney, D Ardelean… - … USENIX Symposium on …, 2020 - usenix.org
High availability is a critical requirement for cloud applications: if a system does not have
high availability, users cannot count on it for their critical work. Having a metric that …

Modeling and analysis of correlated software failures of multiple types

YS Dai, M Xie, KL Poh - IEEE Transactions on Reliability, 2005 - ieeexplore.ieee.org
Most software reliability models assume independence of successive software runs. It is a
strict assumption, and usually not valid in reality. Goseva-Popstojanova & Trivedi (2000) …

Designing and modelling selective replication for fault-tolerant HPC applications

O Subasi, G Yalcin, F Zyulkyarov… - 2017 17th IEEE/ACM …, 2017 - ieeexplore.ieee.org
Fail-stop errors and Silent Data Corruptions (SDCs) are the most common failure modes for
High Performance Computing (HPC) applications. There are studies that address fail-stop …

A novel system reliability modeling of hardware, software, and interactions of hardware and software

M Zhu, H Pham - Mathematics, 2019 - mdpi.com
In the past few decades, a great number of hardware and software reliability models have
been proposed to address hardware failures in hardware subsystems and software failures …