Authors
Khanasin Yamnual, Phond Phunchongharn, Tiranee Achalakul
Publication date
2017/5/13
Conference
2017 International Conference on Applied System Innovation (ICASI)
Pages
568-571
Publisher
IEEE
Description
Performance monitoring is essential for all subsystems, especially high performance computing systems. These systems are sensitive to errors and failures, which lead to data loss and can severely impact organizations. Consequently, resource information (e.g., CPU usage, memory usage, disk I/O usage) must be collected through system monitoring during operation so that it can be used for failure identification. However, a traditional monitoring system cannot detect failures. Since failures discovered later in operation are more difficult and more expensive to recover from, it is highly desirable to detect them as early as possible. In this paper, we propose a proactive failure detection framework based on a monitoring system for high performance computing systems. Our proposed monitoring system is based on Elasticsearch-Logstash-Kibana (ELK), which has the task of gathering …
Total citations
2018: 1, 2019: 3, 2020: 2, 2021: 3, 2022: 1, 2023: 1
Scholar articles
K Yamnual, P Phunchongharn, T Achalakul - 2017 International Conference on Applied System …, 2017