A checkpoint of research on parallel I/O for high-performance computing

FZ Boito, EC Inacio, JL Bez, POA Navaux… - ACM Computing …, 2018 - dl.acm.org
We present a comprehensive survey on parallel I/O in the high-performance computing
(HPC) context. This is an important field for HPC because of the historic gap between …

FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems

D Zhao, Z Zhang, X Zhou, T Li, K Wang… - … conference on big …, 2014 - ieeexplore.ieee.org
State-of-the-art, yet decades-old, architecture of high-performance computing systems has
its compute and storage resources separated. It thus is limited for modern data-intensive …

HyperDrive: Exploring hyperparameters with POP scheduling

J Rasley, Y He, F Yan, O Ruwase… - Proceedings of the 18th …, 2017 - dl.acm.org
The quality of machine learning (ML) and deep learning (DL) models is very sensitive to
many different adjustable parameters that are set before training even begins, commonly …

Towards exploring data-intensive scientific applications at extreme scales through systems and simulations

D Zhao, N Liu, D Kimpe, R Ross… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
The state-of-the-art storage architecture of high-performance computing systems was
designed decades ago, and with today's scale and level of concurrency, it is showing …

VectorSearch: Enhancing Document Retrieval with Semantic Embeddings and Optimized Search

SS Monir, I Lau, S Yang, D Zhao - arXiv preprint arXiv:2409.17383, 2024 - arxiv.org
Traditional retrieval methods have been essential for assessing document similarity but
struggle with capturing semantic nuances. Despite advancements in latent semantic …

Combining Buffered I/O and Direct I/O in Distributed File Systems

Y Qian, MA Vef, P Farrell, A Dilger, X Li… - … USENIX Conference on …, 2024 - usenix.org
Direct I/O allows I/O requests to bypass the Linux page cache and was introduced over 20
years ago as an alternative to the default buffered I/O mode. However, high-performance …

Improving batch scheduling on Blue Gene/Q by relaxing 5D torus network allocation constraints

Z Zhou, X Yang, Z Lan, P Rich, W Tang… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
As systems scale toward exascale, many resources will become increasingly constrained.
While some of these resources have historically been explicitly allocated, many--such as …

A survey on innovative approach for improvement in efficiency of caching technique for big data application

S Tamboli, SS Patel - 2015 International Conference on …, 2015 - ieeexplore.ieee.org
Big Data is playing an important role in scientific, industrial and academic areas. Information is
being generated every day by millions of computing machines and collected for future use …

High-performance storage support for scientific applications on the cloud

D Zhao, X Yang, I Sadooghi, G Garzoglio… - Proceedings of the 6th …, 2015 - dl.acm.org
Although cloud computing has become one of the most popular paradigms for executing
data-intensive applications (for example, Hadoop), the storage subsystem is not optimized …

Virtual chunks: On supporting random accesses to scientific data in compressible storage systems

D Zhao, J Yin, K Qiao, I Raicu - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
Data compression could ameliorate the I/O pressure of scientific applications on high-
performance computing systems. Unfortunately, the conventional wisdom of naively …