Task scheduling in big data - review, research challenges, and prospects

K Govindarajan, S Kamburugamuve… - 2017 Ninth …, 2017 - ieeexplore.ieee.org
In a Big data computing, the processing of data requires a large amount of CPU cycles and
network bandwidth and disk I/O. Dataflow is a programming model for processing Big data …

Coded elastic computing

Y Yang, M Interlandi, P Grover, S Kar… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Cloud providers have recently introduced new offerings whereby spare computing
resources are accessible at discounts compared to on-demand computing. Exploiting such …

Performance prediction of data streams on high-performance architecture

B Gautam, A Basava - Human-centric Computing and Information …, 2019 - Springer
Worldwide sensor streams are expanding continuously with unbounded velocity in volume,
and for this acceleration, there is an adaptation of large stream data processing system from …

Harmony: A scheduling framework optimized for multiple distributed machine learning jobs

WY Lee, Y Lee, WW Song, Y Yang… - 2021 IEEE 41st …, 2021 - ieeexplore.ieee.org
We introduce Harmony, a new scheduling framework that executes multiple Parameter-
Server ML training jobs together to improve cluster resource utilization. Harmony …

Role of apache software foundation in big data projects

A Akhtar - arXiv preprint arXiv:2005.02829, 2020 - arxiv.org
With the increase in amount of Big Data being generated each year, tools and technologies
developed and used for the purpose of storing, processing and analyzing Big Data has also …

Scalable multi-framework multi-tenant lifecycle management of deep learning applications

JK Radhakrishnan, V Muthusamy, V Isahagian… - US Patent …, 2021 - Google Patents
A lifecycle management method, system, and computer program product
include establishing a public key infrastructure (PKI) for end-to-end encryption of control …

Scalable multi-framework multi-tenant lifecycle management of deep learning applications

JK Radhakrishnan, V Muthusamy, V Isahagian… - US Patent …, 2022 - Google Patents
2019-03-21 Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION
reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF …

Robust class parallelism - error resilient parallel inference with low communication cost

Y Yang, J Chung, G Wang, V Gupta… - 2020 54th Asilomar …, 2020 - ieeexplore.ieee.org
Model parallelism is a standard paradigm to decouple a deep neural network (DNN) into
sub-nets when the model is large. Recent advances in class parallelism significantly reduce …

[BOOK][B] Coded computing systems decoded: dealing with unreliability and elasticity in modern computing

Y Yang - 2019 - search.proquest.com
Robustness is a fundamental and timeless issue, and it remains vital to all aspects of
computation systems, regardless of specific computation platforms, architectures, and …

Runtime Optimization Techniques for Resource-Efficient Execution of Distributed Machine Learning

이우연 - 2020 - s-space.snu.ac.kr
We build a working system that implements our approaches. The above two solutions are
implemented in the same system and share the runtime part that can dynamically migrate …