In-memory parallel processing of massive remotely sensed data using an Apache Spark on Hadoop YARN model

W Huang, L Meng, D Zhang… - IEEE Journal of Selected …, 2016 - ieeexplore.ieee.org
MapReduce has been widely used in Hadoop for parallel processing larger-scale data for
the last decade. However, remote-sensing (RS) algorithms based on the programming …

Optimization in the Catalyst optimizer of Spark SQL

M Chawla, V Baniwal - Turkish Journal of Electrical …, 2018 - journals.tubitak.gov.tr
Apache Spark is one of the most technically challenged frameworks for cluster computing in
which data are processed in a parallel fashion. The cluster consists of unreliable machines …

Finding the best Box-Cox transformation from massive datasets on Spark

H Fang, B Yang, T Zhang - … Conference on Big Data (Big Data), 2017 - ieeexplore.ieee.org
In order to find the best linear regression model or polynomial regression model that fits the
data, traditional methods have to read the whole datasets repetitively and incur many …

Improving Big Data Box-Cox Transformation on Spark

H Fang - 2017 - search.proquest.com
This study investigates improving Spark computation with Box-Cox Information Array when it
is used to implement the linear regression models. In order to find the best linear regression …

AQuSerM 2006: Advances in Quality of Service Management

I Poernomo, G Wang - 2006 10th IEEE International Enterprise …, 2006 - computer.org