The family of MapReduce and large-scale data processing systems

S Sakr, A Liu, AG Fayoumi - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …

X-Stream: Edge-centric graph processing using streaming partitions

A Roy, I Mihailovic, W Zwaenepoel - Proceedings of the Twenty-Fourth …, 2013 - dl.acm.org
X-Stream is a system for processing both in-memory and out-of-core graphs on a single
shared-memory machine. While retaining the scatter-gather programming model with state …

GPS: A graph processing system

S Salihoglu, J Widom - Proceedings of the 25th international conference …, 2013 - dl.acm.org
GPS (for Graph Processing System) is a complete open-source system we developed for
scalable, fault-tolerant, and easy-to-program execution of algorithms on extremely large …

MLbase: A Distributed Machine-learning System.

T Kraska, A Talwalkar, JC Duchi, R Griffith, MJ Franklin… - CIDR, 2013 - i.stanford.edu
Machine learning (ML) and statistical techniques are key to transforming big data into
actionable knowledge. In spite of the modern primacy of data, the complexity of existing ML …

Direction-optimizing breadth-first search

S Beamer, K Asanović, D Patterson - Scientific Programming, 2013 - Wiley Online Library
Breadth-First Search is an important kernel used by many graph-processing applications. In
many of these emerging applications of BFS, such as analyzing social networks, the input …

MLI: An API for distributed machine learning

ER Sparks, A Talwalkar, V Smith… - 2013 IEEE 13th …, 2013 - ieeexplore.ieee.org
MLI is an Application Programming Interface designed to address the challenges of building
Machine Learning algorithms in a distributed setting based on data-centric computing. Its …

Asynchronous Large-Scale Graph Processing Made Easy.

G Wang, W Xie, AJ Demers, J Gehrke - CIDR, 2013 - academia.edu
Scaling large iterative graph processing applications through parallel computing is a very
important problem. Several graph processing frameworks have been proposed that insulate …

Accelerated mini-batch stochastic dual coordinate ascent

S Shalev-Shwartz, T Zhang - Advances in Neural …, 2013 - proceedings.neurips.cc
Stochastic dual coordinate ascent (SDCA) is an effective technique for solving regularized
loss minimization problems in machine learning. This paper considers an extension of …

Presto: distributed machine learning and graph processing with sparse matrices

S Venkataraman, E Bodzsar, I Roy… - Proceedings of the 8th …, 2013 - dl.acm.org
It is cumbersome to write machine learning and graph algorithms in data-parallel models
such as MapReduce and Dryad. We observe that these algorithms are based on matrix …

Maiter: An asynchronous graph processing framework for delta-based accumulative iterative computation

Y Zhang, Q Gao, L Gao, C Wang - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
Myriad of graph-based algorithms in machine learning and data mining require parsing
relational data iteratively. These algorithms are implemented in a large-scale distributed …