Workflow systems for science: Concepts and tools

D Talia - International Scholarly Research Notices, 2013 - Wiley Online Library
The wide availability of high‐performance computing systems, Grids and Clouds, allowed
scientists and engineers to implement more and more complex applications to access and …

Distributed data mining: a survey

L Zeng, L Li, L Duan, K Lu, Z Shi, M Wang… - Information Technology …, 2012 - Springer
Most data mining approaches assume that the data can be provided from a single source. If
data was produced from many physically distributed locations like Wal-Mart, these methods …

The WEKA data mining software: an update

M Hall, E Frank, G Holmes, B Pfahringer… - ACM SIGKDD …, 2009 - dl.acm.org
More than twelve years have elapsed since the first public release of WEKA. In that time, the
software has been rewritten entirely from scratch, evolved substantially and now …

Active learning for sentiment analysis on data streams: Methodology and workflow implementation in the ClowdFlows platform

J Kranjc, J Smailović, V Podpečan, M Grčar… - Information Processing …, 2015 - Elsevier
Sentiment analysis from data streams is aimed at detecting authors' attitude, emotions and
opinions from texts in real-time. To reduce the labeling effort needed in the data collection …

Xel: A cloud-agnostic data platform for the design-driven building of high-availability data science services

JA Barron-Lugo, JL Gonzalez-Compean… - Future Generation …, 2023 - Elsevier
This paper presents Xel, a cloud-agnostic data platform for the design-driven building of
high-availability data science services as a support tool for data-driven decision-making. We …

ClowdFlows: Online workflows for distributed big data mining

J Kranjc, R Orač, V Podpečan, N Lavrač… - Future Generation …, 2017 - Elsevier
The paper presents a platform for distributed computing, developed using the latest software
technologies and computing paradigms to enable big data mining. The platform, called …

A parallel distributed Weka framework for big data mining using Spark

AK Koliopoulos, P Yiapanis, F Tekiner… - … congress on big …, 2015 - ieeexplore.ieee.org
Effective Big Data Mining requires scalable and efficient solutions that are also accessible to
users of all levels of expertise. Despite this, many current efforts to provide effective …

GPU-based bees swarm optimization for association rules mining

Y Djenouri, A Bendjoudi, M Mehdi… - The Journal of …, 2015 - Springer
Association rules mining (ARM) is a well-known combinatorial optimization problem aiming
at extracting relevant rules from given large-scale datasets. According to the state of the art …

Toolkit-based high-performance data mining of large data on MapReduce clusters

D Wegener, M Mock, D Adranale… - 2009 IEEE International …, 2009 - ieeexplore.ieee.org
The enormous growth of data in a variety of applications has increased the need for high
performance data mining based on distributed environments. However, standard data …

A semantic framework for automatic generation of computational workflows using distributed data and component catalogues

Y Gil, PA Gonzalez-Calero, J Kim, J Moody… - … of Experimental & …, 2011 - Taylor & Francis
Computational workflows are a powerful paradigm to represent and manage complex
applications, particularly in large-scale distributed scientific data analysis. Workflows …