Big data in cloud computing: A resource management perspective

S Ullah, MD Awan… - Scientific …, 2018 - Wiley Online Library
The modern day advancement is increasingly digitizing our lives which has led to a rapid
growth of data. Such multidimensional datasets are precious due to the potential of …

Performance evaluation of big data frameworks for large-scale data analytics

J Veiga, RR Expósito, XC Pardo… - … Conference on Big …, 2016 - ieeexplore.ieee.org
The increasing adoption of Big Data analytics has led to a high demand for efficient
technologies in order to manage and process large datasets. Popular MapReduce …

Processing of big heterogeneous genomic datasets for tertiary analysis of Next Generation Sequencing data

M Masseroli, A Canakoglu, P Pinoli, A Kaitoua… - …, 2019 - academic.oup.com
Motivation: We previously proposed a paradigm shift in genomic data management, based
on the Genomic Data Model (GDM) for mediating existing data formats and on the …

Comparative evaluation of big-data systems on scientific image analytics workloads

P Mehta, S Dorkenwald, D Zhao, T Kaftan… - arXiv preprint arXiv …, 2016 - arxiv.org
Scientific discoveries are increasingly driven by analyzing large volumes of image data.
Many new libraries and specialized database management systems (DBMSs) have …

A comparison of stream processing frameworks

Z Karakaya, A Yazici, M Alayyoub - … Conference on Computer …, 2017 - ieeexplore.ieee.org
This study compares the performance of Big Data Stream Processing frameworks including
Apache Spark, Flink, and Storm. Also, it measures the resource usage and performance …

On the usability of Hadoop MapReduce, Apache Spark & Apache Flink for data science

B Akil, Y Zhou, U Röhm - … Conference on Big Data (Big Data), 2017 - ieeexplore.ieee.org
Distributed data processing platforms for cloud computing are important tools for large-scale
data analytics. Apache Hadoop MapReduce has become the de facto standard in this …

BDEv 3.0: energy efficiency and microarchitectural characterization of Big Data processing frameworks

J Veiga, J Enes, RR Expósito, J Touriño - Future Generation Computer …, 2018 - Elsevier
As the size of Big Data workloads keeps increasing, the evaluation of distributed frameworks
becomes a crucial task in order to identify potential performance bottlenecks that may delay …

Framework for supporting genomic operations

A Kaitoua, P Pinoli, M Bertoni… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Next Generation Sequencing (NGS) is a family of technologies for reading the DNA or RNA,
capable of producing whole genome sequences at an impressive speed, and causing a …

Overview of GeCo: a project for exploring and integrating signals from the genome

S Ceri, A Bernasconi, A Canakoglu, A Gulino… - Data Analytics and …, 2018 - Springer
Next Generation Sequencing is a 10-year old technology for reading the DNA,
capable of producing massive amounts of genomic data, in turn reshaping genomic …

Data science for genomic data management: challenges, resources, experiences

S Ceri, P Pinoli - SN Computer Science, 2020 - Springer
We highlight several challenges which are faced by data scientists who use public datasets
for solving biological and clinical problems. In spite of the large efforts in building such …