Reproducibility in scientific computing

P Ivie, D Thain - ACM Computing Surveys (CSUR), 2018 - dl.acm.org
Reproducibility is widely considered to be an essential requirement of the scientific process.
However, a number of serious concerns have been raised recently, questioning whether …

Sustainable data analysis with Snakemake

F Mölder, KP Jablonski, B Letcher, MB Hall… - …, 2021 - ncbi.nlm.nih.gov
Data analysis often entails a multitude of heterogeneous steps, from the application of
various command line tools to the usage of scripting languages like R or Python for the …

Lessons learned from the Chameleon testbed

K Keahey, J Anderson, Z Zhen, P Riteau… - 2020 USENIX annual …, 2020 - usenix.org
The Chameleon testbed is a case study in adapting the cloud paradigm for computer
science research. In this paper, we explain how this adaptation was achieved, evaluate it …

VeReMi: A dataset for comparable evaluation of misbehavior detection in VANETs

RW Van Der Heijden, T Lukaseder, F Kargl - Security and Privacy in …, 2018 - Springer
Vehicular networks are networks of communicating vehicles, a major enabling technology
for future cooperative and autonomous driving technologies. The most important messages …

A maturity model for DevOps

D Teixeira, R Pereira, T Henriques… - … Journal of Agile …, 2020 - inderscienceonline.com
Nowadays, businesses aim to respond to customer needs at unprecedented speed. Thus,
many companies are rushing to the DevOps movement. DevOps is the combination of …

Topic modeling in software engineering research

CC Silva, M Galster, F Gilson - Empirical Software Engineering, 2021 - Springer
Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining
technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of …

A tale of two systems: Using containers to deploy HPC applications on supercomputers and clouds

AJ Younge, K Pedretti, RE Grant… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Containerization, or OS-level virtualization, has taken root within the computing industry.
However, container utilization and its impact on performance and functionality within High …

Is big data performance reproducible in modern cloud networks?

A Uta, A Custura, D Duplyakin, I Jimenez… - … USENIX symposium on …, 2020 - usenix.org
Performance variability has been acknowledged as a problem for over a decade by cloud
practitioners and performance engineers. Yet, our survey of top systems conferences …

Raphtory: Streaming analysis of distributed temporal graphs

B Steer, F Cuadrado, R Clegg - Future Generation Computer Systems, 2020 - Elsevier
Temporal graphs capture the development of relationships within data throughout time. This
model fits naturally within a streaming architecture, where new events can be inserted …

Collective knowledge: organizing research projects as a database of reusable components and portable workflows with common interfaces

G Fursin - … Transactions of the Royal Society A, 2021 - royalsocietypublishing.org
This article provides the motivation and overview of the Collective Knowledge Framework
(CK or cKnowledge). The CK concept is to decompose research projects into reusable …