Reproducible floating-point aggregation in RDBMSs

C Liu, H Jiang, J Paparrizos, AJ Elmore - Proceedings of the VLDB …, 2021 - dl.acm.org

Modern data-intensive applications often generate large amounts of low precision float data
with a limited range of values. Despite the prevalence of such data, there is a lack of an …

Cited by 30 · Related articles · All 7 versions

Limits of reproducibility and hydrodynamic noise in atmospheric regional modelling

B Geyer, T Ludwig, H von Storch - Communications Earth & …, 2021 - nature.com

Reproducibility of research results is a fundamental quality criterion in science; thus,
computer architecture effects on simulation results must be determined. Here, we investigate …

Cited by 14 · Related articles · All 7 versions

Pushing ML Predictions Into DBMSs

M Paganelli, P Sottovia, K Park… - … on Knowledge and …, 2023 - ieeexplore.ieee.org

In the past decade, many approaches have been suggested to execute ML workloads on a
DBMS. However, most of them have looked at in-DBMS ML from a training perspective …

Cited by 2 · Related articles · All 9 versions

Chasing similarity: Distribution-aware aggregation scheduling

F Liu, A Salmasi, S Blanas, A Sidiropoulos - Proceedings of the VLDB …, 2018 - dl.acm.org

Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP
BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an …

Cited by 13 · Related articles · All 6 versions

SimFS: a simulation data virtualizing file system interface

S Di Girolamo, P Schmid… - 2019 IEEE …, 2019 - ieeexplore.ieee.org

Nowadays simulations can produce petabytes of data to be stored in parallel filesystems or
large-scale databases. This data is accessed over the course of decades often by thousands …

Cited by 8 · Related articles · All 33 versions

Asynchronous Multi-Level Checkpointing: An Enabler of Reproducibility using Checkpoint History Analytics

K Assogba, B Nicolae, H Van Dam… - … of the SC'23 Workshops of …, 2023 - dl.acm.org

High-performance computing applications are increasingly integrating checkpointing
libraries for reproducibility analytics. However, capturing an entire checkpoint history for …

Fast and Effective Compression for IoT Systems

C Liu - 2022 - search.proquest.com

Abstract The Internet of Things (IoT) enables connections of trillions of sensors and data
collection for connectivity and analytics. The amount of IoT-generated data has exploded …

The effect of Computational Environments on Big Data Processing Pipelines in Neuroimaging

MA Salari - 2021 - spectrum.library.concordia.ca

Variations in computational infrastructures, including operating systems, software versions,
and hardware architectures, introduce variability in neuroimaging analyses that could affect …

Chasing Similarity: Distribution-aware Aggregation Scheduling (Extended Version)

F Liu, A Salmasi, S Blanas, A Sidiropoulos - arXiv preprint arXiv …, 2018 - arxiv.org

Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP
BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an …

Cited by 1 · Related articles · All 2 versions

Application-driven network and storage optimizations

S Di Girolamo - 2021 - research-collection.ethz.ch

During the last few decades, we transitioned into the data-driven era, where scientific
models are being computed on supercomputers and large datacenters. The …
