AdaPT: Fast emulation of approximate DNN accelerators in PyTorch

D Danopoulos, G Zervakis, K Siozios… - … on Computer-Aided …, 2022 - ieeexplore.ieee.org
Current state-of-the-art employs approximate multipliers to address the highly increased
power demands of deep neural network (DNN) accelerators. However, evaluating the …

Very Fast Tree: speeding up the estimation of phylogenies for large alignments through parallelization and vectorization strategies

C Piñeiro, JM Abuín, JC Pichel - Bioinformatics, 2020 - academic.oup.com
Motivation: FastTree-2 is one of the most successful tools for inferring large phylogenies.
With speed at the core of its design, there are still important issues in the FastTree-2 …

A fast and efficient SIMD track reconstruction algorithm for the LHCb upgrade 1 VELO-PIX detector

A Hennequin, B Couturier, VV Gligorov… - Journal of …, 2020 - iopscience.iop.org
The upgraded CERN LHCb detector, due to start data taking in 2021, will have to reconstruct
4 TB/s of raw detector data in real time using commodity processors. This is one of the …

RabbitQCPlus 2.0: More efficient and versatile quality control for sequencing data

L Yan, Z Yin, H Zhang, Z Zhao, M Wang, A Müller… - Methods, 2023 - Elsevier
Assessing the quality of sequencing data plays a crucial role in downstream data analysis.
However, existing tools often achieve sub-optimal efficiency, especially when dealing with …

High-Performance FFT Code Generation via MLIR Linalg Dialect and SIMD Micro-Kernels

Y He, S Markidis - 2024 IEEE International Conference on …, 2024 - ieeexplore.ieee.org
Fast Fourier Transform (FFT) libraries are an indispensable and critical component of any
High-Performance Computing (HPC) software stack. They are used in many applications …

RabbitQCPlus: More Efficient Quality Control for Sequencing Data

L Yan, Z Yin, H Zhang, Z Zhao, M Wang… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
Assessing the quality of sequencing data plays a crucial role in downstream data analysis.
However, existing tools often achieve sub-optimal efficiency, especially when dealing with …

Vector length agnostic SIMD parallelism on modern processor architectures with the focus on Arm's SVE

B Brank - 2023 - elekpub.bib.uni-wuppertal.de
High-Performance Computing (HPC) has seen a substantial increase in computing power
over the recent decade. In June 2008, the first petascale system was introduced, which …

Performance optimization for the LHCb experiment

A Hennequin - 2022 - theses.hal.science
The LHCb experiment, at CERN, is preparing a major upgrade of its detector and a change
from a hardware-based to a fully software-based trigger system. It is now facing the …

Empirical study of Amdahl's law on multicore processors

C Bruns, S Touati - 2019 - inria.hal.science
For several years, classical multiprocessor systems have been evolving into multicores, which
tightly integrate multiple CPU cores on a single die or package. This shift does not modify …

Faires Scheduling unter Beachtung von AVX-512-Frequenzeffekten [Fair scheduling taking AVX-512 frequency effects into account]

P Machauer - Bachelor Thesis. Operating Systems Group, Karlsruhe …, 2020 - os.itec.kit.edu
Abstract: Intel introduced the Advanced Vector Extensions (AVX) to their processors for
making complex calculations faster. Those instructions lead to a higher power usage, thus …