Differentiable sorting networks for scalable sorting and ranking supervision

F Petersen, C Borgelt, H Kuehne… - … on Machine Learning, 2021 - proceedings.mlr.press
Sorting and ranking supervision is a method for training neural networks end-to-end based
on ordering constraints. That is, the ground truth order of sets of samples is known, while …

Fine-grained multi-query stream processing on integrated architectures

F Zhang, C Zhang, L Yang, S Zhang… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Exploring the sharing opportunities among multiple stream queries is crucial for
high-performance stream processing. Modern stream processing necessitates accelerating …

A Survey on Heterogeneous CPU–GPU Architectures and Simulators

M Alaei, F Yazdanpanah - Concurrency and Computation …, 2025 - Wiley Online Library
Heterogeneous architectures are vastly used in various high performance computing
systems from IoT‐based embedded architectures to edge and cloud systems. Although …

HCE: a runtime system for efficiently supporting heterogeneous cooperative execution

L Wan, W Zheng, X Yuan - IEEE Access, 2021 - ieeexplore.ieee.org
Heterogeneous systems with multiple different compute devices have come into common
use recently, and the heterogeneity of the compute device is mainly reflected in three …

Parallel Block-InsertionSort

JL Vásquez, H Ferrada… - 2023 IEEE CHILEAN …, 2023 - ieeexplore.ieee.org
In this work, we design a parallel algorithm of the Block-InsertionSort (BiS) method by taking
advantage of the high degree of parallelization that BiS offers, which performs multiple …

Fast period searches using the Lomb–Scargle algorithm on Graphics Processing Units for large datasets and real-time applications

M Gowanlock, D Kramer, DE Trilling, NR Butler… - Astronomy and …, 2021 - Elsevier
Computing the periods of variable objects is well-known to be computationally expensive.
Modern astronomical catalogs contain a significant number of observed objects. Therefore …

Studies of an event-building algorithm of the readout system for the twin TPCs in HFRS

J Tian, ZP Sun, SB Chang, Y Qian, HY Zhao… - Nuclear Science and …, 2024 - Springer
The High-energy Fragment Separator (HFRS), which is currently under
construction, is a leading international radioactive beam device. Multiple sets of position …

GPU-based fast clustering via K-Centres and k-NN mode seeking for geospatial industry applications

AL Uribe-Hurtado, M Orozco-Alzate, N Lopes… - Computers in …, 2020 - Elsevier
The emerging trends in data industry, particularly those related to the repeated processing of
data streams, are pushing the limits of computer systems and processes. Among them, the …

Accelerating the unacceleratable: Hybrid CPU/GPU algorithms for memory-bound database primitives

M Gowanlock, B Karsin, Z Fink, J Wright - Proceedings of the 15th …, 2019 - dl.acm.org
Many database operations have a low compute to memory access ratio. In heterogeneous
systems, where a graphics processing unit (GPU) is interconnected via PCIe, the data …

Multithreaded applications on the heterogeneous research computing environment

S Jung - 2024 - ir.library.louisville.edu
Bioinformatics is a domain that has experienced rapid research growth in recent years, as
evidenced by the increasing number of articles in biomedical databases such as PubMed …