A unified optimization approach for CNN model inference on integrated GPUs

L Wang, Z Chen, Y Liu, Y Wang, L Zheng, M Li… - Proceedings of the 48th …, 2019 - dl.acm.org
Modern deep learning applications urge to push the model inference taking place at the
edge devices for multiple reasons such as achieving shorter latency, relieving the burden of …

Fast in-place suffix sorting on a multicore computer

B Lao, G Nong, WH Chan, JY Xie - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Sorting all suffixes of an input string will produce the suffix array that is a fundamental data
structure for full-text search on. To utilize the parallel computing power of a multicore …

Fast induced sorting suffixes on a multicore machine

B Lao, G Nong, WH Chan, Y Pan - The Journal of Supercomputing, 2018 - Springer
Sorting the suffixes of an input string is a fundamental task in many applications such as
data compression, genome alignment, and full-text search. The induced sorting (IS) method …

Massively parallel inverse block-sorting transforms for bzip2 decompression on GPUs

A Weißenberger, B Schmidt - … of the 53rd International Conference on …, 2024 - dl.acm.org
Lossless data compression has evolved into an indispensable tool for reducing data transfer
times in heterogeneous systems. However, performing decompression on host systems can …

Tunnel: Parallel-inducing sort for large string analytics

Z Du, S Zhang, DA Bader - Future Generation Computer Systems, 2023 - Elsevier
The suffix array is a crucial data structure for efficient string analysis. Over the course of
twenty-six years, sequential suffix array construction algorithms have achieved O(n) time …

Scalable string and suffix sorting: Algorithms, techniques, and tools

T Bingmann - arXiv preprint arXiv:1808.00963, 2018 - arxiv.org
This dissertation focuses on two fundamental sorting problems: string sorting and suffix
sorting. The first part considers parallel string sorting on shared-memory multi-core …

Gossip: Efficient communication primitives for multi-GPU systems

R Kobus, D Jünger, C Hundt, B Schmidt - Proceedings of the 48th …, 2019 - dl.acm.org
Nowadays, a growing number of servers and workstations feature an increasing number of
GPUs. However, slow communication among GPUs can lead to poor application …

Distributed enhanced suffix arrays: efficient algorithms for construction and querying

P Flick, S Aluru - Proceedings of the International Conference for High …, 2019 - dl.acm.org
Suffix arrays and trees are important and fundamental string data structures which lie at the
foundation of many string algorithms, with important applications in computational biology …

Parallel suffix sorting for large string analytics

Z Du, S Zhang, DA Bader - International Conference on Parallel …, 2022 - Springer
The suffix array is a fundamental data structure to support string analysis efficiently. It took
about 26 years for the sequential suffix array construction algorithm to achieve O(n) time …

SACABench: Benchmarking suffix array construction

J Bahne, N Bertram, M Böcker, J Bode… - String Processing and …, 2019 - Springer
We present a practical comparison of suffix array construction algorithms on modern
hardware. The benchmark is conducted using our new benchmark framework SACABench …