A survey on agent-based simulation using hardware accelerators

J Xiao, P Andelfinger, D Eckhoff, W Cai… - ACM Computing Surveys …, 2019 - dl.acm.org
Due to decelerating gains in single-core CPU performance, computationally expensive
simulations are increasingly executed on highly parallel hardware platforms. Agent-based …

Taming the zoo: The unified GraphIt compiler framework for novel architectures

A Brahmakshatriya, E Furst, VA Ying… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
We live in a new Cambrian Explosion of hardware devices. The end of conventional
processor scaling has driven research and industry practice to explore a new generation of …

From describing to prescribing parallelism: Translating the SPEC ACCEL OpenACC suite to OpenMP target directives

G Juckeland, O Hernandez, AC Jacob… - … Conference on High …, 2016 - Springer
Current and next generation HPC systems will exploit accelerators and self-hosting devices
within their compute nodes to accelerate applications. This comes at a time when …

Restricted Boltzmann machines for recommender systems with implicit feedback

F Yang, Y Lu - 2018 IEEE International Conference on Big Data …, 2018 - ieeexplore.ieee.org
Implicit feedback such as video watch time is commonly seen in many internet products.
Though recommender systems with explicit feedback have been abundantly researched …

A high performance real-time interferometry sensor system architecture

T Hussain, S Amin, U Zabit, OD Bernal… - Microprocessors and …, 2019 - Elsevier
Optical feedback or self-mixing interferometry technique has been widely used for sensing
vibration, displacement, velocity, distance and flow applications. Such applications require …

Domain-decomposition parallelization for molecular dynamics algorithm with short-ranged potentials on Epiphany architecture

V Nikolskii, V Stegailov - Lobachevskii Journal of Mathematics, 2018 - Springer
Many-core processor architecture is a promising paradigm for the development of modern
supercomputers. In this paper, we consider the parallel implementation of the generic …

[BOOK][B] Code Generation and Optimization of Graph Programs on a Manycore Architecture

E Furst - 2021 - search.proquest.com
Graph processing is an area of increasing importance in domains such as networking,
supercomputing, public health, and more. However, large scale graph processing presents many …

A high performance real-time FPGA-based interferometry sensor architecture

T Hussain, S Amin, U Zabit, F Kamran… - … on Frontiers of …, 2016 - ieeexplore.ieee.org
Optical feedback or self-mixing interferometry technique has been widely used for sensing
vibration, displacement, velocity, distance and flow applications. Such applications require …

Bulk-synchronous pseudo-streaming algorithms for many-core accelerators

JW Buurlage, T Bannink, A Wits - arXiv preprint arXiv:1608.07200, 2016 - arxiv.org
The bulk-synchronous parallel (BSP) model provides a framework for writing parallel
programs with predictable performance. In this paper we extend the BSP model to support …

Compiler-assisted, adaptive runtime system for the support of OpenMP in embedded multicores

SN Agathos, VV Dimakopoulos, IK Kasmeridis - Parallel Computing, 2022 - Elsevier
The latest versions of OpenMP have introduced constructs for exploiting heterogeneous
compute units alongside the main multicore CPU. The offloaded program portions (kernels) …