Liberty queues for EPIC architectures

H Zhang, P Koniusz - … of the IEEE conference on computer …, 2018 - openaccess.thecvf.com

In this paper, we address an open problem of zero-shot learning. Its principle is based on
learning a mapping that associates feature vectors extracted from i.e. images and attribute …

Cited by 122 Related articles All 11 versions

[PDF] psu.edu

DAFT: Decoupled acyclic fault tolerance

Y Zhang, JW Lee, NP Johnson, DI August - Proceedings of the 19th …, 2010 - dl.acm.org

Higher transistor counts, lower voltage levels, and reduced noise margin increase the
susceptibility of multicore processors to transient faults. Redundant hardware modules can …

Cited by 115 Related articles All 17 versions

[PDF] unito.it

An efficient unbounded lock-free queue for multi-core systems

M Aldinucci, M Danelutto, P Kilpatrick… - Euro-Par 2012 Parallel …, 2012 - Springer

The use of efficient synchronization mechanisms is crucial for implementing fine grained
parallel programs on modern shared cache multi-core architectures. In this paper we study …

Cited by 92 Related articles All 14 versions

[PDF] acm.org

DRAMHiT: A Hash Table Architected for the Speed of DRAM

V Narayanan, D Detweiler, T Huang… - Proceedings of the …, 2023 - dl.acm.org

Despite decades of innovation, existing hash tables fail to achieve peak performance on
modern hardware. Built around a relatively simple computation, i.e., a hash function, which in …

Cited by 3 Related articles All 4 versions

[PDF] cam.ac.uk

COMET: Communication-optimised multi-threaded error-detection technique

K Mitropoulou, V Porpodas, TM Jones - Proceedings of the International …, 2016 - dl.acm.org

Relentless technology scaling has made transistors more vulnerable to soft, or transient,
errors. To keep systems robust against these, current error detection techniques use …

Cited by 30 Related articles All 8 versions

[PDF] princeton.edu

Automatically exploiting cross-invocation parallelism using runtime information

J Huang, TB Jablin, SR Beard… - Proceedings of the …, 2013 - ieeexplore.ieee.org

Automatic parallelization is a promising approach to producing scalable multi-threaded
programs for multicore architectures. Many existing automatic techniques only parallelize …

Cited by 48 Related articles All 17 versions

[PDF] acm.org

PROMPT: A Fast and Extensible Memory Profiling Framework

Z Xu, Y Chon, Y Su, Z Tan, S Apostolakis… - Proceedings of the …, 2024 - dl.acm.org

Memory profiling captures programs' dynamic memory behavior, assisting programmers in
debugging, tuning, and enabling advanced compiler optimizations like speculation-based …

Cache‐aware design of general‐purpose Single‐Producer–Single‐Consumer queues

V Maffione, G Lettieri, L Rizzo - Software: Practice and …, 2019 - Wiley Online Library

Data processing pipelines normally use lockless Single‐Producer–Single‐Consumer
(SPSC) queues to efficiently decouple their processing threads and achieve high …

Cited by 13 Related articles All 2 versions

[PDF] github.io

B-queue: Efficient and practical queuing for fast core-to-core communication

J Wang, K Zhang, X Tang, B Hua - International Journal of Parallel …, 2013 - Springer

Core-to-core communication is critical to the effective use of multi-core processors. A
number of software based concurrent lock-free queues have been proposed to address this …

Cited by 23 Related articles All 11 versions

[PDF] ieee.org

EQueue: Elastic lock-free FIFO queue for core-to-core communication on multi-core processors

J Wang, Y Tian, X Fu - IEEE Access, 2020 - ieeexplore.ieee.org

In recent years, the number of CPU cores in a multi-core processor keeps increasing. To
leverage the increasing hardware resource, programmers need to develop parallelized …

Cited by 10 Related articles All 3 versions
