Zero-shot kernel learning

H Zhang, P Koniusz - … of the IEEE conference on computer …, 2018 - openaccess.thecvf.com
In this paper, we address an open problem of zero-shot learning. Its principle is based on
learning a mapping that associates feature vectors extracted from, i.e., images and attribute …

DAFT: Decoupled acyclic fault tolerance

Y Zhang, JW Lee, NP Johnson, DI August - Proceedings of the 19th …, 2010 - dl.acm.org
Higher transistor counts, lower voltage levels, and reduced noise margin increase the
susceptibility of multicore processors to transient faults. Redundant hardware modules can …

An efficient unbounded lock-free queue for multi-core systems

M Aldinucci, M Danelutto, P Kilpatrick… - Euro-Par 2012 Parallel …, 2012 - Springer
The use of efficient synchronization mechanisms is crucial for implementing fine grained
parallel programs on modern shared cache multi-core architectures. In this paper we study …

DRAMHiT: A Hash Table Architected for the Speed of DRAM

V Narayanan, D Detweiler, T Huang… - Proceedings of the …, 2023 - dl.acm.org
Despite decades of innovation, existing hash tables fail to achieve peak performance on
modern hardware. Built around a relatively simple computation, i.e., a hash function, which in …

COMET: Communication-optimised multi-threaded error-detection technique

K Mitropoulou, V Porpodas, TM Jones - Proceedings of the International …, 2016 - dl.acm.org
Relentless technology scaling has made transistors more vulnerable to soft, or transient,
errors. To keep systems robust against these, current error detection techniques use …

Automatically exploiting cross-invocation parallelism using runtime information

J Huang, TB Jablin, SR Beard… - Proceedings of the …, 2013 - ieeexplore.ieee.org
Automatic parallelization is a promising approach to producing scalable multi-threaded
programs for multicore architectures. Many existing automatic techniques only parallelize …

PROMPT: A Fast and Extensible Memory Profiling Framework

Z Xu, Y Chon, Y Su, Z Tan, S Apostolakis… - Proceedings of the …, 2024 - dl.acm.org
Memory profiling captures programs' dynamic memory behavior, assisting programmers in
debugging, tuning, and enabling advanced compiler optimizations like speculation-based …

Cache‐aware design of general‐purpose Single‐Producer–Single‐Consumer queues

V Maffione, G Lettieri, L Rizzo - Software: Practice and …, 2019 - Wiley Online Library
Data processing pipelines normally use lockless Single‐Producer–Single‐Consumer
(SPSC) queues to efficiently decouple their processing threads and achieve high …

B-queue: Efficient and practical queuing for fast core-to-core communication

J Wang, K Zhang, X Tang, B Hua - International Journal of Parallel …, 2013 - Springer
Core-to-core communication is critical to the effective use of multi-core processors. A
number of software based concurrent lock-free queues have been proposed to address this …

Equeue: Elastic lock-free FIFO queue for core-to-core communication on multi-core processors

J Wang, Y Tian, X Fu - IEEE Access, 2020 - ieeexplore.ieee.org
In recent years, the number of CPU cores in a multi-core processor keeps increasing. To
leverage the increasing hardware resource, programmers need to develop parallelized …