Processing-in-memory: A workload-driven perspective

S Ghose, A Boroumand, JS Kim… - IBM Journal of …, 2019 - ieeexplore.ieee.org
Many modern and emerging applications must process increasingly large volumes of data.
Unfortunately, prevalent computing paradigms are not designed to efficiently handle such …

An efficient hardware supported and parallelization architecture for intelligent systems to overcome speculative overheads

S Kumar, SK Singh, N Aggarwal… - … Journal of Intelligent …, 2022 - Wiley Online Library
In the last few decades, technology advancements have paved the way for the creation of
intelligent and autonomous systems that utilize complex calculations which are both time …

[BOOK][B] Using OpenMP: portable shared memory parallel programming

B Chapman, G Jost, R Van Der Pas - 2007 - books.google.com
A comprehensive overview of OpenMP, the standard application programming interface for
shared memory parallel computing—a reference for students and professionals." I hope that …

Learning from mistakes: a comprehensive study on real world concurrency bug characteristics

S Lu, S Park, E Seo, Y Zhou - … of the 13th international conference on …, 2008 - dl.acm.org
The reality of multi-core hardware has made concurrent programs pervasive. Unfortunately,
writing correct concurrent programs is difficult. Addressing this challenge requires advances …

Transactional locking II

D Dice, O Shalev, N Shavit - International Symposium on Distributed …, 2006 - Springer
The transactional memory programming paradigm is gaining momentum as the approach of
choice for replacing locks in concurrent programming. This paper introduces the …

Transactional memory: An overview

T Harris, A Cristal, OS Unsal, E Ayguade… - IEEE micro, 2007 - ieeexplore.ieee.org
Writing applications that benefit from the massive computational power of future multicore
chip multiprocessors will not be an easy task for mainstream programmers accustomed to …

LogTM: Log-based transactional memory

KE Moore, J Bobba, MJ Moravan… - … Symposium on High …, 2006 - ieeexplore.ieee.org
Transactional memory (TM) simplifies parallel programming by guaranteeing that
transactions appear to execute atomically and in isolation. Implementing these properties …

CoNDA: Efficient cache coherence support for near-data accelerators

A Boroumand, S Ghose, M Patel, H Hassan… - Proceedings of the 46th …, 2019 - dl.acm.org
Specialized on-chip accelerators are widely used to improve the energy efficiency of
computing systems. Recent advances in memory technology have enabled near-data …

Optimistic parallelism requires abstractions

M Kulkarni, K Pingali, B Walter… - Proceedings of the 28th …, 2007 - dl.acm.org
Irregular applications, which manipulate large, pointer-based data structures like graphs, are
difficult to parallelize manually. Automatic tools and techniques such as restructuring …

AVIO: detecting atomicity violations via access interleaving invariants

S Lu, J Tucek, F Qin, Y Zhou - ACM SIGOPS Operating Systems Review, 2006 - dl.acm.org
Concurrency bugs are among the most difficult to test and diagnose of all software bugs. The
multicore technology trend worsens this problem. Most previous concurrency bug detection …