Transient-Execution Attacks: A Computer Architect Perspective

L Fiolhais, L Sousa - ACM Computing Surveys, 2023 - dl.acm.org
Computer architects employ a series of performance optimizations at the micro-architecture
level. These optimizations are meant to be invisible to the programmer but they are implicitly …

BulkSC: Bulk enforcement of sequential consistency

L Ceze, J Tuck, P Montesinos, J Torrellas - Proceedings of the 34th …, 2007 - dl.acm.org
While Sequential Consistency (SC) is the most intuitive memory consistency model and the
one most programmers likely assume, current multiprocessors do not support it. Instead …

Continual flow pipelines

ST Srinivasan, R Rajwar, H Akkary, A Gandhi… - ACM SIGARCH …, 2004 - dl.acm.org
Increased integration in the form of multiple processor cores on a single die, relatively
constant die sizes, shrinking power envelopes, and emerging applications create a new …

Rock: A high-performance SPARC CMT processor

S Chaudhry, R Cypher, M Ekman, M Karlsson… - IEEE micro, 2009 - ieeexplore.ieee.org
Rock, Sun's third-generation chip-multithreading processor, contains 16 high-performance
cores, each of which can support two software threads. Rock uses a novel checkpoint-based …

Data-centric computing frontiers: A survey on processing-in-memory

P Siegl, R Buchty, M Berekovic - Proceedings of the Second …, 2016 - dl.acm.org
A major shift from compute-centric to data-centric computing systems can be perceived, as
novel big data workloads like cognitive computing and machine learning strongly enforce …

High-performance throughput computing

S Chaudhry, P Caprioli, S Yip, M Tremblay - IEEE Micro, 2005 - ieeexplore.ieee.org
CMT processors offer a way to significantly improve the performance of computer systems.
The return on investment for multithreading is among the highest in computer …

Scalable cache miss handling for high memory-level parallelism

J Tuck, L Ceze, J Torrellas - 2006 39th Annual IEEE/ACM …, 2006 - ieeexplore.ieee.org
Recently-proposed processor microarchitectures for high memory level parallelism (MLP)
promise substantial performance gains. Unfortunately, current cache hierarchies have miss …

Dual-core execution: Building a highly scalable single-thread instruction window

H Zhou - … International Conference on Parallel Architectures and …, 2005 - ieeexplore.ieee.org
Current integration trends embrace the prosperity of single-chip multi-core processors.
Although multi-core processors deliver significantly improved system throughput, single …

Checkpointed early load retirement

N Kirman, M Kirman, M Chaudhuri… - … Symposium on High …, 2005 - ieeexplore.ieee.org
Long-latency loads are critical in today's processors due to the ever-increasing speed gap
with memory. Not only do these loads block the execution of dependent instructions, they …

A family of mechanisms for congestion control in wormhole networks

E Baydal, P Lopez, J Duato - IEEE Transactions on Parallel …, 2005 - ieeexplore.ieee.org
Multiprocessor interconnection networks may reach congestion with high traffic loads, which
prevents reaching the wished performance. Unfortunately, many of the mechanisms …