Fast in-memory transaction processing using RDMA and HTM

X Wei, J Shi, Y Chen, R Chen, H Chen - Proceedings of the 25th …, 2015 - dl.acm.org
We present DrTM, a fast in-memory transaction processing system that exploits advanced
hardware features (i.e., RDMA and HTM) to improve latency and throughput by over one …

A survey of parallel programming models and tools in the multi and many-core era

J Diaz, C Munoz-Caro, A Nino - IEEE Transactions on parallel …, 2012 - ieeexplore.ieee.org
In this work, we present a survey of the different parallel programming models and tools
available today with special consideration to their suitability for high-performance …

Scale-out NUMA

S Novakovic, A Daglis, E Bugnion, B Falsafi… - ACM SIGPLAN …, 2014 - dl.acm.org
Emerging datacenter applications operate on vast datasets that are kept in DRAM to
minimize latency. The large number of servers needed to accommodate this massive …

Exploring traditional and emerging parallel programming models using a proxy application

I Karlin, A Bhatele, J Keasler… - 2013 IEEE 27th …, 2013 - ieeexplore.ieee.org
Parallel machines are becoming more complex with increasing core counts and more
heterogeneous architectures. However, the commonly used parallel programming models …

Efficient distributed memory management with RDMA and caching

Q Cai, W Guo, H Zhang, D Agrawal, G Chen… - Proceedings of the …, 2018 - dl.acm.org
Recent advancements in high-performance networking interconnect significantly narrow the
performance gap between intra-node and inter-node communications, and open up …

An ephemeral burst-buffer file system for scientific applications

T Wang, K Mohror, A Moody, K Sato… - SC'16: Proceedings of …, 2016 - ieeexplore.ieee.org
Burst buffers are becoming an indispensable hardware resource on large-scale
supercomputers to buffer the bursty I/O from scientific applications. However, there is a lack …

A survey on the Distributed Computing stack

C Ramon-Cortes, P Alvarez, F Lordan, J Alvarez… - Computer Science …, 2021 - Elsevier
In this paper, we review the background and the state of the art of the Distributed Computing
software stack. We aim to provide the readers with a comprehensive overview of this area by …

Performance comparison of OpenMP, MPI, and MapReduce in practical problems

SJ Kang, SY Lee, KM Lee - Advances in Multimedia, 2015 - Wiley Online Library
With problem size and complexity increasing, several parallel and distributed programming
models and frameworks have been developed to efficiently handle such problems. This …

PARSECSs: Evaluating the impact of task parallelism in the PARSEC benchmark suite

D Chasapis, M Casas, M Moretó, R Vidal… - ACM Transactions on …, 2015 - dl.acm.org
In this work, we show how parallel applications can be implemented efficiently using task
parallelism. We also evaluate the benefits of such parallel paradigm with respect to other …

Scale-out non-uniform memory access

S Novakovic, A Daglis, BR Grot, E Bugnion… - US Patent …, 2017 - Google Patents
A computing system that uses a Scale-Out NUMA ("SONUMA")
architecture, programming model, and/or communication protocol provides for low-latency …