Scaling distributed machine learning with the parameter server

M Li, DG Andersen, JW Park, AJ Smola… - … USENIX Symposium on …, 2014 - usenix.org
We propose a parameter server framework for distributed machine learning problems. Both
data and workloads are distributed over worker nodes, while the server nodes maintain …

NetChain: Scale-Free Sub-RTT coordination

X Jin, X Li, H Zhang, N Foster, J Lee, R Soulé… - … USENIX Symposium on …, 2018 - usenix.org
Coordination services are a fundamental building block of modern cloud systems, providing
critical functionalities like configuration management and distributed locking. The major …

Communication efficient distributed machine learning with the parameter server

M Li, DG Andersen, AJ Smola… - Advances in Neural …, 2014 - proceedings.neurips.cc
This paper describes a third-generation parameter server framework for distributed machine
learning. This framework offers two relaxations to balance system performance and …

Consistency-based service level agreements for cloud storage

DB Terry, V Prabhakaran, R Kotla… - Proceedings of the …, 2013 - dl.acm.org
Choosing a cloud storage system and specific operations for reading and writing data
requires developers to make decisions that trade off consistency for availability and …

Hyperloop: group-based NIC-offloading to accelerate replicated transactions in multi-tenant storage systems

D Kim, A Memaripour, A Badam, Y Zhu, HH Liu… - Proceedings of the …, 2018 - dl.acm.org
Storage systems in data centers are an important component of large-scale online services.
They typically perform replicated transactional operations for high data availability and …

RAMBDA: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications

Y Yuan, J Huang, Y Sun, T Wang… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Responding to the "datacenter tax" and "killer microseconds" problems for memory-intensive
datacenter applications, diverse solutions including Smart NIC-based ones have been …

Atomic in-place updates for non-volatile main memories with Kamino-Tx

A Memaripour, A Badam, A Phanishayee… - Proceedings of the …, 2017 - dl.acm.org
Data structures for non-volatile memories have to be designed such that they can be
atomically modified using transactions. Existing atomicity methods require data to be copied …

Harmonia: Near-linear scalability for replicated storage with in-network conflict detection

H Zhu, Z Bai, J Li, E Michael, D Ports, I Stoica… - arXiv preprint arXiv …, 2019 - arxiv.org
Distributed storage employs replication to mask failures and improve availability. However,
these systems typically exhibit a hard tradeoff between consistency and performance …

Forwarding element with a data plane load balancer

J Lee, C Kim - US Patent 10,158,573, 2018 - Google Patents
Some embodiments of the invention provide a forwarding element that has a data-plane
circuit (data plane) that can be configured to implement one or more load balancers. The …

Fault tolerant service function chaining

M Ghaznavi, E Jalalpour, B Wong, R Boutaba… - Proceedings of the …, 2020 - dl.acm.org
Network traffic typically traverses a sequence of middleboxes forming a service function
chain, or simply a chain. Tolerating failures when they occur along chains is imperative to …