Sailfish: Accelerating cloud-scale multi-tenant multi-service gateways with programmable switches

T Pan, N Yu, C Jia, J Pi, L Xu, Y Qiao, Z Li… - Proceedings of the …, 2021 - dl.acm.org
The cloud gateway is essential in the public cloud as the central hub of cloud traffic. We
show that horizontal scaling of software gateways, once sustainable for years, is no longer …

MimicNet: Fast performance estimates for data center networks with machine learning

Q Zhang, KKW Ng, C Kazer, S Yan, J Sedoc… - Proceedings of the 2021 …, 2021 - dl.acm.org
At-scale evaluation of new data center network innovations is becoming increasingly
intractable. This is true for testbeds, where few, if any, can afford a dedicated, full-scale …

Meissa: Scalable network testing for programmable data planes

N Zheng, M Liu, E Zhai, HH Liu, Y Li, K Yang… - Proceedings of the …, 2022 - dl.acm.org
Ensuring the correctness of programmable data planes is important. Testing offers
comprehensive correctness checking, including detecting both code bugs and non-code …

Automated verification of network function binaries

S Pirelli, A Valentukonytė, K Argyraki… - 19th USENIX Symposium …, 2022 - usenix.org
Automated Verification of Network Function Binaries. Open access to the
Proceedings of the 19th USENIX Symposium on Networked Systems Design and …

Hydra: Effective Runtime Network Verification

S Renganathan, B Rubin, H Kim, PL Ventre… - Proceedings of the …, 2023 - dl.acm.org
It is notoriously difficult to verify that a network is behaving as intended, especially at scale.
This paper presents Hydra, a system that uses ideas from runtime verification to check that …

Demystifying and checking silent semantic violations in large distributed systems

C Lou, Y Jing, P Huang - … Symposium on Operating Systems Design and …, 2022 - usenix.org
Distributed systems today offer rich features with numerous semantics that users depend on.
Bugs can cause a system to silently violate its semantics without apparent anomalies. Such …

Zoonet: a proactive telemetry system for large-scale cloud networks

S Zhu, J Lu, B Lyu, T Pan, C Jia, X Cheng… - Proceedings of the 18th …, 2022 - dl.acm.org
We present Zoonet, a proactive virtual network telemetry system for multi-tenant clouds. The
requirements are to (1) cover hyper-scale virtual networks with millions of tenants and …

Beaver: Practical Partial Snapshots for Distributed Cloud Services

L Yu, X Zhang, H Zhang, J Sonchack, D Ports… - … USENIX Symposium on …, 2024 - usenix.org
Distributed snapshots are a classic class of protocols used for capturing a causally
consistent view of states across machines. Although effective, existing protocols presume an …

Sequence Abstractions for Flexible, Line-Rate Network Monitoring

A Johnson, R Beckett, X Chen, R Mahajan… - … USENIX Symposium on …, 2024 - usenix.org
We develop FLM, a high-level language that enables network operators to write programs
that recognize and react to specific packet sequences. To be able to examine every packet …

Proactive Telemetry in Large-Scale Multi-Tenant Cloud Overlay Networks

S Zhu, J Lu, B Lyu, T Pan, S Zhang… - IEEE/ACM …, 2024 - ieeexplore.ieee.org
At present, public clouds have served millions of tenants. To provide reliable services, cloud
vendors need to perceive health status of the cloud network by building a telemetry system …