Frankly Speaking, 3/19/19 -- Datacenter RPCs can be general and fast
A weekly(-ish) newsletter on random thoughts in tech and research. I am an investor at Dell Technologies Capital and a recovering academic. I am interested in security, blockchain, and devops.
I'm excited for the first installment of my newsletter! If you were forwarded this email, you can subscribe here. Email me at frank.y.wang@dell.com with issues and/or feedback.
WEEKLY TECH THOUGHT
This week, I blogged about eRPC, datacenter RPCs that are both general and fast. This project won best paper at this year’s NSDI, the premier academic networking and systems conference. It is joint work between CMU and Intel Labs. Here is the full paper, and my full blog post.
Problem
Modern datacenter networks are fast: 100 Gbps links, a 2 µs RTT under one switch, and 300 ns per switch hop. Existing networking options sacrifice either performance or generality. TCP and gRPC are general but slow. DPDK and RDMA are fast, but they make simplifying assumptions that leave them specialized.
Solution
The authors develop eRPC, which provides both speed and generality. There are three main challenges:
Managing packet loss
Low-overhead transport
Easy integration for existing applications
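How do they manage packet loss? The problem is that recovering from a lost packet relies on millisecond-scale timeouts, which are orders of magnitude longer than the microsecond-scale round trip of a small RPC.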

Hardware solutions involve lossless link layers (e.g. PFC, InfiniBand). Although they provide simple, cheap reliability, they are prone to deadlocks and unfairness. eRPC’s solution is to relax the requirement: tolerate rare loss, which existing networks already support. In low-latency networks, switch buffers prevent most loss. All modern switches have buffers much larger than the bandwidth-delay product (BDP), so a small BDP plus a sufficient switch buffer leads to rare loss.
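To make the "buffers >> BDP" point concrete, here is a rough back-of-the-envelope sketch in Python. The link speed and RTT are the figures quoted in the Problem section; the 12 MB switch buffer is an assumed, illustrative value and not a number taken from this post.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ns: int) -> int:
    """Bandwidth-delay product in bytes, using integer math to avoid rounding."""
    bits_in_flight = bandwidth_bps * rtt_ns // 1_000_000_000
    return bits_in_flight // 8


LINK_BANDWIDTH_BPS = 100_000_000_000  # 100 Gbps, from the text
RTT_NS = 2_000                        # 2 µs under one switch, from the text
ASSUMED_SWITCH_BUFFER_BYTES = 12_000_000  # assumption: illustrative 12 MB buffer

bdp = bdp_bytes(LINK_BANDWIDTH_BPS, RTT_NS)
bursts_absorbed = ASSUMED_SWITCH_BUFFER_BYTES // bdp

print(f"BDP: {bdp} bytes")
print(f"BDP-sized bursts the assumed buffer can absorb: {bursts_absorbed}")
```

With these numbers the BDP works out to 25,000 bytes (25 kB), so the assumed buffer could absorb 480 BDP-sized bursts at once before dropping anything. That gap is why loss can be treated as rare.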
How do they create a low-overhead transport layer? The idea is to optimize for the common case, e.g. optimized DMA buffer management for rare packet loss or optimized congestion control for uncongested networks. There are many more examples in the paper.
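For DMA buffer management, the solution is to treat the server’s response as the signal that the request buffer is no longer needed in the common case, and to flush the DMA queue only during rare loss.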
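The idea can be illustrated with a toy model. This is only a sketch under my own assumptions, not eRPC's code: the class and method names (ToyNic, ToyClient, flush, and so on) are hypothetical, and the "NIC" is just a Python deque holding references to buffers that have not been transmitted yet.

```python
from collections import deque


class ToyNic:
    """Hypothetical NIC: the TX queue holds references to buffers not yet DMA-ed."""

    def __init__(self):
        self.tx_queue = deque()
        self.flush_count = 0

    def enqueue(self, buf):
        self.tx_queue.append(buf)  # zero-copy: a reference, not a copy

    def dma_one(self):
        if self.tx_queue:
            self.tx_queue.popleft()  # the NIC has read the buffer

    def flush(self):
        # Models blocking until every queued packet has been DMA-ed (expensive).
        self.flush_count += 1
        self.tx_queue.clear()

    def holds_reference(self, buf):
        return any(queued is buf for queued in self.tx_queue)


class ToyClient:
    def __init__(self, nic):
        self.nic = nic

    def send_request(self, buf):
        self.nic.enqueue(buf)

    def retransmit(self, buf):
        # Rare path: pay for a flush so no stale reference lingers in the queue.
        self.nic.enqueue(buf)
        self.nic.flush()

    def on_response(self, buf):
        # Common path: the response itself implies the request left the NIC,
        # so no per-packet completion check is needed. The assert only verifies
        # the invariant inside this toy model.
        assert not self.nic.holds_reference(buf)
        return buf  # ownership goes back to the application


nic = ToyNic()
client = ToyClient(nic)

request = bytearray(b"common case")
client.send_request(request)
nic.dma_one()                 # NIC transmits; the server then replies
client.on_response(request)   # no flush was needed

lost = bytearray(b"lost once")
client.send_request(lost)
nic.dma_one()                 # transmitted, but dropped in the network
client.retransmit(lost)       # timeout fires: retransmit and flush
client.on_response(lost)
print("flushes performed:", nic.flush_count)
```

The design choice is that the expensive operation (the flush) is only paid on the rare retransmission path, while the common path does no extra bookkeeping.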
Another example is efficient congestion control in software. Congestion control has overhead, e.g. from the rate limiter. Because datacenter networks are usually uncongested, eRPC’s solution is to optimize for the uncongested case. Common-case optimizations matter as seen in the graph below.
The result is a low-overhead transport with congestion control.
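As a rough illustration of what "optimize for the uncongested case" can look like, here is a minimal Python sketch of a fast path that skips both the rate update and the rate limiter when a session shows no sign of congestion. Everything in it is an assumption for illustration: the Session class, the 50 µs threshold, and the halve/double rate rule are placeholders, not eRPC's actual algorithm or parameters.

```python
from dataclasses import dataclass

LINE_RATE_GBPS = 100.0        # link speed quoted in the Problem section
LOW_RTT_THRESHOLD_US = 50.0   # assumption: hypothetical "no congestion" threshold


@dataclass
class Session:
    rate_gbps: float = LINE_RATE_GBPS
    slow_path_updates: int = 0

    def is_uncongested(self) -> bool:
        return self.rate_gbps >= LINE_RATE_GBPS

    def on_rtt_sample(self, rtt_us: float) -> None:
        # Fast path: already at line rate and the RTT is low, so skip the update.
        if self.is_uncongested() and rtt_us < LOW_RTT_THRESHOLD_US:
            return
        self.slow_path_updates += 1
        if rtt_us >= LOW_RTT_THRESHOLD_US:
            # Placeholder decrease rule.
            self.rate_gbps = max(self.rate_gbps / 2, 1.0)
        else:
            # Placeholder increase rule, capped at line rate.
            self.rate_gbps = min(self.rate_gbps * 2, LINE_RATE_GBPS)

    def transmit(self, packet, rate_limiter_queue: list, wire: list) -> None:
        if self.is_uncongested():
            wire.append(packet)                # bypass the rate limiter entirely
        else:
            rate_limiter_queue.append(packet)  # pay the pacing overhead


session = Session()
wire, limiter = [], []
for _ in range(1000):
    session.on_rtt_sample(5.0)      # uncongested: every sample hits the fast path
session.transmit("p1", limiter, wire)
session.on_rtt_sample(200.0)        # congestion signal: the slow path runs
session.transmit("p2", limiter, wire)
print(session.slow_path_updates, wire, limiter)
```

In this sketch the per-packet cost of congestion control is close to zero as long as the network stays uncongested, and the full machinery only kicks in once an RTT sample crosses the threshold.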

How do they easily integrate with existing applications? Replication over eRPC is fast. Raft-over-eRPC does not have network or object size constraints.
This is a cool project with a lot of core principles about how to provide general and fast datacenter RPCs. Given fast packet I/O, they can provide fast networking in software. If you want to learn more about eRPC, here is the landing page.
WEEKLY TWEET

WEEKLY FRANK THOUGHT
VC and academia are strangely similar if you really think about it.
- VCs and academics both need to raise money. VCs have LPs. Professors have grant agencies and companies.
- Professors are part of a department and have students who work on projects. VCs are part of a partnership and have startup founders who work on companies.
- Professors fund students. VCs fund startup founders.
- Both fund risky ideas and visions.
So, why aren't more academics and PhDs in VC?
FUN NEWS & LINKS
#securityvclogic
"Gartner came out with their Top 10 security projects for 2019. We need to invest in a company in each project category." #securityvclogic
#delltechcapital
Dell Tech Capital leads $12.5M Series A in Tetrate*, an enterprise-grade service mesh company.
Dell Tech Capital 2018 Year in Review.
RSA and RiskRecon* announce partnership.
#research
Harnessing Organizational Knowledge for Machine Learning (Google, Stanford, and Brown).
Holoclean: A Machine Learning System for Data Enrichment
CrashMonkey: Finding Crash-Consistency Bugs with Bounded Black-Box Crash Testing
#tech, #security, #vclife
Leaked sensitive data due to misconfigured Box accounts.
Don't over-optimize fundraising.
Crypto algorithm and key size combos matter more than just key sizes.
Beto used to be a hacker!
"The number of Bitcoin podcasts and channels will exceed the number of circulating Bitcoin."