Review of "pFabric: Minimal Near-Optimal Datacenter Transport"

18 Nov 2015

Interactive soft real-time workloads, such as those seen in search and social networking, generate many small requests and responses inside data centers. In theory these requests could finish in 10-20 microseconds, but in TCP-based fabrics the latency can be as high as tens of milliseconds. The reason is that these short flows get queued behind co-existing large flows from backup, replication, data mining, etc.

This paper proposes pFabric in response to that observation. Its design rests on the view that flow scheduling should be decoupled from rate control. The key points are: 1) End hosts tag each packet with a single number indicating its priority, which can be based on the flow's remaining size or its completion deadline; 2) Switches decide which packet to send and which to drop strictly according to priority; 3) Rate control is minimal: all flows start at line rate and throttle their sending rate only when they see high and persistent loss.
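To make point 2 concrete, here is a minimal Python sketch of a switch port that sends and drops strictly by priority. It is my own illustration, not code from the paper: the names `Packet`, `PriorityPort`, and `capacity` are hypothetical, and I assume a smaller number means a more urgent packet (for example, fewer bytes remaining in the flow). It leaves out details a real switch would need, such as keeping packets of the same flow in order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    flow_id: str
    priority: int  # assumption: smaller value = more urgent (e.g. remaining bytes)


class PriorityPort:
    """Hypothetical switch port with a small buffer, scheduled purely by priority."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer: List[Packet] = []

    def enqueue(self, pkt: Packet) -> Optional[Packet]:
        """Buffer pkt. If the buffer overflows, drop the least urgent packet
        (possibly pkt itself) and return it; otherwise return None."""
        self.buffer.append(pkt)
        if len(self.buffer) <= self.capacity:
            return None
        # Least urgent = largest priority value; on ties, drop the newest arrival.
        worst = max(range(len(self.buffer)),
                    key=lambda i: (self.buffer[i].priority, i))
        return self.buffer.pop(worst)

    def dequeue(self) -> Optional[Packet]:
        """Send the most urgent buffered packet, or return None if the buffer is empty."""
        if not self.buffer:
            return None
        # Most urgent = smallest priority value; on ties, send the oldest arrival.
        best = min(range(len(self.buffer)),
                   key=lambda i: (self.buffer[i].priority, i))
        return self.buffer.pop(best)
```

With a two-packet buffer holding a packet from a large backup flow and one from a short query, a third packet of medium priority pushes out the backup packet, and the query packet is still sent first. The linear scans are acceptable here because the design assumes very small buffers.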

Will this paper be influential in 10 years? Maybe. It does provide a simple alternative to the current networking stack, but it requires rewriting the host networking stack as well as changing the switches. The deployment cost might be too high for it to become popular. As long as the manufacturing side keeps up, this should be a great solution.