Review of "Tao: Facebook's Distributed Data Store For The Social Graph"

08 Nov 2015

Tao is a distributed cache for Facebook's social graph. It was built to replace an older PHP data access layer used to render Facebook's website. That layer essentially proxied the underlying MySQL database and let you retrieve data from memcache for faster reads. Tao replaces this somewhat "ad-hoc" system while exposing the same interface to web servers.

Facebook's workload is about 99.8% reads and 0.2% writes. This I/O profile makes highly optimized reads crucial, while write performance matters much less. Tao acts as a write-through cache in front of MySQL, unlike a look-aside cache. A write-through cache avoids the expensive part of a look-aside write, which is reloading the invalidated data from the database. When you write to the data store, Tao routes the write to the leader (master) cache. The leader applies the write to the database and to its own cache, and asks the other cache servers to do a range refill. This way there is no need to reload the entire cached structure from the database, which would be very costly when the cached data structure is large. The reason the leader doesn't simply resend the data to the replica caches is that a refill request is idempotent, and it saves bandwidth when a replica already has the data. The write is applied synchronously to the writer's own cache along with the reply, and propagated asynchronously to the other followers. When a read misses in a follower, the follower asks its leader to load the data.
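The write path described above can be sketched in a few lines of Python. This is only an illustrative toy, not Tao's actual implementation: the `Leader` and `Follower` classes, the `pending` queue that stands in for asynchronous delivery, the `drain_refills` method, and the dict that plays the role of MySQL are all assumptions made for the example.

```python
from collections import deque


class Leader:
    """Hypothetical leader cache in front of a dict standing in for MySQL."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.followers = []

    def read(self, key):
        # Fill the leader cache from the database on a miss.
        if key not in self.cache:
            self.cache[key] = self.db.get(key)
        return self.cache[key]

    def write(self, key, value, origin):
        # Write through: the database and the leader cache are updated together.
        self.db[key] = value
        self.cache[key] = value
        # Other followers only get a small refill notice, not the data itself.
        for follower in self.followers:
            if follower is not origin:
                follower.pending.append(key)
        return value


class Follower:
    """Hypothetical follower cache that forwards misses and writes to a leader."""

    def __init__(self, leader):
        self.leader = leader
        self.cache = {}
        self.pending = deque()  # refill notices not yet processed
        leader.followers.append(self)

    def read(self, key):
        # A read miss is served by asking the leader.
        if key not in self.cache:
            self.cache[key] = self.leader.read(key)
        return self.cache[key]

    def write(self, key, value):
        # The writer's own follower is updated synchronously with the reply.
        self.cache[key] = self.leader.write(key, value, origin=self)

    def drain_refills(self):
        # Simulates asynchronous delivery. Re-reading from the leader is
        # idempotent, and keys this follower never cached are skipped.
        while self.pending:
            key = self.pending.popleft()
            if key in self.cache:
                self.cache[key] = self.leader.read(key)


db = {"user:1:friends": ["alice"]}
leader = Leader(db)
f1, f2, f3 = Follower(leader), Follower(leader), Follower(leader)

f2.read("user:1:friends")                      # f2 caches the old list
f1.write("user:1:friends", ["alice", "bob"])   # the write goes through f1
stale = f2.read("user:1:friends")              # f2 is stale until its refill lands
f2.drain_refills()
f3.drain_refills()                             # f3 never cached the key, so no fetch
fresh = f2.read("user:1:friends")
```

In this toy, `f1` sees its own write immediately, `f2` serves the old list until it processes the refill notice, and `f3` does no work at all because it never cached the key. Delivering the same notice twice would simply re-read the same value from the leader.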

Tao also provides high read availability. It can endure follower failures, at the cost of potentially breaking read-after-write consistency. Leader failure is tolerated by routing read misses directly to the DB (a potentially huge load increase on the DB?), while writes are routed to another member of the leader tier.
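The leader-failure routing can be sketched roughly as follows. Everything here is hypothetical and chosen only for illustration: the `CacheNode` class with its `up` flag, and the `read_miss` and `route_write` helpers. How Tao really detects failures and picks a replacement leader is not covered by this sketch.

```python
class CacheNode:
    """Hypothetical leader-tier cache server; `up` marks whether it is reachable."""

    def __init__(self, name, db, up=True):
        self.name = name
        self.db = db
        self.up = up
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.db.get(key)
        return self.cache[key]

    def write(self, key, value):
        self.db[key] = value
        self.cache[key] = value


def read_miss(key, leader, db):
    """Resolve a follower read miss; returns (value, source)."""
    if leader.up:
        return leader.read(key), leader.name
    # Leader is down: bypass the leader cache and hit the database directly.
    return db.get(key), "db"


def route_write(key, value, leader, leader_tier):
    """Send a write to its leader, or to another live member of the tier."""
    candidates = [leader] + [n for n in leader_tier if n is not leader]
    for node in candidates:
        if node.up:
            node.write(key, value)
            return node.name
    raise RuntimeError("no leader available for writes")


db = {"post:7": "hello"}
l1 = CacheNode("leader-1", db)
l2 = CacheNode("leader-2", db)

normal = read_miss("post:7", l1, db)      # served by the leader cache
l1.up = False                             # simulate a leader failure
degraded = read_miss("post:7", l1, db)    # falls through to the database
handled_by = route_write("post:7", "edited", l1, [l1, l2])
```

The sketch also makes the concern about DB load concrete: while `l1` is down, every follower miss for its keys becomes a database query.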

Will this project be influential in 10 years? I think so. The giant cache architecture serves Facebook's infrastructure really well. Even though Facebook's data entities have such intricate relationships, the system still delivers remarkably high performance.