The Modern Stack of Facebook.com 📚
I loved Facebook during my childhood. Today, people from my generation tend to call it the worst social network, yet it remains the most popular one in the world. With 2 billion daily active users, Facebook is “not dead, nor dying” according to its head (March 2023).
Launched on February 4, 2004, by Mark Zuckerberg, or “the Zucc” as we love to call him, Facebook started as a PHP monolith coded outside its creator’s class hours. Nobody could have anticipated the success of the social network, so its code had to be reworked and its infrastructure modernized along the way.
Today, Facebook.com celebrates its 20th anniversary, and I wanted to pay tribute to this great site of my era, even though I decided to leave it in 2022 at the end of my studies. Here’s a summary of its technical stack, which Facebook discusses on its blog Engineering at Meta, condensed into this small article that I really loved writing.
Homebrewed Frontend
If ReactJS is used by so many, it’s because everyone wants to become Facebook. Facebook’s teams released this trendy JavaScript library in 2013, and it has kept its popularity even as the social network lost its own. It now powers the new Facebook website and apps.
GraphQL is what they use for their API requests. Like React, GraphQL was born at Facebook: it was developed internally from 2012 and open-sourced in 2015. The problem with traditional APIs is that an endpoint often returns far more data than the client actually needs (over-fetching). GraphQL tackles this by letting the client name the exact fields it wants.
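To make the over-fetching point concrete, here is a minimal sketch in plain Python. It does not use any real GraphQL library: the `USER` record, its field names, and the `select_fields` helper are all invented for illustration, and the selection dictionary only mimics what a query such as `{ name friends { name } }` would ask for.

```python
# Hypothetical user record; every field name here is invented for illustration.
USER = {
    "id": 42,
    "name": "Ada",
    "email": "ada@example.com",
    "friends": [
        {"id": 7, "name": "Linus", "email": "linus@example.com"},
        {"id": 9, "name": "Grace", "email": "grace@example.com"},
    ],
}


def rest_get_user():
    """REST-style endpoint: the server fixes the shape, so the client gets everything."""
    return USER


def select_fields(record, selection):
    """GraphQL-style selection: keep only the fields named in `selection`.

    `selection` maps a field name to None (a leaf) or to a nested selection.
    """
    result = {}
    for field, sub in selection.items():
        value = record[field]
        if sub is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select_fields(item, sub) for item in value]
        else:
            result[field] = select_fields(value, sub)
    return result


# Roughly what the query `{ name friends { name } }` asks for.
query = {"name": None, "friends": {"name": None}}
slim = select_fields(USER, query)
full = rest_get_user()
```

Here `full` carries ids and emails the page never displays, while `slim` contains only the names that were requested.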
Finally, Relay, which is less popular, manages data fetching for React. The idea behind this framework is that data is hierarchical and that each React component needs a specific slice of it. Each component declares the data it needs, and Relay queries GraphQL only for those fields, while also prioritizing the content that matters most.
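The idea of components declaring their own data needs can be sketched as follows. This is plain Python, not Relay's actual API: the three fragments and the `merge_selections` helper are hypothetical, and the sketch only shows how per-component requirements could be combined into a single query instead of one request per component.

```python
# Hypothetical components, each declaring the fields it needs
# (Relay calls such declarations fragments).
AVATAR_FRAGMENT = {"name": None, "picture": {"url": None}}
FRIEND_LIST_FRAGMENT = {"friends": {"name": None}}
HEADER_FRAGMENT = {"name": None, "friends": {"id": None}}


def merge_selections(*selections):
    """Deep-merge field selections so shared fields are requested only once.

    A value of None marks a leaf field; a dict marks a nested selection.
    """
    merged = {}
    for selection in selections:
        for field, sub in selection.items():
            if sub is None:
                merged.setdefault(field, None)
            else:
                existing = merged.get(field) or {}
                merged[field] = merge_selections(existing, sub)
    return merged


# One combined query for the whole page.
page_query = merge_selections(
    AVATAR_FRAGMENT, FRIEND_LIST_FRAGMENT, HEADER_FRAGMENT
)
```

The `name` field is requested by two components but appears once in `page_query`, and the two `friends` selections are folded into one.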
Facebook.com’s backend: from PHP to PHP (but better)
I remember a time when Facebook was regularly unavailable, and I couldn’t post my mediocre jokes to my internet friends. I believe this problem is definitively solved today, and I understand why.
Despite the popularity of Kubernetes, Facebook chose to use an in-house orchestrator, Twine, about which not much is known as it is closed source. It’s the new name for Tupperware, which has been managing Facebook’s workloads and containers for the past decade.
In terms of programming language, Zucc started with PHP. Then, in 2014, Facebook open-sourced Hack, an object-oriented language derived from PHP that adds static typing. Their servers run it on HHVM (HipHop Virtual Machine). Hack was designed to interoperate with Facebook’s existing PHP codebase and resolves many security issues.
I’ve heard that CentOS is used for servers, even though the classic version of the distro has been discontinued. In parallel, they developed FBOSS, the software that runs their network switches.
Homebrewed Hardware, too
I don’t understand much about hardware, so I won’t go into details, but Facebook has been developing its own hardware for a long time. I remember that in 2013 they even tried to compete with phone brands with the HTC First, although that was a partnership with HTC rather than a fully in-house device.
Sticking to Classic Databases
As expected, most of their large-scale databases run on MySQL. They recently migrated to the 8.0 release, which wasn’t easy: the migration took multiple years.
Facebook also developed Cassandra, open-sourced in 2008, originally to power search in its messaging inbox. Ten years later, Instagram’s engineers rebuilt its storage layer on RocksDB under the name Rocksandra, which promises a 10x decrease in tail latency.
A Massive Infrastructure for Big Data
Facebook rewrote part of MySQL’s code to build the MyRocks DB server, which reduces storage space and speeds up writes. RocksDB, a key-value store written in C++ and based on Google’s LevelDB, serves as its storage engine.
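RocksDB and LevelDB are built around a log-structured merge-tree (LSM tree), which is a large part of why writes are cheap. The toy below is only a sketch of that general idea, not RocksDB's implementation: the `TinyLSM` class, its method names, and the tiny `memtable_limit` are all invented for illustration, and it skips write-ahead logging, bloom filters, and leveled compaction.

```python
class TinyLSM:
    """Toy log-structured merge store: buffer writes in memory, flush sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # recent writes, held in memory
        self.runs = []       # flushed sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches memory; no existing data is rewritten in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the buffer as one immutable, sorted run (a sequential write).
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check memory first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            for k, v in run:  # a real engine would binary-search the run
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all runs, dropping overwritten values to reclaim space.
        merged = {}
        for run in self.runs:  # oldest first, so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # reaches the limit, so the memtable is flushed
db.put("a", 3)   # newer value for "a" lives in memory for now
db.put("c", 4)   # second flush
db.compact()
```

After compaction the stale value of `a` is gone, which is the space-saving half of the trade-off; the write-speed half comes from `put` never modifying data already flushed.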
They are very proud of Scribe, their in-house system for collecting and queuing log messages.
Apache Spark is used by the Machine Learning teams for training models. They moved away from Hive, which was apparently too slow (I don’t know enough to have an opinion on that).
Finally, Presto, whose development started in 2012, is a distributed SQL query engine designed to run fast queries over large volumes of data from many sources (Hive, HBase, Scribe…).
Unfamiliar Technologies for Automation and DevOps
If some of the previously mentioned services are unfamiliar to you, those that follow are even more obscure. At Facebook, they use the in-house Sapling, a Git-compatible source control system built to scale to large codebases. It took Facebook about a decade to develop the sl tool internally, and they open-sourced it in 2022.
I don’t know what Facebook uses for its CI/CD pipelines, but they developed Buck2, a build system written in Rust that builds their massive codebase.
For testing purposes, the in-house Infer performs static analysis of mobile code (C, C++, Objective-C, Java), and Sapienz automatically generates tests that exercise the apps the way a user would.
SLICK enables monitoring of SLIs, or service level indicators (SLI-CK, get it?). It’s useful for tracking reliability when you are aiming for uptime as close to 100% as possible. Unfortunately, not much information is available about this system, which remains closed source.
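For readers new to SLIs, the arithmetic behind them is simple. The sketch below shows the generic calculation only and says nothing about how SLICK itself works: the function names, the request counts, and the 99.9% target are all assumptions chosen for illustration.

```python
def availability_sli(good_events, total_events):
    """SLI as the fraction of events that met the reliability criterion."""
    if total_events == 0:
        return 1.0  # no traffic, so nothing failed
    return good_events / total_events


def error_budget_remaining(sli, slo):
    """Fraction of the allowed failures still unspent for a given target (SLO)."""
    if not 0.0 < slo < 1.0:
        # A 100% target allows zero failures, so there is no budget to measure.
        raise ValueError("slo must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


# Hypothetical numbers: 999,500 successful requests out of 1,000,000.
sli = availability_sli(good_events=999_500, total_events=1_000_000)
# Against a 99.9% target, roughly half of the error budget is left.
remaining = error_budget_remaining(sli, slo=0.999)
```

The guard on `slo` reflects a practical point: a literal 100% target leaves no room for any failure, which is why teams usually track a target just below it.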
Finally, Docusaurus is their documentation site generator, featuring a rather adorable dinosaur mascot. The result is very pretty, and they open-sourced the project.
Conclusion and Surprises
I was surprised to see that the Facebook.com teams developed so many in-house technologies, and that most are open source. It’s a good thing for the dev & ops communities and shows that the “least liked social network” still has a long road ahead of it.
In 20 years, Facebook has made scalability a core concern and has successfully transitioned from its PHP monolith to a modern stack. Starting from scratch, they built a platform that now serves 2 billion daily users, which is all to their credit. The question of what the social network’s next technical challenges will be is more exciting than ever.
“Facebook” banner generated by DALL·E