REPHRASE: Scale at Facebook


Aditya Agarwal, Director of Engineering at Facebook, gave a very interesting presentation on May 28th at QCon describing Facebook's architecture.

Aditya stressed getting to market quickly and not over-engineering. As he put it, “Scalability is a good problem to have.”

Statistics:

1. Users spend 8 billion minutes per day on Facebook
2. 5 billion pieces of content are shared every week
3. More than 400 million active users (logged in within the last 30 days)
4. Biggest photo site on the web, larger than all other sites combined
5. 250 applications using the APIs each have more than 1 million active users a month
6. 80,000 applications use Facebook Connect
7. 500,000 applications
8. 2 million application developers
9. Thousands of memcache servers holding tens of terabytes of data

Architecture:

Composed of four main components:

1. Web Servers/presentation layer
2. Services (fast and complicated)
3. Memcache (fast and simple)
4. Database (slow and persistent)

Web Servers/Presentation (PHP)
1. PHP is simple to work with (write, debug, read)
2. PHP is optimized for small websites and gets you off the ground very quickly
3. What makes PHP easy to develop in also makes the code base harder to scale (universal array, loose types)
4. PHP has a large footprint (CPU and memory usage)
5. PHP does not encourage good programming practices: no namespaces, and modules were not part of the design from the beginning
6. PHP extensions are very hard to develop
7. PHP is not executed as-is in production: HipHop, a source code transformer, translates PHP into optimized C++
8. Facebook enhancements: opcode optimization, data cache, Alternative PHP Cache (APC), lazy loading, serialization semantics, extensions, custom memcache client, asynchronous event handling mechanism, data type rewriting, and logging, stats collection and monitoring

Services:

1. Written in different languages including C++, Erlang and Python
2. Using different languages incurs real overhead for deployment and maintenance
3. Thrift, a lightweight framework for cross-language development, is used for interoperability
4. Scribe is used to load large amounts of data
5. For a service-oriented architecture, you have to build a lot of tools in-house

Memcache:

1. Very fast, very simple and reliable
2. Issues: consistency with the database, a limited data model, inefficiency with small items, and data that is easy to corrupt
3. Facebook enhancements: support for 64-bit architectures and multi-core machines, more efficient serialization, multi-threading, an improved protocol, a new network driver and compression
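The consistency issue mentioned above is easier to see with a small example. The following is only a minimal sketch of the common cache-aside pattern, not Facebook's actual code: plain dictionaries stand in for memcache and the database, and the `CacheAside` class and its method names are hypothetical.

```python
class CacheAside:
    """Toy cache-aside layer; plain dicts stand in for memcache and the database."""

    def __init__(self):
        self.cache = {}     # stand-in for memcache (fast and simple)
        self.database = {}  # stand-in for the database (slow and persistent)
        self.db_reads = 0   # counts reads that fall through to the database

    def read(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Write to the database first, then drop the cached copy so that
        # the next read cannot return the stale value.
        self.database[key] = value
        self.cache.pop(key, None)


store = CacheAside()
store.write("user:1:name", "Alice")
first = store.read("user:1:name")    # miss: goes to the database
second = store.read("user:1:name")   # hit: served from the cache
store.write("user:1:name", "Alicia")
third = store.read("user:1:name")    # miss again: the write invalidated the entry
```

If `write` skipped the invalidation step, the third read would still return the old cached name, which is the kind of cache/database inconsistency the talk lists as an issue.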

Databases:

1. MySQL is used as a key-value store (bit store)
2. No joins in production, because data is distributed randomly across servers
3. Storing non-static data in a centralized database is not a good idea
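To illustrate why joins are off the table, here is a rough sketch of a sharded key-value layout. It is an illustration under assumptions, not the scheme described in the talk: the shard count, the CRC32-based placement, the key naming and the helper functions are all hypothetical, and dictionaries stand in for separate MySQL servers.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count

# Each dict stands in for one independent MySQL server used as a key-value store.
shards = [{} for _ in range(NUM_SHARDS)]


def shard_for(key):
    # Assumption: placement by CRC32 of the key, which is stable across processes.
    return shards[zlib.crc32(key.encode("utf-8")) % NUM_SHARDS]


def put(key, value):
    shard_for(key)[key] = value


def get(key):
    return shard_for(key).get(key)


def friend_names(user_id):
    """Application-side 'join': fetch the friend list, then each friend's record."""
    friend_ids = get(f"friends:{user_id}") or []
    return [get(f"user:{fid}")["name"] for fid in friend_ids]


put("user:1", {"name": "Alice"})
put("user:2", {"name": "Bob"})
put("user:3", {"name": "Carol"})
put("friends:1", [2, 3])

names = friend_names(1)
```

Because related rows may live on different servers, no single database can join them; the application issues several key lookups and combines the results itself.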

Culture:

At the end of the presentation, Aditya stressed the importance of culture, which he believes is the most important thing.
The attributes of Facebook's culture are:
1. Reaching the market as soon as possible.
2. Training engineers not to be afraid of failure. Breaking things is OK!
3. Empowering engineers. Things can be done in small teams, which promotes pride and a sense of ownership.

You can watch the full presentation here
