What database does Facebook use?

“What database does Facebook use?” is one of the most common questions that comes up when people start talking about which database scales best for large web applications. In fact, the person asking is usually an open source proponent who knows very well that Facebook uses MySQL as its core database engine. That fact is often the single biggest argument developers use to push for MySQL at their company, which I imagine is why it is such a popular Google query.

While Facebook uses MySQL, they do not use it as-is out of the box. In fact, their team has submitted numerous high-performance enhancements to the MySQL core and the InnoDB plug-in. Their main focus has been on adding performance counters to InnoDB. Other changes focused on the IO subsystem, including the following new features:

  • innodb_io_capacity – sets the IO capacity of the server to determine rate limits for background IO
  • innodb_read_io_threads, innodb_write_io_threads – set the number of background IO threads
  • innodb_max_merged_io – sets the maximum number of adjacent IO requests that may be merged into a large IO request
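
As a rough illustration of how the settings listed above might be collected into a configuration file, here is a minimal Python sketch that renders a [mysqld] section. The numeric values are placeholders chosen for the example, not Facebook's settings or recommendations, and the helper name render_mysqld_section is made up for this sketch. Whether a particular MySQL build accepts every one of these option names (innodb_max_merged_io in particular) is an assumption you would need to check against your own server.

```python
import configparser
import io

# Placeholder values for illustration only; tune them for your own hardware.
EXAMPLE_SETTINGS = {
    "innodb_io_capacity": 1000,      # rate limit for background IO
    "innodb_read_io_threads": 8,     # background read IO threads
    "innodb_write_io_threads": 8,    # background write IO threads
    "innodb_max_merged_io": 64,      # max adjacent IO requests merged into one
}


def render_mysqld_section(settings):
    """Return the text of a [mysqld] config section for the given settings."""
    parser = configparser.ConfigParser()
    # configparser stores every value as a string.
    parser["mysqld"] = {name: str(value) for name, value in settings.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_mysqld_section(EXAMPLE_SETTINGS))
```

The generated text can be pasted into a my.cnf file once the option names have been verified for the MySQL version in use.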

Facebook uses MySQL as a key-value store in which data is randomly distributed across a large set of logical instances. These logical instances are spread across physical nodes, and load balancing is done at the physical node level. Facebook has developed a partitioning scheme in which a global ID is assigned to all user data. They also have a custom archiving scheme based on how frequently and how recently each user’s data is accessed. Amazingly, it has been rumored that Facebook has 1,800 MySQL servers but only 3 full-time DBAs.
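
To make the idea of logical instances sitting on top of physical nodes more concrete, here is a small Python sketch of one way such a scheme could work. It is not Facebook's actual implementation: the shard count, the node names, and the use of a hash of the global ID to get a random-looking spread are all assumptions made for the example.

```python
import hashlib

NUM_LOGICAL_SHARDS = 4096                                  # hypothetical shard count
PHYSICAL_NODES = ["db-node-a", "db-node-b", "db-node-c"]   # hypothetical host names


def logical_shard_for(global_id):
    """Map a global ID to a logical shard using a stable hash (an assumption)."""
    digest = hashlib.sha256(str(global_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


def build_shard_map(num_shards, nodes):
    """Assign each logical shard to a physical node in round-robin order."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


def node_for(global_id, shard_map):
    """Find the physical node that currently hosts the data for a global ID."""
    return shard_map[logical_shard_for(global_id)]


if __name__ == "__main__":
    shard_map = build_shard_map(NUM_LOGICAL_SHARDS, PHYSICAL_NODES)
    print(node_for(12345, shard_map))
```

The point of the extra level of indirection is that a logical shard can be moved to a different physical node by changing one entry in the shard map, without changing which shard any global ID belongs to.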

Facebook primarily uses MySQL for structured data storage such as wall posts, user information, etc. This data is replicated between their various data centers. For blob storage (photos, video, etc.), Facebook makes use of a custom solution that involves a CDN externally and NFS internally.

It is also important to note that Facebook makes heavy use of Memcached, an in-memory caching system that speeds up dynamic, database-driven websites by keeping data and objects in RAM so that fewer reads hit the database. Memcached is Facebook’s primary form of caching and greatly reduces the database load. Caching is a large part of why Facebook can recall your data as quickly as it does: when the data is already cached, it is fetched from the cache by your user ID without going to the database at all.
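
The caching behavior described above is often called the cache-aside pattern, and it can be sketched in a few lines of Python. In this sketch plain dictionaries stand in for both the cache server and the MySQL database, and the class name, method names, and key format are all made up for the example, so treat it as an illustration of the idea rather than how Facebook wires things together.

```python
class CachedUserStore:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, database):
        self._database = database   # dict standing in for the MySQL tables
        self._cache = {}            # dict standing in for the cache server
        self.database_reads = 0     # counter so the effect of caching is visible

    def get_user(self, user_id):
        key = f"user:{user_id}"     # hypothetical cache key format
        if key in self._cache:
            return self._cache[key]           # cache hit, no database access
        self.database_reads += 1              # cache miss, read the database
        record = self._database.get(user_id)
        if record is not None:
            self._cache[key] = record         # populate the cache for next time
        return record

    def update_user(self, user_id, record):
        self._database[user_id] = record
        # Drop the stale entry so the next read reloads it from the database.
        self._cache.pop(f"user:{user_id}", None)


if __name__ == "__main__":
    store = CachedUserStore({42: {"name": "Alice"}})
    store.get_user(42)
    store.get_user(42)
    print(store.database_reads)
```

Repeated reads of the same user only touch the database once, until an update invalidates the cached copy.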

So, while “What database does Facebook use?” seems like a simple question, you can see that they have added a variety of other systems around MySQL to make it truly web scalable. But still, feel free to use the argument, “MySQL is as good as or better than Oracle or MS SQL Server. Heck, even Facebook uses it, and they have 500 million users!”

Comments

  1. Mike wrote:

    Interesting article :) It made me shudder reading it though, databases not being my strong point and all.

    It’s enough to make a grown man or woman faint thinking about the scale of Facebook’s data.

    Monday, April 18, 2011 at 7:17 pm | Permalink
  2. Alex wrote:

    Awesome article. I work with Postgres, and I am very sure that it will overcome MySql someday soon :)

    Monday, June 4, 2012 at 6:34 am | Permalink
  3. Mike O wrote:

    You forgot to add that Facebook uses NoSQL, namely Cassandra.

    Wednesday, May 7, 2014 at 4:25 pm | Permalink
