
How do large websites like Facebook distribute load? [closed]

Developer (https://www.devze.com), 2023-02-01 03:55. Source: Internet
Closed. This question needs to be more focused. It is not currently accepting answers.

Want to improve this question? Update the question so it focuses on one problem only by editing this post.

Closed 4 years ago.


This may be better suited to Server Fault, but it seems more of a programming challenge to me. I could be wrong.

I was thinking about how Facebook does what it does. It has over 500 million active users. How do they manage to serve all of those users? Is there one gigantic database holding a record for every single user, so that whenever someone logs in, authentication is checked against that central machine? I'm pretty ignorant about this topic, but I can see that a solution like that is simply not scalable. There will come a point where that central server just can't handle everything.

Instead, say that central database is split up into 100 databases so that the load is split across all of them evenly. That must be what Facebook does, but how do they know which user record to store on what machine? Is there a record stored in every single machine and when you log in, a random user machine is used for authentication? That would mean every time someone registers or changes their password, the changes have to be propagated across all 100 servers.

One other solution comes to mind. Maybe they have some way of hashing a user's email address to a specific user database. Then all the web servers would need to know is that hashing algorithm. But this solution brings up its own problem, I think. What if you want to add more user database machines? Do you change the hashing algorithm to take into account 101 user databases instead of 100? Would you start moving user records around so the 101 user databases have the same number of user records? No, that seems ridiculous as well.
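To make the hashing idea concrete, here is a minimal Python sketch of the scheme described above: hash the email address and take the result modulo the number of user databases. The function names `shard_for` and `fraction_moved`, the choice of SHA-256, and the `example.com` addresses are all assumptions for illustration, not anything Facebook is known to use. The second function measures the worry raised in the paragraph, namely how many users would map to a different database if the count went from 100 to 101.

```python
import hashlib


def shard_for(email: str, num_shards: int) -> int:
    """Map an email address to a shard index in [0, num_shards).

    Hypothetical helper: a stable hash (SHA-256 here) is used instead of
    Python's built-in hash(), which is randomized per process for strings.
    """
    normalized = email.strip().lower().encode("utf-8")
    digest = hashlib.sha256(normalized).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it.
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(emails, old_count: int, new_count: int) -> float:
    """Fraction of emails whose shard changes when the shard count changes."""
    moved = sum(
        1 for e in emails if shard_for(e, old_count) != shard_for(e, new_count)
    )
    return moved / len(emails)


if __name__ == "__main__":
    # Made-up addresses, purely for illustration.
    emails = [f"user{i}@example.com" for i in range(10_000)]
    print("alice@example.com lives on shard", shard_for("alice@example.com", 100))
    print(f"{fraction_moved(emails, 100, 101):.1%} of users would change shard")
```

With plain modulo hashing, nearly every record lands on a different database once the divisor changes, which is exactly why re-hashing everything for one extra machine feels wrong.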

Anyway, as you can see, I don't know much about how to solve these problems. Does anyone have some recommended reading on this topic?


A good starting point might be to take a look at Cassandra (lecture notes), the distributed database that powers FB's inbox search.
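Cassandra partitions data with consistent hashing, which speaks directly to the question's concern about adding a 101st database. The sketch below is a toy illustration of that technique in Python, not Cassandra's actual implementation: the `HashRing` class, the `vnodes` parameter, and the `db-N` node names are all hypothetical. Each node owns many points on a ring, and a key belongs to the first point at or after its own hash, wrapping around at the end.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string (SHA-256 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes: int = 100):
        self._vnodes = vnodes
        self._points = []  # sorted list of (token, node_name) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed at several pseudo-random positions on the ring
        # so that load is spread more evenly.
        for i in range(self._vnodes):
            self._points.append((_hash(f"{node}#{i}"), node))
        self._points.sort()

    def node_for(self, key: str) -> str:
        token = _hash(key)
        # First ring point whose token is >= the key's token.
        idx = bisect.bisect_left(self._points, (token, ""))
        if idx == len(self._points):
            idx = 0  # wrap around to the start of the ring
        return self._points[idx][1]


if __name__ == "__main__":
    ring = HashRing([f"db-{i}" for i in range(100)])
    emails = [f"user{i}@example.com" for i in range(10_000)]
    before = {e: ring.node_for(e) for e in emails}
    ring.add_node("db-100")  # the 101st database
    moved = [e for e in emails if ring.node_for(e) != before[e]]
    print(f"{len(moved) / len(emails):.1%} of users moved to the new node")
```

In this scheme, adding a node only takes over the keys that fall just before its new ring points, so only a small share of records has to move and every moved record goes to the new machine.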

Here's more about FB's nuts and bolts. You might also find some gems in the FB developer news.
