How do you rewrite a website to be scalable (in terms of traffic)? I work mainly with PHP and some Ruby on Rails, and I know it's a generic question. I'm just looking to increase my knowledge, so any advice would be useful.
Thank you in advance ;-)
This is quite a wide question, and it's going to be pretty difficult to give you a definite answer -- but here are a couple of ideas:
- First of all, do not optimize prematurely!
  - Make sure your application works; that's the most important thing.
  - And, only when it becomes necessary, start optimizing.
- PHP by itself generally scales well:
  - Add a couple more Apache+PHP servers and load-balance your users across them.
  - This tends to work really easily.
- The filesystem doesn't scale that well or that easily:
  - It's not shared across servers.
  - Sharing a filesystem (with NFS, for instance) can sometimes cause problems.
- The database is generally the hardest part when it comes to scaling:
  - Having more than one "write" server is hard.
  - Having more than a couple of "read" servers, generally using replication, can become a pain to maintain.
  - You'll have to think about sharding, one day or another, if replication is not enough.
- Use lots of caching: the more you can serve from cache, the fewer queries you'll make to the DB, and the better it'll be.
  - memcached is great, and scales well: just add a couple of servers, and you get a couple more GB of memory in your caching cluster.
- Using a reverse proxy, so your Apache+PHP servers have less work to do, helps too.
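To make the caching advice above a bit more concrete, here is a minimal sketch of the usual "check the cache first, fall back to the database" pattern. It is written in Python only to keep it short; the same shape applies in PHP or Ruby. The `TTLCache` class is a hypothetical in-process stand-in for a memcached client, and `get_or_load` and the `loader` callback are names made up for this illustration, not part of any library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-process stand-in for a memcached client.

    Assumption: real code would talk to a memcached server over the
    network; a dict is used here so the sketch runs on its own.
    """

    def __init__(self) -> None:
        # key -> (expiry time on the monotonic clock, cached value)
        self._store: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: object, ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def get_or_load(
    cache: TTLCache,
    key: str,
    loader: Callable[[], object],
    ttl_seconds: float = 60.0,
) -> object:
    """Return the cached value, calling `loader` (the DB query) only on a miss.

    Note: a loader result of None is treated as a miss and is never
    served from the cache in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, ttl_seconds)
    return value
```

Calling `get_or_load(cache, "user:42", some_query)` twice in a row should run `some_query` only once; the second call is served from the cache until the TTL expires. The part this sketch skips is invalidation: when the row changes, the cached copy has to be deleted or overwritten.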
And a quick couple of links that might give you some ideas :
- Database Sharding at Netlog, with MySQL and PHP
- Scaling WikiPedia with LAMP: 7 billion page views per month
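For the sharding point mentioned above, the core idea is just a deterministic function from a key (say, a user ID) to one of several databases. The following is a rough Python sketch under that assumption; `shard_for` and the shard names are invented for illustration and are not taken from the Netlog article or any library.

```python
import hashlib
from typing import List


def shard_for(user_id: int, shards: List[str]) -> str:
    """Pick the shard that holds this user's rows.

    A stable hash (MD5 here, used for distribution and not for security)
    keeps the mapping identical across processes and restarts.
    """
    if not shards:
        raise ValueError("at least one shard is required")
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard names; real code would map these to connections.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]
```

With this in place, every query for a given user goes to `shard_for(user_id, SHARDS)`. The catch with plain modulo hashing is that changing the number of shards remaps most keys, which is why larger setups tend to use a lookup table or consistent hashing instead.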
One tip - cache data using memcached or an equivalent, instead of querying the database directly.
Also, the most difficult part of scaling is moving beyond a single web server. Once you can scale to two web servers, you should not have much trouble scaling to many more.
Get a PHP accelerator; you will definitely see a noticeable performance increase. Wikipedia has a nice list to choose from. And as Justin said, get memcached; it is amazing.
"Scale" isn't a universal, concrete phenomenon, but a relative measure of performance and capacity under a specific set of criteria. So you need a set of criteria and some metrics in order for this conversation to have any meaning at all.
I have found Apdex to be a very useful mechanism for thinking and reasoning about the metrics required:
Apdex (Application Performance Index) is an open standard developed by an alliance of companies that defines a standardized method to report, benchmark, and track application performance.
The beauty of a system like the Apdex index is that it is directly related to users' perceptions of satisfactory application responsiveness, and those are the only things that actually matter in any discussion of scale and performance.
So, for example, when thinking about your system in this way, you determine the response time required to meet your users' expectation of responsiveness, you estimate the level of traffic you will need to support, and then you add capacity to meet your targets.
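As a small illustration of how an Apdex score is computed from raw measurements, here is a Python sketch. It uses the standard Apdex definition: requests at or under the target time T count as satisfied, those between T and 4T count as tolerating (worth half), and anything slower counts as frustrated. The function name and the sample numbers are made up for the example.

```python
from typing import Iterable


def apdex(response_times: Iterable[float], threshold: float) -> float:
    """Apdex score in [0, 1] for a target response time `threshold` (T).

    satisfied:  t <= T
    tolerating: T < t <= 4T  (counted as half)
    frustrated: t > 4T       (counted as zero)
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    times = list(response_times)
    if not times:
        raise ValueError("no samples")
    satisfied = sum(1 for t in times if t <= threshold)
    tolerating = sum(1 for t in times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(times)


# Made-up sample: response times in seconds, with a 0.5 s target.
sample = [0.1, 0.2, 0.6, 1.5, 3.0]
score = apdex(sample, threshold=0.5)  # 2 satisfied, 2 tolerating, 1 frustrated
```

For the sample above the score works out to (2 + 2/2) / 5 = 0.6. Tracking that number while you increase load is one way to see at what traffic level you need to add capacity.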