I have been tasked with designing an ecommerce solution. The aspect that is causing me the most problems is the database.
Currently the site consists of 10+ country based shops each with their own database (all residing on the same mysql instance).
For the new site I'd rather all these shop databases be merged into one database so that all tables (products, orders, customers etc.) have a shop_id field. From a programming perspective this seems to make the most sense as we won't have to manage data across multiple databases.
Currently the entire site generates about 120k orders a year, but is experiencing fairly heavy growth and we need to design a solution that will scale. In 5 years there may be more than a million orders per year and a database that contains 5 years order history (archiving maybe a solution here). The question is - do we use a single database, or do we keep the database-per-shop structure?
I am currently trying to find supporting evidence for either avenue. The company I am designing the solution for prefer the per-shop database structure because they believe it will allow the sites to scale. But my argument is that the shop's database probably won't get that busy over the next few years that they exceed the capacity of a mysql database and a "no expenses spared" hardware set-up.
I am wondering if anyone has any advice either way? Does anyone have experience with websites / ecommerce sites that have tables containing millions of records? I know there is probably not a clear answer here, but at what stage do we have too many records or too large table files to have a fast lo开发者_高级运维ading site?
Also, if anyone has any advice on sources of information - books, websites, etc. where I can do further research, it would be highly appreciated!
I've implemented a theater ticket sales solution which has tables with a couple of hundred thousand records and there are no performance issues to speak of (the hardware's nothing special). While it's hard for me to compare the loads, I would say it's unlikely that a 10x increase in data volume would noticeably impact performance. If it's the same application and the same schema, I'd very likely lean towards a single central database (probably with fail-over) because:
- maintenance is much more efficient and less error-prone
- you can do cross-shop reporting
- you can always offload the historic data to historic tables/databases if performance really suffers
and probably a number of other reasons. The obvious advantage of having multiple instances is that you get poor-man's high availability: if one server is down, only one shop doesn't work and you get this behaviour out of the box.
I would argue its easier to keep separate databases. It just makes more sense to have logical separation of these entities which have no direct relation. It will also be FAR easier to scale up, so that each site can run on separate hardware if/when the time comes. Backup/restore and general maintenance procedures will also be far easier on separate instances, because it allows for customised procedures per shop. Any disaster scenarios also only affect one logical database, rather than potentially screwing up every single shop.
Your current proposal will mean that just about every table will need a 'shop id' column, which is also indexed to prevent collisions. Separating this data later when you need to scale up wont be too much of an issue, but reprogramming most likely will be very time consuming.
精彩评论