Note: I've searched for other similar questions here, and none of the answered ones are even remotely similar. With that...
I have a question about MySQL tables (more precisely, about specific fields in a table, e.g. tweets or updates).
So the question: what is the maximum number of rows in an InnoDB table? If there is a significant difference between what MyISAM, InnoDB, and other engines can hold, please answer per engine; if not, answer in general. Secondly, if the table gets really large, what are the best practices for storing the data: the same single table, or split across multiple tables/DBs?
I read that Twitter gets something like 100 million tweets a day. In that context, how would my second question apply to something like Twitter?
There isn't a row limit, but there is a size limit on an InnoDB database:
The minimum tablespace size is 10MB. The maximum tablespace size is four billion database pages (64TB). This is also the maximum size for a table.
You could always horizontally partition your tables by storing rows in multiple partitions of the same table, in multiple files.
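For example, here is a minimal sketch of hash partitioning; the tweets table and its columns are hypothetical, and with innodb_file_per_table=ON each of the eight partitions is stored in its own .ibd file on disk:

    -- Hypothetical table; HASH(id) spreads rows across 8 physical files
    CREATE TABLE tweets (
        id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        user_id    BIGINT UNSIGNED NOT NULL,
        body       VARCHAR(140)    NOT NULL,
        created_at DATETIME        NOT NULL,
        PRIMARY KEY (id)            -- must include the partitioning column
    ) ENGINE=InnoDB
    PARTITION BY HASH (id)
    PARTITIONS 8;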
http://dev.mysql.com/doc/refman/5.5/en/innodb-restrictions.html will allow you to calculate restrictions from your key size.
But I humbly suggest that you don't want to store info like tweets in a transactional engine with referential integrity and the other features of InnoDB, and you definitely don't want to store them all in one table in a single DB instance. You put them into lots and lots of independent non-SQL databases that allow you to append fast, and then extract meta-information (like hashtags, RTs, etc.) into a more complex database. Presentations on Twitter's DB architecture are easy to google (e.g. http://www.slideshare.net/nkallen/q-con-3770885).
If you must store large amounts of data in one table, partitioning is your friend, and Postgres possibly has better support for it. A partitioned table is physically several tables that logically look like one table. You can put these sub-tables (partitions) on different disks, run maintenance on them independently, etc. A "star schema" may also help: one very long table that contains only the essential columns, plus a number of tables that store the bigger but optional columns.
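To make the "partitions on different disks" idea concrete, here is a sketch assuming MySQL 5.6+ with innodb_file_per_table enabled (InnoDB does not honor a per-partition DATA DIRECTORY before that); the table, columns, and disk paths are made up:

    -- Old years on one disk, recent years on another
    CREATE TABLE tweets_archive (
        id         BIGINT UNSIGNED NOT NULL,
        created_at DATE            NOT NULL,
        body       VARCHAR(140)    NOT NULL,
        PRIMARY KEY (id, created_at)   -- unique keys must include the partition column
    ) ENGINE=InnoDB
    PARTITION BY RANGE (YEAR(created_at)) (
        PARTITION p2010 VALUES LESS THAN (2011) DATA DIRECTORY = '/disk1/mysql',
        PARTITION p2011 VALUES LESS THAN (2012) DATA DIRECTORY = '/disk2/mysql',
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    );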
The 64TB limit for a table has the following exceptions:
- That assumes innodb_page_size = 16K (the default). It can be set to powers of 2 between 4K and 64K, changing the tablespace limit proportionately.
- A PARTITIONed table is essentially a bunch of 'sub-tables' organized together and acting as one big table. The pre-5.6.7 limit on the number of partitions was 1024; after that, it has been 8192. So multiply the 64TB by that.
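You can check which page size a running server is using and redo the arithmetic from there; the comments just restate the limits above:

    -- 16384 (16K) is the default page size
    SHOW VARIABLES LIKE 'innodb_page_size';

    -- 4 billion pages x 16K/page = 64TB per table
    -- 64TB x 8192 partitions     = ~512PB theoretical byte ceiling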
OK, that only gives you a byte limit, and it includes overhead and indexes. To get the max number of rows, you then need to divide by the size of an average record. (It is not easy to compute the average record size.)
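For an existing table, you can at least pull InnoDB's own estimate out of information_schema; note that TABLE_ROWS and AVG_ROW_LENGTH are approximations for InnoDB, and 'mydb'/'tweets' below are placeholders:

    SELECT TABLE_ROWS,                          -- estimated row count
           AVG_ROW_LENGTH,                      -- estimated bytes per row (data only)
           DATA_LENGTH + INDEX_LENGTH AS total_bytes
    FROM   information_schema.TABLES
    WHERE  TABLE_SCHEMA = 'mydb'
      AND  TABLE_NAME   = 'tweets';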
Simple answer:
You can probably easily get 1 trillion "small" records into an InnoDB table. With some effort, you might get to 1000 trillion. But I suspect your budget for disk drives would be exhausted before that. Furthermore, it would take years to do all the INSERTs to fill it up!
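Back-of-the-envelope, assuming roughly 64 bytes per "small" row (my number, purely for illustration):

    64TB per table / 64 bytes per row  ≈ 1 trillion rows, unpartitioned
    x 8192 partitions                  ≈ several thousand trillion rows, theoretical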
So, realistic answer: MySQL can handle an 'unlimited' number of rows.
Real life? I have heard of a few tables with more than a billion rows, even as many as 15 billion.
See my Limits page, but it does not say more on the question of rows.