What is the best way to periodically load data into table

Developer https://www.devze.com 2022-12-15 04:00 Source: Internet

I have a database with static tables that need to be updated from CSV weekly. The tables are MySQL MyISAM, and by static I mean they are read-only (except when updated from CSV).

There are about 50 tables and about 200 MB of data in total to be reloaded weekly.

I can think of 3 ways:

1. Truncate table
   Load data from files

2. For each table create a temporary table
   Load data there
   Truncate (or delete rows?) original table
   Insert into original table select * from temporary table.

3. Create table_new and load data there
   Rename original table to table_old (or drop table altogether)
   Rename table_new into original table

What do you reckon is the most efficient way?

Have you considered using mysqlimport? You can read about it here: http://dev.mysql.com/doc/refman/5.1/en/mysqlimport.html

I probably wouldn't do anything that involves deleting the original tables, because then you have to re-create all your foreign keys, indexes, constraints, etc., which is a mess and a maintenance nightmare. Renaming tables can also cause problems (for example if you have synonyms for the tables, though I'm not sure MySQL has synonyms).

What I would do, however, is disable the keys before loading the data.

ALTER TABLE tbl_name DISABLE KEYS

In other words, when loading the data you don't want it to be trying to update indexes because that will slow down the load. You want the indexes updated once the load is completed.
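To make the ordering concrete, here is a rough sketch that only builds the statements for one table; it does not run them. The function name, the table name, the file path and the CSV layout (comma-separated, optionally double-quoted, one header row) are all assumptions for illustration, so adjust the LOAD DATA clauses to your actual files. Note that on MyISAM, DISABLE KEYS only affects non-unique indexes.

```python
def bulk_load_statements(table, csv_path):
    """Return the SQL statements for one keys-disabled bulk load, in order."""
    # Assumed CSV layout: comma-separated, optional double quotes, one header
    # row. The path is pasted in as-is, so it must not contain a single quote.
    load = (
        f"LOAD DATA LOCAL INFILE '{csv_path}' INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "IGNORE 1 LINES"
    )
    return [
        f"ALTER TABLE `{table}` DISABLE KEYS",  # stop updating non-unique indexes
        load,                                   # load the rows
        f"ALTER TABLE `{table}` ENABLE KEYS",   # rebuild the indexes in one pass
    ]


# Hypothetical table and file names, just to show the output.
for statement in bulk_load_statements("prices", "/data/prices.csv"):
    print(statement + ";")
```

You would feed the three statements, in this order, to whatever MySQL client you already use.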

So I think by combining mysqlimport with the tip above, you should be able to get a really efficient load.

You could always do INSERT INTO ... ON DUPLICATE KEY UPDATE ... or REPLACE INTO .... You shouldn't get any downtime (unlike the gap between a TRUNCATE and the INSERT), and there's very little chance of corruption.

Be careful with REPLACE, since it will actually delete each record and re-insert it, firing any triggers you may have (unlikely in this case), but also giving you a new ID if you have an auto-increment field.
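As a sketch of the non-REPLACE variant, the helper below builds a parameterized INSERT ... ON DUPLICATE KEY UPDATE statement. The function name and the table and column names are made up for the example, and the %s placeholders assume a Python MySQL driver that uses that parameter style.

```python
def upsert_sql(table, columns, key_columns):
    """Build a parameterized INSERT ... ON DUPLICATE KEY UPDATE statement."""
    # Only non-key columns are overwritten, so an existing row is updated in
    # place instead of being deleted and re-inserted as REPLACE would do.
    update_columns = [c for c in columns if c not in key_columns]
    if not update_columns:
        raise ValueError("need at least one non-key column to update")
    column_list = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in update_columns)
    return (
        f"INSERT INTO `{table}` ({column_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Hypothetical table keyed on `sku`.
print(upsert_sql("prices", ["sku", "price", "stock"], ["sku"]))
```

One thing to keep in mind: an upsert never removes rows that have disappeared from the new CSV, so it only gives the same result as a full reload if rows are never dropped between weeks.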

Your third option is the best: you can LOCK and DISABLE KEYS on the _new table while importing, and it'll be extra quick. You can even do a "batch atomic rename" of all your new tables to the "current ones" with zero downtime, which helps if they have relations between them.

I'm assuming the whole tables are contained in the weekly CSV updates (i.e. they're not incremental).
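The "batch atomic rename" relies on the fact that a single RENAME TABLE statement can list several renames and MySQL applies them as one atomic operation. Here is a minimal sketch that builds such a statement; the function name, the _new/_old suffixes and the table names are assumptions for the example.

```python
def batch_rename_sql(tables, new_suffix="_new", old_suffix="_old"):
    """Build one RENAME TABLE statement that swaps every table in `tables`."""
    if not tables:
        raise ValueError("need at least one table to swap")
    pairs = []
    for table in tables:
        # Move the live table out of the way, then promote the freshly
        # loaded copy. Both renames are part of the same statement.
        pairs.append(f"`{table}` TO `{table}{old_suffix}`")
        pairs.append(f"`{table}{new_suffix}` TO `{table}`")
    return "RENAME TABLE " + ", ".join(pairs)


# Hypothetical pair of related tables.
print(batch_rename_sql(["orders", "items"]))
```

The statement fails if any of the _old tables already exists, so last week's copies have to be dropped (or renamed) first.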

I would prefer the 3rd method and also keep the old table.

create table_new

drop table_old if exists

rename table to table_old

rename table_new to table

The advantage of this method is that it is fast and safe, with little effect on the readers. Creating the new table does not affect reads on the existing table. The rename operation is fast (just a file rename in the case of MyISAM), so the downtime is minimal and the clients will hardly be affected. You also get to keep the old data in case something is wrong with the new data.

Since you are not going to update the tables online, I think it would be a good idea to compress them with myisampack.
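The four steps above could be spelled out roughly as follows for a single table. This is only a sketch that returns the statements in two phases (before and after the CSV load); the function name and the example table name are made up, and CREATE TABLE ... LIKE is my assumption for how table_new gets created with the same columns and indexes as the original.

```python
def refresh_plan(table):
    """Return (prepare, swap) statement lists for a keep-the-old-copy refresh."""
    new, old = f"{table}_new", f"{table}_old"
    prepare = [
        f"DROP TABLE IF EXISTS `{new}`",         # leftover from a failed run
        f"CREATE TABLE `{new}` LIKE `{table}`",  # same columns and indexes
    ]
    # ... the CSV is loaded into the _new table between the two phases ...
    swap = [
        f"DROP TABLE IF EXISTS `{old}`",         # discard the copy from two weeks ago
        f"RENAME TABLE `{table}` TO `{old}`, `{new}` TO `{table}`",
    ]
    return prepare, swap


# Hypothetical table name.
prepare, swap = refresh_plan("prices")
for statement in prepare + swap:
    print(statement + ";")
```

If the new data turns out to be bad, the previous week's rows are still sitting in the _old table and can be renamed back.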