I have a table with 9 million rows and I'm struggling to handle all this data because of its sheer size.
What I want to do is import a CSV into the table without overwriting existing data.
Before, I would have done something like this: INSERT if not in(select email from tblName where source = "number" and email != "email") INTO (email...) VALUES ("email"...)
But I'm worried that I'll crash the server again. I want to be able to insert tens of thousands of rows into the table, but only if a row is not already in the table with source = "number".
Otherwise I would have used a unique index on the email column.
In short, I want to INSERT as quickly as possible without introducing duplicates to the table by checking two things. If email != "email" AND source != "number" then insert into the table, otherwise do nothing. And I don't want error reports either.
I'm sorry for my bad wording and the question sounding a little silly.
I'm just having a hard time adapting to not being able to test things out on the data by downloading a backup and uploading it again if something goes wrong. I hate large datasets :)
Thank you all for your time -BigThings
If you have unique keys on these fields you can use LOAD DATA INFILE with the IGNORE option. It's faster than inserting row by row, and faster than a multi-row INSERT as well.
Look at http://dev.mysql.com/doc/refman/5.1/en/load-data.html
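As a rough sketch of what that could look like: the file path, the column list, the CSV layout (comma-separated, optionally quoted, one header line) and the table name tblName (borrowed from the question) are all assumptions, so adjust them to your data. It also assumes a unique key covering email and source already exists, since IGNORE can only skip rows that collide with a unique key.

```sql
-- Sketch only: path, CSV layout and column list are assumed, not taken from the question.
-- Requires an existing UNIQUE key on (email, source); rows that collide with it are skipped.
LOAD DATA INFILE '/path/to/import.csv'
IGNORE INTO TABLE tblName
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES          -- skip the header row (assumes the file has one)
(email, source);
```

Note that the two IGNORE keywords do different things: the first skips rows that would duplicate a unique key, the second skips the first line of the file.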
Set a UNIQUE constraint on the email and source columns.
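A minimal sketch of that step, assuming the table is called tblName as in the question; the index name uq_email_source is made up for illustration. Adding the key fails if the table already contains duplicate (email, source) pairs, so it may be worth checking for those first. On a 9 million row table both statements can take a while.

```sql
-- Optional check: list a few (email, source) pairs that already occur more than once.
SELECT email, source, COUNT(*) AS n
FROM tblName
GROUP BY email, source
HAVING n > 1
LIMIT 10;

-- Composite unique key: the same email may appear again, but only with a different source.
-- uq_email_source is a hypothetical index name.
ALTER TABLE tblName
  ADD UNIQUE KEY uq_email_source (email, source);
```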
Then do:
INSERT INTO table_name(email, source, ...) VALUES ('email', 'source', ...)
ON DUPLICATE KEY UPDATE email = email;
INSERT IGNORE will not notify you of any kind of error, because it downgrades errors to warnings, not just the duplicate-key ones. I would not recommend it. Neither would I recommend INSERT ... WHERE NOT IN. MySQL already has well-optimized functionality for this case. That's why INSERT ... ON DUPLICATE KEY UPDATE is there.
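If the rows come from a script rather than a file, the same statement also accepts several rows at once, which saves a round trip per row. A sketch, assuming the unique key on (email, source) is in place; the table name is taken from the question and the values are placeholders:

```sql
-- Sketch: one statement, many rows. Rows whose (email, source) pair already exists
-- hit the unique key and fall through to the no-op update instead of raising an error.
INSERT INTO tblName (email, source)
VALUES
  ('a@example.com', '1'),
  ('b@example.com', '1'),
  ('c@example.com', '2')
ON DUPLICATE KEY UPDATE email = email;
```

Sending the import in batches of a few thousand rows per statement keeps each statement small enough to stay under the server's max_allowed_packet limit.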