I have to insert some data from external source.
My php function runs periodically and fetches results & insert those into table.There are often results that are already saved.
So they are again inserted.How do I discard those results that are duplicate so that onl开发者_StackOverflow中文版y new results are entered & duplicate are discarded.
If you can modify the structure of database tables, the best way would be adding UNIQUE INDEX for column or a number of columns which together uniquely identify a single entry. For example:
ALTER TABLE table
ADD UNIQUE INDEX name
(column_one
, column_two
);
If the table already contains duplicate records, the altering attempt will result in an error. In that case, you can use IGNORE:
ALTER IGNORE TABLE table
ADD UNIQUE INDEX name
(column_one
, column_two
);
Keep in mind that creating unnecessarily big indexes will result in:
- more disk space taken up
- slower writing queries (INSERT, UPDATE, DELETE)
http://dev.mysql.com/doc/refman/5.1/en/create-index.html (page describes another way of creating indexes but also references ALTER TABLE)
I know this sounds silly, but identify how you can detect duplicates and do not insert them.
This varies depending on your dataset. If you have a unique field the easiest way would be to set a unique index on that field in the database, a primary key, and the database will not allow a duplicate to be inserted. Just check your errors and if you get a duplicate key error, disregard as something you are expecting.
That's the point about primary keys, if you define your primary keys wisely you won't have to deal with duplicates because you won't have any.
The INSERT IGNORE ...
command does exactly this, assuming you have the primary key set on your table.
If you don't have a primary key already, set it to the name or id of your record where having the same name/id would indicate duplicate data.
All relational databases allow both primary keys and unique constraints on a single row.
So if you have a PERSON
table with a person_id
column and unique combinations of first_name
and last_name
columns, you should make person_id
the primary key and add a unique constraint on (first_name, last_name)
. You won't be able to INSERT a row that violates the unique constraint, even if the primary keys are unique.
精彩评论