If I have two tables as listed below, The first which shows the Raw Data and the second which holds the Compressed Version of the Raw Data:
raw_table:
val
1
1
2
2
2
3
3
4
comp_table:
val count
1 2
2 3
3 2
4 1
I want to compress the raw_data into another table
INSERT INTO comp_table VALUES (
SELECT val, COUNT(val) FROM raw_table
WHERE val NOT IN(
SELECT val FROM comp_data
) GROUP BY val
)
First Questions: Is the 开发者_如何学运维above syntax correct?
Second Question: The count is updated, What would be the most efficient query to do the update?
Notes: The data size exceeds a Million Records in the raw_table
Thanks in advance :D
That query would probably work, but this is a good use case for ON DUPLICATE KEY UPDATE:
INSERT INTO comp_table (val, val_count)
SELECT
val, 1
FROM comp_data
ON DUPLICATE KEY UPDATE val_count = val_count + 1;
The first time the insert query is run, it will insert the value and set the count to 1. Anytime after that the same value is inserted, the count will be incremented.
Note: For this to work you'd have to have a unique key on the val column in the comp_table.
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
From what I got from Bradym: Credit goes to him
INSERT INTO comp_data
SELECT val, COUNT(val) FROM raw_data GROUP BY val
ON DUPLICATE KEY
UPDATE
count = (SELECT COUNT(val) FROM raw_data WHERE raw_data.val = comp_data.val)
Tried and Tested :D
精彩评论