I have two large tables, products (500k records) and store_products (> 3mm records). Products is the master and product_stores is individual locations with the product.
I need to run a single QUERY totaling up information from product_stores and updating the corresponding product.
When this was smaller dataset we did it with a nested query:
SELECT productid,COUNT(id) as count,MIN(price) as lowprice,MAX(price) as highprice FROM store_products
WHILE (productid){ update product set stores = count, min = lowprice, max = highprice WHERE productid = $productid }
GROUP BY productid
I'm fairly new to nested updates and开发者_高级运维 uncertain how to set multiple fields with a join and group by.
Structure [truncated to relevant fields]:
CREATE TABLE product (
product_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
stores INT UNSIGNED NOT NULL DEFAULT '0',
lowprice DECIMAL (6,2) NOT NULL DEFAULT '000.00',
highprice DECIMAL (6,2) NOT NULL DEFAULT '000.00',
PRIMARY KEY (product_id),
KEY stores (stores)
)
CREATE TABLE store_product (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
product_id INT UNSIGNED NOT NULL,
price DECIMAL(7,2) NOT NULL DEFAULT '0.00',
PRIMARY KEY (storeproduct_id),
KEY product_id (product_id)
);
Fields to update:
- stores [count of store_product records by productid]
- min price [MIN of price by productid]
- max price [MAX of price by productid]
Running a single query to perform the update on tables of this size will probably take a while. Anyway - the following should give you what you need. The trick is to alias the product table and then reference the product table in the subselect using that alias. So:
update product p
set p.lowprice = (select min(price) from store_product sp where sp.product_id = p.product_id),
p.highprice = (select max(price) from store_product sp where sp.product_id = p.product_id),
p.stores = (select count(*) from store_product sp where sp.product_id = p.product_id)
where product_id in (select sp.product_id from store_product sp);
One gotcha here is that the stores column will not be updated to 0 for rows that are not present in the store_product table. To cater for this you can use IFNULL while performing a global update:
update product p
set lowprice = ifnull((select min(price) from store_product sp where sp.product_id = p.product_id),0),
highprice = ifnull((select max(price) from store_product sp where sp.product_id = p.product_id),0),
stores = ifnull((select count(*) from store_product sp where sp.product_id = p.product_id),0);
You may want to try both out and see which is faster.
Hope this helps!
精彩评论