Update: After playing around with this for a few hours, I went with a multi-query solution and used a table that only contained parent attributes to determine which items needed updating.
Sorry for the poor title, I couldn't think how to concisely describe this problem.
I have a set of items that should have a 1-to-1 relationship with an attribute.
I have a query to return those rows where the data is wrong and this relationship has been broken (1-to-many). I'm gathering these rows to fix them and restore this 1-to-1 relationship.
This is a theoretical simplification of my actual problem but I'll post example table schema here as it was requested.
item table:
+---------+----------+---------+
| item_id | name     | attr_id |
+---------+----------+---------+
| 1       | BMW 320d | 20      |
| 1       | BMW 320d | 21      |
| 2       | BMW 335i | 23      |
| 2       | BMW 335i | 34      |
+---------+----------+---------+
attribute table:
+---------+-----------------+-----------+
| attr_id | value           | parent_id |
+---------+-----------------+-----------+
| 20      | SE              | 21        |
| 21      | M Sport         | 0         |
| 23      | AC              | 24        |
| 24      | Climate control | 0         |
....
| 34      | Leather seats   | 0         |
+---------+-----------------+-----------+
A simple query to return items with more than one attribute.
SELECT item_id, COUNT(DISTINCT(attr_id)) AS attributes
FROM item GROUP BY item_id HAVING attributes > 1
This gets me a result set like so:
+---------+------------+
| item_id | attributes |
+---------+------------+
| 1       | 2          |
| 2       | 2          |
| 3       | 2          |
-- etc. --
However, there's an exception. The attribute table can hold a tree structure, via parent links in the table. For certain rows, parent_id can hold the ID of another attribute. There's only one level to this tree. Example:
+---------+-----------------+-----------+
| attr_id | value           | parent_id |
+---------+-----------------+-----------+
| 20      | SE              | 21        |
| 21      | M Sport         | 0         |
....
I do not want my original query to retrieve items where a pair of associated attributes are related to each other, like attributes 20 & 21.
I do want to retrieve items where:
- the attributes have no parent
- two or more of the attributes are not related to each other (e.g. attributes 23 & 34)
Example result desired, just the item ID:
+---------+
| item_id |
+---------+
| 2       |
+---------+
How can I join against attribute from item and exclude these rows?
Do I use a temporary table or can I achieve this from a single query?
Thanks.
The following query extracts only the unique pairs of item and attribute (or the attribute's parent, if it has one), thus eliminating the duplicates. This relies on your statement that an attribute can have only one parent, and that the parent itself has no parent.
SELECT DISTINCT I.item_id AS iid, A.par_id AS aid
FROM
  item AS I,
  (SELECT AA.attr_id, IF(AA.parent_id = 0, AA.attr_id, AA.parent_id) AS par_id
   FROM attribute AS AA) AS A
WHERE I.attr_id = A.attr_id
ORDER BY I.item_id
So, using the above query as a subtable for your counting query will work (same approach I used with the A subtable above):
SELECT SUB.iid, COUNT(DISTINCT(SUB.aid)) AS attributes
FROM
  (SELECT DISTINCT I.item_id AS iid, A.par_id AS aid
   FROM
     item AS I,
     (SELECT AA.attr_id, IF(AA.parent_id = 0, AA.attr_id, AA.parent_id) AS par_id
      FROM attribute AS AA) AS A
   WHERE I.attr_id = A.attr_id
   ORDER BY I.item_id) AS SUB
GROUP BY SUB.iid
HAVING attributes > 1
I added 3 more rows to your example item table to cover the case where an item is linked only to an attribute that has a parent, but not to the parent itself (i.e. item 3 -> 23 and 3 -> 20, plus 4 -> 23).
Running the above query lists only items 2 and 3 with 2 attributes each.
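As a quick sanity check, the counting query can be run against a throwaway database. This is only a minimal sketch: it assumes Python's sqlite3 module with an in-memory database rather than MySQL, replaces IF() with a portable CASE expression, repeats the aggregate in HAVING instead of using the alias, and uses placeholder names ('Item 3', 'Item 4') for the extra rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id INTEGER, name TEXT, attr_id INTEGER);
CREATE TABLE attribute (attr_id INTEGER PRIMARY KEY, value TEXT, parent_id INTEGER);

INSERT INTO attribute VALUES (20, 'SE', 21);
INSERT INTO attribute VALUES (21, 'M Sport', 0);
INSERT INTO attribute VALUES (23, 'AC', 24);
INSERT INTO attribute VALUES (24, 'Climate control', 0);
INSERT INTO attribute VALUES (34, 'Leather seats', 0);

INSERT INTO item VALUES (1, 'BMW 320d', 20);
INSERT INTO item VALUES (1, 'BMW 320d', 21);
INSERT INTO item VALUES (2, 'BMW 335i', 23);
INSERT INTO item VALUES (2, 'BMW 335i', 34);
-- Extra rows described above; the names are placeholders.
INSERT INTO item VALUES (3, 'Item 3', 23);
INSERT INTO item VALUES (3, 'Item 3', 20);
INSERT INTO item VALUES (4, 'Item 4', 23);
""")

# Resolve every attribute to its parent (or to itself when parent_id = 0),
# then count the distinct resolved attributes per item.
QUERY = """
SELECT sub.iid, COUNT(DISTINCT sub.aid) AS attributes
FROM (
    SELECT DISTINCT
           i.item_id AS iid,
           CASE WHEN a.parent_id = 0 THEN a.attr_id ELSE a.parent_id END AS aid
    FROM item AS i
    JOIN attribute AS a ON i.attr_id = a.attr_id
) AS sub
GROUP BY sub.iid
HAVING COUNT(DISTINCT sub.aid) > 1
ORDER BY sub.iid
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(2, 2), (3, 2)]
```

Item 1 drops out because 20 and 21 both resolve to 21, and item 4 drops out because it has a single attribute.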
You can achieve that with a single query:
SELECT
  i.item_id,
  COUNT(DISTINCT(i.attr_id)) AS attributes
FROM
  item i
INNER JOIN
  attribute a
  ON i.attr_id = a.attr_id
WHERE
  a.parent_id = 0
GROUP BY
  i.item_id
HAVING
  attributes > 1
Well, it seems it's not possible with one query, since we have nothing to group by and nothing to sort on. The only thing left would be some kind of recursive call, but there is no recursive SQL in MySQL. The other possibility is if your attribute data follow a rule where attr_id < parent_id for every linked attribute.
To simplify this, I updated all rows in item with the parent attribute ID, where one is available.
So in my example item table, with the attribute IDs updated it looks like:
+---------+----------+---------+
| item_id | name     | attr_id |
+---------+----------+---------+
| 1       | BMW 320d | 21      |
| 1       | BMW 320d | 21      |
| 2       | BMW 335i | 23      |
| 2       | BMW 335i | 34      |
+---------+----------+---------+
First I got a list of attribute relations (child-to-parent):
SELECT a.attr_id, a.parent_id FROM item i JOIN attribute a
USING (attr_id) WHERE parent_id > 0 GROUP BY a.attr_id
I looped around this in code and updated the rows in item that referenced a child attribute.
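// $relations maps child attr_id => parent attr_id
$update = array();
foreach ($relations as $child => $parent) {
    if (!isset($update[$parent])) {
        $update[$parent] = array();
    }
    // Collect every child attribute under its parent.
    $update[$parent][] = $child;
}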
Loop around $update to update item. With this done I was able to use my original query:
SELECT item_id, COUNT(DISTINCT(attr_id)) AS attributes
FROM item GROUP BY item_id HAVING attributes > 1
I couldn't get one query to work.
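For anyone wanting to try the whole multi-query flow end to end, here is a minimal sketch of the steps above. It assumes Python's sqlite3 module with an in-memory database instead of my actual MySQL/PHP setup, uses the example attribute table from the question (where 23 has parent 24), and the variable names are only illustrative.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id INTEGER, name TEXT, attr_id INTEGER);
CREATE TABLE attribute (attr_id INTEGER PRIMARY KEY, value TEXT, parent_id INTEGER);

INSERT INTO attribute VALUES (20, 'SE', 21);
INSERT INTO attribute VALUES (21, 'M Sport', 0);
INSERT INTO attribute VALUES (23, 'AC', 24);
INSERT INTO attribute VALUES (24, 'Climate control', 0);
INSERT INTO attribute VALUES (34, 'Leather seats', 0);

INSERT INTO item VALUES (1, 'BMW 320d', 20);
INSERT INTO item VALUES (1, 'BMW 320d', 21);
INSERT INTO item VALUES (2, 'BMW 335i', 23);
INSERT INTO item VALUES (2, 'BMW 335i', 34);
""")

# Step 1: child -> parent relations for attributes that items actually use.
relations = dict(conn.execute("""
    SELECT DISTINCT a.attr_id, a.parent_id
    FROM item AS i
    JOIN attribute AS a ON i.attr_id = a.attr_id
    WHERE a.parent_id > 0
"""))

# Step 2: group the children under each parent (mirrors the PHP loop).
update = defaultdict(list)
for child, parent in relations.items():
    update[parent].append(child)

# Step 3: point every item row that references a child at the parent instead.
for parent, children in update.items():
    placeholders = ", ".join("?" * len(children))
    conn.execute(
        f"UPDATE item SET attr_id = ? WHERE attr_id IN ({placeholders})",
        [parent, *children],
    )

# Step 4: the original query now only reports genuinely broken items.
broken = conn.execute("""
    SELECT item_id, COUNT(DISTINCT attr_id) AS attributes
    FROM item
    GROUP BY item_id
    HAVING COUNT(DISTINCT attr_id) > 1
""").fetchall()
print(broken)  # [(2, 2)]
```

With the example data, item 1 collapses to a single attribute (21) and only item 2 is left to fix.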