Getting limited amount of records from hierarchical data

Developer (https://www.devze.com), 2022-12-19 05:57. Source: the web
Let's say I have 3 tables (significant columns only)

  1. Category (catId key, parentCatId)
  2. Category_Hierarchy (catId key, parentTrail, catLevel)
  3. Product (prodId key, catId, createdOn)

There's a reason for the separate Category_Hierarchy table: I populate it with triggers on the Category table, and MySQL triggers can't populate columns on the same table if I want to use auto_increment values. That's irrelevant to this problem, though; the two tables are 1:1 anyway.

Category table could be:

+-------+-------------+
| catId | parentCatId |
+-------+-------------+
|   1   | NULL        |
|   2   | 1           |
|   3   | 2           |
|   4   | 3           |
|   5   | 3           |
|   6   | 4           |
|  ...  | ...         |
+-------+-------------+

Category_Hierarchy

+-------+-------------+----------+
| catId | parentTrail | catLevel |
+-------+-------------+----------+
|   1   | 1/          | 0        |
|   2   | 1/2/        | 1        |
|   3   | 1/2/3/      | 2        |
|   4   | 1/2/3/4/    | 3        |
|   5   | 1/2/3/5/    | 3        |
|   6   | 1/2/3/4/6/  | 4        |
|  ...  | ...         | ...      |
+-------+-------------+----------+

Product

+--------+-------+---------------------+
| prodId | catId | createdOn           |
+--------+-------+---------------------+
| 1      | 4     | 2010-02-03 12:09:24 |
| 2      | 4     | 2010-02-03 12:09:29 |
| 3      | 3     | 2010-02-03 12:09:36 |
| 4      | 1     | 2010-02-03 12:09:39 |
| 5      | 3     | 2010-02-03 12:09:50 |
| ...    | ...   | ...                 |
+--------+-------+---------------------+

Category_Hierarchy makes it simple to get category subordinate trees like this:

select c.*
from Category c
    join Category_Hierarchy h
    on (h.catId = c.catId)
where h.parentTrail like '1/2/3/%'

This returns the complete subtree of category 3 (which is below 2, which is below root category 1), including the subtree's root node. Excluding the root node takes just one more where condition.
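The prefix match above can be tried out without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, loaded with the sample rows from the tables above. The helper name `subtree_ids` and its `include_root` flag are made up for illustration, and the extra condition for dropping the root node is just one possible way to write it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are the sample data shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (catId INTEGER PRIMARY KEY, parentCatId INTEGER);
    CREATE TABLE Category_Hierarchy (
        catId INTEGER PRIMARY KEY, parentTrail TEXT, catLevel INTEGER);
    INSERT INTO Category VALUES (1, NULL), (2, 1), (3, 2), (4, 3), (5, 3), (6, 4);
    INSERT INTO Category_Hierarchy VALUES
        (1, '1/', 0), (2, '1/2/', 1), (3, '1/2/3/', 2),
        (4, '1/2/3/4/', 3), (5, '1/2/3/5/', 3), (6, '1/2/3/4/6/', 4);
""")


def subtree_ids(conn, trail, include_root=True):
    """Return catIds whose parentTrail starts with `trail` (hypothetical helper)."""
    sql = """
        SELECT c.catId
        FROM Category c
        JOIN Category_Hierarchy h ON h.catId = c.catId
        WHERE h.parentTrail LIKE ? || '%'
    """
    params = [trail]
    if not include_root:
        # The subtree root is the only row whose trail equals the prefix exactly.
        sql += " AND h.parentTrail <> ?"
        params.append(trail)
    sql += " ORDER BY c.catId"
    return [row[0] for row in conn.execute(sql, params)]


with_root = subtree_ids(conn, "1/2/3/")
without_root = subtree_ids(conn, "1/2/3/", include_root=False)
print(with_root)
print(without_root)
```

The trailing slash in parentTrail matters here: the prefix '1/2/3/' cannot accidentally match a trail such as '1/2/30/'.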

The problem

I would like to write a stored procedure:

create procedure GetLatestProductsFromSubCategories(in catId int)
begin
    /* return 10 latest products from each */
    /* catId subcategory subordinate tree  */
end;

This means that if a certain category had 3 direct subcategories (each with any number of nodes underneath), I would get 30 results (10 from each subtree). If it had 5 subcategories, I'd get 50 results.

What would be the best/fastest/most efficient way to do this? If possible I'd like to avoid cursors (unless they'd be faster than any other solution) as well as prepared statements, because this would be one of the most frequent calls to the DB.

Edit

Since a picture is worth a thousand words, I'll try to explain what I want with an image. The image below shows the category tree. Each node can have an arbitrary number of products related to it. Products are not shown in the picture.

[Image: category tree with orange, blue and green subtrees]

So if I'd execute this call:

call GetLatestProductsFromSubCategories(1);

I'd like to effectively get 30 products:

  • 10 latest products from the whole orange subtree
  • 10 latest products from the whole blue subtree and
  • 10 latest products from the whole green subtree

I don't want the 10 latest products from each node under the catId=1 node, which would mean 320 products.


Final Solution

This solution has O(n) performance:

CREATE PROCEDURE foo(IN in_catId INT)
BEGIN
  DECLARE done BOOLEAN DEFAULT FALSE;
  DECLARE first_iteration BOOLEAN DEFAULT TRUE;
  DECLARE current VARCHAR(255);

  DECLARE categories CURSOR FOR
  SELECT parentTrail 
  FROM category 
  JOIN category_hierarchy USING (catId)
  WHERE parentCatId = in_catId;
  DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = TRUE;

  SET @query := '';

  OPEN categories;

  category_loop: LOOP
    FETCH categories INTO current;
    IF `done` THEN LEAVE category_loop; END IF;

    IF first_iteration = TRUE THEN
      SET first_iteration = FALSE;
    ELSE
      SET @query = CONCAT(@query, " UNION ALL ");
    END IF;

    SET @query = CONCAT(@query, "(SELECT product.* FROM product JOIN category_hierarchy USING (catId) WHERE parentTrail LIKE CONCAT('",current,"','%') ORDER BY createdOn DESC LIMIT 10)");

  END LOOP category_loop;
  CLOSE categories;

  IF @query <> '' THEN
    PREPARE stmt FROM @query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END IF;

END
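To make it easier to see what the procedure ends up executing, here is a small sketch that mirrors the loop's string building outside of MySQL. It is written in Python purely for illustration; `build_union_query` is a made-up name, and the two trails passed in are the children of category 3 from the question's sample data.

```python
def build_union_query(trails, limit=10):
    """Mirror the procedure's loop: one LIMIT-ed subquery per direct subcategory.

    `trails` are parentTrail values as fetched by the cursor. They are
    interpolated directly, which is only tolerable because they come from
    the database itself and consist of digits and slashes.
    """
    parts = []
    for trail in trails:
        parts.append(
            "(SELECT product.* FROM product JOIN category_hierarchy USING (catId) "
            f"WHERE parentTrail LIKE CONCAT('{trail}','%') "
            f"ORDER BY createdOn DESC LIMIT {limit})"
        )
    # An empty list yields an empty string, matching the procedure's
    # IF @query <> '' guard.
    return " UNION ALL ".join(parts)


# Trails of the two direct subcategories of catId 3 in the sample data.
query = build_union_query(["1/2/3/4/", "1/2/3/5/"])
print(query)
```

Each direct subcategory contributes exactly one parenthesized subquery, so the size of the final statement grows with the number of direct subcategories, not with the number of nodes below them.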

Edit

Following the latest clarification, I edited this solution to simplify the categories cursor query.

Note: Size the VARCHAR of the `current` variable (line 5 of the procedure) to match your parentTrail column.
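For comparison, the same "N latest per direct subcategory subtree" result can also be expressed as a single statement with ROW_NUMBER(), which avoids both the cursor and the prepared statement. This is a swapped-in technique, not the approach of the answer above, and it assumes window functions are available (MySQL 8.0 and later have them). The sketch below runs against SQLite through Python's sqlite3 module (SQLite 3.25 or newer is assumed); in MySQL the prefix would be built with CONCAT(sh.parentTrail, '%') instead of `||`. Products 6 and 7 are hypothetical rows added to the sample data, and `latest_per_branch` is a made-up name. Whether this is faster than the cursor version would need measuring on real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (catId INTEGER PRIMARY KEY, parentCatId INTEGER);
    CREATE TABLE Category_Hierarchy (
        catId INTEGER PRIMARY KEY, parentTrail TEXT, catLevel INTEGER);
    CREATE TABLE Product (prodId INTEGER PRIMARY KEY, catId INTEGER, createdOn TEXT);
    INSERT INTO Category VALUES (1, NULL), (2, 1), (3, 2), (4, 3), (5, 3), (6, 4);
    INSERT INTO Category_Hierarchy VALUES
        (1, '1/', 0), (2, '1/2/', 1), (3, '1/2/3/', 2),
        (4, '1/2/3/4/', 3), (5, '1/2/3/5/', 3), (6, '1/2/3/4/6/', 4);
    INSERT INTO Product VALUES
        (1, 4, '2010-02-03 12:09:24'), (2, 4, '2010-02-03 12:09:29'),
        (3, 3, '2010-02-03 12:09:36'), (4, 1, '2010-02-03 12:09:39'),
        (5, 3, '2010-02-03 12:09:50'),
        -- hypothetical extra rows, not in the original sample data
        (6, 6, '2010-02-03 12:10:00'), (7, 5, '2010-02-03 12:10:05');
""")


def latest_per_branch(conn, cat_id, per_branch=10):
    """Return (prodId, branch, createdOn) rows: the newest `per_branch`
    products from the subtree of each direct subcategory of `cat_id`."""
    sql = """
        SELECT prodId, branch, createdOn
        FROM (
            SELECT p.prodId, p.createdOn, sub.catId AS branch,
                   ROW_NUMBER() OVER (
                       PARTITION BY sub.catId
                       ORDER BY p.createdOn DESC, p.prodId DESC
                   ) AS rn
            FROM Category sub
            JOIN Category_Hierarchy sh ON sh.catId = sub.catId
            -- every category whose trail starts with the branch root's trail
            JOIN Category_Hierarchy ph ON ph.parentTrail LIKE sh.parentTrail || '%'
            JOIN Product p ON p.catId = ph.catId
            WHERE sub.parentCatId = ?
        ) AS ranked
        WHERE rn <= ?
        ORDER BY branch, createdOn DESC
    """
    return conn.execute(sql, (cat_id, per_branch)).fetchall()


rows = latest_per_branch(conn, 3, per_branch=2)
print(rows)
```

Products attached directly to the category passed in (products 3 and 5 for catId 3 here) are left out, because only the subtrees of its direct subcategories are ranked.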
