开发者

SQL - nested set model - how do I find the deepest parents (not leaf nodes) of each branch

开发者 https://www.devze.com 2023-02-09 23:55 出处:网络
Given a tree like this: A----B---------C----D E----FG H I need to find C and E (the two deepest nodes of each unique branch; A-C and A-E)

Given a tree like this:

A----B---------C----D
     |         |
     E----F    G
     |
     H

I need to find C and E (the two deepest nodes of each unique branch; A-C and A-E)

Our database uses the nested set model, as well as the adjacency list as a backup to maintain tree structure in case the left and right values get out of synch.

I can exclude al开发者_如何学运维l leaf nodes ( rgt = lft + 1 ) and the root node ( lft = 1 ) which leaves me with B E C. As you can imagine this is a greatly simplified example, some of our trees have over 100 nodes. How can I get rid of this noise in my data?

Here's the data for the example above if it were stored in our database.

 node | parent | lft | rgt |
------+--------+-----+-----+
   A  |  NULL  |    1|   16|
   B  |    A   |    2|   15|
   E  |    B   |    3|    8|
   F  |    E   |    4|    5|
   H  |    E   |    6|    7|
   C  |    B   |    9|   14|
   D  |    C   |   10|   11|
   G  |    C   |   12|   13|  

Thanks for your help!


You're right, the first step is to identify the leaf nodes through their property right equals left + 1. Next step is to find all parent nodes for those leaf nodes: these have a left smaller than that of the leaf and right larger than that of the leaf. And the final step is to exclude all parents but the one that has the largest left value (i.e. is closest to the leaf node).

An example in T-SQL, where l is the leaf node, p1 is the direct parent we're looking for and p2 is used to weed out all the non-direct parents.

First set up a test table with your example data in it:

if object_id('tempdb..#nsm') is not null
    drop table #nsm;

select v.node, v.parent, v.lft, v.rgt
into #nsm
from (
    values 
       ('A'  ,     NULL,     1,   16),
       ('B'  ,    'A'   ,    2,   15),
       ('E'  ,    'B'   ,    3,    8),
       ('F'  ,    'E'   ,    4,    5),
       ('H'  ,    'E'   ,    6,    7),
       ('C'  ,    'B'   ,    9,   14),
       ('D'  ,    'C'   ,   10,   11),
       ('G'  ,    'C'   ,   12,   13)
   ) v (node, parent, lft, rgt)

And here's the query that retrieves the both nodes C and E you requested:

select p1.node, p1.parent, p1.lft, p1.rgt
from #nsm p1
where exists (
            select *
            from #nsm l
            where l.rgt = l.lft + 1
                and p1.lft < l.lft
                and p1.rgt > l.rgt
                and not exists (
                        select *
                        from #nsm p2
                        where p2.lft < l.lft
                            and p2.rgt > l.rgt
                            and p2.lft > p1.lft
                    )
        )
0

精彩评论

暂无评论...
验证码 换一张
取 消