无来

不管你来还是不来
我都在这里,夜夜点亮
不是为了守候
只是为了做好我自己

0%

预排序遍历树算法(modified preorder tree traversal algorithm)

本文是我学习MySQL官方教程Managing Hierarchical Data in MySQL的笔记 多层数据结构估计所有的web开发者估计都不会陌生,各种软件的分类都是基于多层结构来设计的。 下面是一个典型的多层数据结构示意图:

thum-be1b6320354e5c0879737b855764e9d920100326152818

相关创建数据语句:

CREATE TABLE category(
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
parent INT DEFAULT NULL);

INSERT INTO category
VALUES(1,'ELECTRONICS',NULL),(2,'TELEVISIONS',1),(3,'TUBE',2),
(4,'LCD',2),(5,'PLASMA',2),(6,'PORTABLE ELECTRONICS',1),
(7,'MP3 PLAYERS',6),(8,'FLASH',7),
(9,'CD PLAYERS',6),(10,'2 WAY RADIOS',6);

SELECT * FROM category ORDER BY category_id;

在这种数据结构中,各层之间通过字段 parent 来形成邻接表,我们查询某些层级的关系的时候一般都是通过递归的方式,遍历某个层级关系的SQL的查询次数会顺着层级的增加,想想在层级有20的时候,根据某个底层节点取它到顶层节点的查询次数吧。

为了解决这个问题,人们想出了嵌套集模型(The Nested Set Model),请看下图:

thum-90ef2a4a3357d5de8e8e0d1bc026294820100326152302

上图依然是表现的与图一相同的层级关系,但是却更换了一种表现形式 下面是新的关系表和数据(关系和数据与之前相同,但是表结构不一样):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
CREATE TABLE nested_category (
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
lft INT NOT NULL,
rgt INT NOT NULL,
deleted TINYINT(8) NOT NULL DEFAULT '0',
);

INSERT INTO nested_category
VALUES(1,'ELECTRONICS',1,20),(2,'TELEVISIONS',2,9),(3,'TUBE',3,4),
(4,'LCD',5,6),(5,'PLASMA',7,8),(6,'PORTABLE ELECTRONICS',10,19),
(7,'MP3 PLAYERS',11,14),(8,'FLASH',12,13),
(9,'CD PLAYERS',15,16),(10,'2 WAY RADIOS',17,18);

SELECT * FROM nested_category ORDER BY category_id;

这里将 left,right 修改为 lft,rgt因为这两个词在MYSQL中属于关键字 下面我们将插入的数据标识在图上: thum-a618905c95b8d6aa42cf4c29f7d0546a20100326152746 同样,我们将数据标识在原来的结构上: thum-be1b6320354e5c0879737b855764e9d920100326152818

怎么样,是不是很明确了

下面使我自己标定一种形式,方便理解

[1
      [2
           [3 4]
           [5 6]
           [7 8]
      9]
      [10
           [11
                 [12 13]
           14]
           [15 16]
           [17 18]
      19]
20]

遍历整个树,查询子集 条件:__ 左边 > 父级L,右边 < 父级R __

1
2
3
4
5
6
SELECT node.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = 'ELECTRONICS'
ORDER BY node.lft;
  • 查询所有无分支的节点 条件: __ 右边 = 左边L + 1 __

    SELECT name FROM nested_category WHERE rgt = lft + 1;
  • 查询某个字节点到根节点的路径

    1
    2
    3
    4
    5
    6
    SELECT parent.name
    FROM nested_category AS node, nested_category AS parent
    WHERE
    node.lft BETWEEN parent.lft AND parent.rgt AND
    node.name = 'FLASH'
    ORDER BY parent.lft;
  • 查询节点的深度

    1
    2
    3
    4
    5
    SELECT node.name, (COUNT(parent.name) - 1) AS depth
    FROM nested_category AS node, nested_category AS parent
    WHERE node.lft BETWEEN parent.lft AND parent.rgt
    GROUP BY node.name
    ORDER BY node.lft;
  • 查询子节点的深度

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM
nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
( SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node, nested_category AS parent
WHERE
node.lft BETWEEN parent.lft AND parent.rgt AND
node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name ORDER BY node.lft
) AS sub_tree

WHERE
node.lft BETWEEN parent.lft AND parent.rgt AND
node.lft BETWEEN sub_parent.lft AND sub_parent.rgt AND
sub_parent.name = sub_tree.name
GROUP BY node.name
ORDER BY node.lft;
  • 插入新节点 算法详解:
  1. 所有分类 左边和右边的值 > 插入节点的左边节点记录的右值 的全部 + 2
  • 插入节点 左值 = 插入位置左边节点记录的右值 + 1, 右值 = 插入位置左边节点记录的右值 + 2

    例子: 在 R = 9(L8, R9)与 L = 10(L10,R11) 节点之间插入一个新节点 那么所有 左值 和 右值 > 9 的节点的左值和右值需要 + 2
    例如新节点右边的节点(L10,R11)左值右值都需要 + 2 那么插入后的新值为 L12 R13 新节点的左值为 9 + 1 = 10 右值为 9 + 2 = 11 SQL语句实现

1
2
3
4
5
6
7
8
LOCK TABLE nested_category WRITE;
SELECT @myRight := rgt FROM nested_category
WHERE name = 'TELEVISIONS';
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;
INSERT INTO nested_category(name, lft, rgt)
VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);
UNLOCK TABLES;
  • 删除新节点 删除节点的算法与添加一个节点的算法相反
  • 删除一个没有子节点的节点
1
2
3
4
5
6
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'GAME CONSOLES';
DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;
UNLOCK TABLES;
  • 删除一个分支节点和它所有的子节点
1
2
3
4
5
6
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'MP3 PLAYERS';
DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;
UNLOCK TABLES;
  • 删除一个节点后移动其字节点到
1
2
3
4
5
6
7
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'PORTABLE ELECTRONICS';
DELETE FROM nested_category WHERE lft = @myLeft;
UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;
UNLOCK TABLES;

补充:移动一个分类

将分类5移动到分类9下面

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE category_id = 5;
UPDATE nested_category SET deleted=2 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight AND deleted=0;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight AND deleted=0;

SELECT @destLeft := lft,@destRight := rgt FROM nested_category WHERE category_id = 9;

UPDATE nested_category SET rgt = rgt + @myWidth WHERE rgt > @destRight AND deleted=0;
UPDATE nested_category SET lft = lft + @myWidth WHERE lft > @destRight AND deleted=0;

UPDATE nested_category SET rgt = rgt -(@myLeft-@destRight) WHERE deleted=2;
UPDATE nested_category SET lft = lft -(@myLeft-@destRight) WHERE deleted=2;
UPDATE nested_category SET rgt = rgt + @myWidth WHERE category_id = 9;
UPDATE nested_category SET deleted=0 WHERE deleted=2;
UNLOCK TABLES;

将分类5移动到分类9前面

1
2
3
4
5
6
7
8
9
10
11
12
13
14
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE  category_id = 5;
UPDATE nested_category SET deleted=2 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight AND deleted=0;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight AND deleted=0;

SELECT @destLeft := lft,@destRight := rgt FROM nested_category WHERE category_id = 9;

UPDATE nested_category SET rgt = rgt + @myWidth WHERE rgt >= @destLeft AND deleted=0;
UPDATE nested_category SET lft = lft + @myWidth WHERE lft >= @destLeft AND deleted=0;

UPDATE nested_category SET rgt = rgt -(@myLeft-@destRight) WHERE deleted=2;
UPDATE nested_category SET lft = lft -(@myLeft-@destRight) WHERE deleted=2;
UPDATE nested_category SET deleted=0 WHERE deleted=2;
UNLOCK TABLES;

总结:

预排序遍历树算法的核心就是牺牲了写的性能来换取读取的性能 在你的开发的应用遇到此类问题的时 (读压力 > 写压力),尝试下使用预排序遍历树算法来提高你的程序的性能吧。