B-树原理与 Java 模拟实现详解

B-树基础概念

B-树是一种平衡的多路查找树，通常用于数据库和文件系统的索引结构。它通过增加每个节点的关键字数量来减少树的高度，从而显著降低磁盘 I/O 次数。

核心特性

根节点：至少有两个孩子（除非它是叶子节点）。
非根节点：至少有 ceil(M/2) - 1 个关键字，至多有 M - 1 个关键字，且按升序排列。
子节点数：每个非根节点至少有 ceil(M/2) 个孩子，至多有 M 个孩子。
区间约束：关键字 key[i] 和 key[i+1] 之间的子树值介于两者之间。
平衡性：所有叶子节点都在同一层，通过分裂和合并保持平衡。

插入操作分析

以 M=3 的 B-树为例，每个节点最多存 2 个关键字。为了方便理解，我们允许节点在插入第三个关键字时暂时存储，随后立即触发分裂。

插入流程

假设我们要插入序列 [53, 139, 75, 49, 145, 36, 101]。

初始插入：若树为空，直接创建根节点。
查找位置：从根节点向下遍历，找到合适的叶子节点位置。注意，B-树的插入一定发生在叶子节点。
插入元素：在找到的节点中按顺序插入新关键字。
检查溢出：如果节点关键字数量达到上限（M），则必须分裂。

文章配图

当节点满员时，我们需要将中间关键字提升到父节点，剩余部分分裂成两个新节点。如果父节点也满了，这个过程会递归向上，直到根节点分裂或不再需要分裂。

文章配图

Java 模拟实现

下面给出完整的 Java 代码实现。为了清晰展示逻辑，我们将辅助类 Pair 和主类 MyBtree 分开定义。

数据结构定义

public class Pair<K,V> {
    private K key;
    private V val;
    
    public Pair(K key, V val) {
        this.key = key;
        this.val = val;
    }
    
    public K getKey() { return key; }
    public void setKey(K key) { this.key = key; }
    public V getVal() { return val; }
    public void setVal(V val) { this.val = val; }
}

核心逻辑

1. 查找插入位置

为什么返回 Pair<BTRNode, Integer>？因为我们需要区分两种情况：一是找到了已存在的 Key，二是找到了插入位置。仅返回节点无法区分，仅返回下标无法定位节点。使用 Pair 可以一次性获取节点引用和在该节点中的索引。

private Pair<BTRNode, Integer> find(int key) {
    BTRNode cur = root;
    BTRNode parent = null;
    while (cur != null) {
        int i = 0;
        while (i < cur.usedSize) {
            if (cur.keys[i] == key) {
                // 存在该元素，返回节点及下标
                return new Pair<>(cur, i);
            } else if (cur.keys[i] < key) {
                i++;
            } else {
                break;
            }
        }
        parent = cur;
        cur = cur.subs[i];
    }
    // 未找到，返回父节点及 -1 表示插入位置
    return new Pair<>(parent, -1);
}

2. 插入与分裂

插入时先判断是否重复，再寻找位置。如果节点未满直接插入；如果满了，调用 split 方法。

public boolean insert(int key) {
    // 空树处理
    if (root == null) {
        root = new BTRNode();
        root.keys[0] = key;
        root.usedSize++;
        return true;
    }

    // 查找是否存在
    Pair<BTRNode, Integer> pair = find(key);
    if (pair.getVal() != -1) {
        return false; // 已存在
    }

    // 插入到叶子节点
    BTRNode parent = pair.getKey();
    int index = parent.usedSize - 1;
    for (; index >= 0; index--) {
        if (parent.keys[index] >= key) {
            parent.keys[index + 1] = parent.keys[index];
        } else {
            break;
        }
    }
    parent.keys[index + 1] = key;
    parent.usedSize++;

    // 检查是否需要分裂
    if (parent.usedSize < M) {
        return true;
    } else {
        split(parent);
        return true;
    }
}

3. 节点分裂

这是最复杂的部分。需要将当前节点的中间元素上提，右侧元素和新节点分离，并更新父指针。

private void split(BTRNode cur) {
    BTRNode newNode = new BTRNode();
    BTRNode parent = cur.parent;
    
    // 确定中间位置
    int mid = cur.usedSize >> 1;
    int i = mid + 1;
    int j = 0;
    
    // 移动数据和子节点到新节点
    for (; i < cur.usedSize; i++) {
        newNode.keys[j] = cur.keys[i];
        newNode.subs[j] = cur.subs[i];
        if (newNode.subs[j] != null) {
            newNode.subs[j].parent = newNode;
        }
        j++;
    }
    // 多拷贝一次子节点指针
    newNode.subs[j] = cur.subs[i];
    if (newNode.subs[j] != null) {
        newNode.subs[j].parent = newNode;
    }
    
    newNode.usedSize = j;
    cur.usedSize = cur.usedSize - j - 1;
    
    // 特殊处理根节点分裂
    if (cur == root) {
        root = new BTRNode();
        root.keys[0] = cur.keys[mid];
        root.subs[0] = cur;
        root.subs[1] = newNode;
        root.usedSize = 1;
        cur.parent = root;
        newNode.parent = root;
        return;
    }
    
    // 更新父节点
    newNode.parent = parent;
    int endT = parent.usedSize - 1;
    int midVal = cur.keys[mid];
    
    for (; endT >= 0; endT--) {
        if (parent.keys[endT] >= midVal) {
            parent.keys[endT + 1] = parent.keys[endT];
            parent.subs[endT + 2] = parent.subs[endT + 1];
        } else {
            break;
        }
    }
    parent.keys[endT + 1] = midVal;
    parent.subs[endT + 2] = newNode;
    parent.usedSize++;
    
    // 递归检查父节点是否也需要分裂
    if (parent.usedSize >= M) {
        split(parent);
    }
}

4. 中序遍历

验证数据有序性的关键方法。

private void inorder(BTRNode root) {
    if (root == null) return;
    for (int i = 0; i < root.usedSize; ++i) {
        inorder(root.subs[i]);
        System.out.println(root.keys[i]);
    }
    inorder(root.subs[root.usedSize]);
}

完整测试

public static void main(String[] args) {
    MyBtree myBtree = new MyBtree();
    int[] array = {53, 139, 75, 49, 145, 36, 101};
    for (int i = 0; i < array.length; i++) {
        myBtree.insert(array[i]);
    }
    myBtree.inorder(myBtree.root);
}

运行结果将输出有序序列，证明插入逻辑正确。

文章配图

B-树的删除

删除操作相对复杂，主要涉及借位和合并。

非叶子节点删除：用后继关键字覆盖待删关键字，然后在后继所在子树中递归删除。
叶子节点不足：如果删除后节点关键字少于下限，尝试向兄弟借位。
合并：如果兄弟也无多余关键字，则将父节点关键字下移，与当前节点及兄弟节点合并。

文章配图

变种对比：B+ 树与 B* 树

B+ 树

B+ 树是 B-树的改进版，广泛用于数据库索引。

优势：所有数据都在叶子节点，形成有序链表，适合范围查询；非叶子节点只存索引，能容纳更多键值，树更矮。
场景：MySQL、PostgreSQL 等关系型数据库。

文章配图

B* 树

B* 树进一步提高了空间利用率。

特点：非叶子节点关键字下限提高至 2/3M，分裂时尽量分配给兄弟而非新建节点。
优势：更高的空间利用率和更好的平衡性。

总结

B-树通过平衡多路搜索有效解决了大规模数据的存储与检索问题。其核心在于通过控制节点大小来平衡树高，减少磁盘 I/O。在实际工程中，虽然 B+ 树更为常见，但理解 B-树的分裂与合并机制是掌握底层索引技术的基石。

B-树基础概念

核心特性

根节点：至少有两个孩子（除非它是叶子节点）。
非根节点：至少有 ceil(M/2) - 1 个关键字，至多有 M - 1 个关键字，且按升序排列。
子节点数：每个非根节点至少有 ceil(M/2) 个孩子，至多有 M 个孩子。
区间约束：关键字 key[i] 和 key[i+1] 之间的子树值介于两者之间。
平衡性：所有叶子节点都在同一层，通过分裂和合并保持平衡。

插入操作分析

以 M=3 的 B-树为例，每个节点最多存 2 个关键字。为了方便理解，我们允许节点在插入第三个关键字时暂时存储，随后立即触发分裂。

插入流程

假设我们要插入序列 [53, 139, 75, 49, 145, 36, 101]。

初始插入：若树为空，直接创建根节点。
查找位置：从根节点向下遍历，找到合适的叶子节点位置。注意，B-树的插入一定发生在叶子节点。
插入元素：在找到的节点中按顺序插入新关键字。
检查溢出：如果节点关键字数量达到上限（M），则必须分裂。

文章配图

Java 模拟实现

下面给出完整的 Java 代码实现。为了清晰展示逻辑，我们将辅助类 Pair 和主类 MyBtree 分开定义。

数据结构定义

public class Pair<K,V> {
    private K key;
    private V val;
    
    public Pair(K key, V val) {
        this.key = key;
        this.val = val;
    }
    
    public K getKey() { return key; }
    public void setKey(K key) { this.key = key; }
    public V getVal() { return val; }
    public void setVal(V val) { this.val = val; }
}

核心逻辑

1. 查找插入位置

private Pair<BTRNode, Integer> find(int key) {
    BTRNode cur = root;
    BTRNode parent = null;
    while (cur != null) {
        int i = 0;
        while (i < cur.usedSize) {
            if (cur.keys[i] == key) {
                // 存在该元素，返回节点及下标
                return new Pair<>(cur, i);
            } else if (cur.keys[i] < key) {
                i++;
            } else {
                break;
            }
        }
        parent = cur;
        cur = cur.subs[i];
    }
    // 未找到，返回父节点及 -1 表示插入位置
    return new Pair<>(parent, -1);
}

2. 插入与分裂

插入时先判断是否重复，再寻找位置。如果节点未满直接插入；如果满了，调用 split 方法。

public boolean insert(int key) {
    // 空树处理
    if (root == null) {
        root = new BTRNode();
        root.keys[0] = key;
        root.usedSize++;
        return true;
    }

    // 查找是否存在
    Pair<BTRNode, Integer> pair = find(key);
    if (pair.getVal() != -1) {
        return false; // 已存在
    }

    // 插入到叶子节点
    BTRNode parent = pair.getKey();
    int index = parent.usedSize - 1;
    for (; index >= 0; index--) {
        if (parent.keys[index] >= key) {
            parent.keys[index + 1] = parent.keys[index];
        } else {
            break;
        }
    }
    parent.keys[index + 1] = key;
    parent.usedSize++;

    // 检查是否需要分裂
    if (parent.usedSize < M) {
        return true;
    } else {
        split(parent);
        return true;
    }
}

3. 节点分裂

这是最复杂的部分。需要将当前节点的中间元素上提，右侧元素和新节点分离，并更新父指针。

private void split(BTRNode cur) {
    BTRNode newNode = new BTRNode();
    BTRNode parent = cur.parent;
    
    // 确定中间位置
    int mid = cur.usedSize >> 1;
    int i = mid + 1;
    int j = 0;
    
    // 移动数据和子节点到新节点
    for (; i < cur.usedSize; i++) {
        newNode.keys[j] = cur.keys[i];
        newNode.subs[j] = cur.subs[i];
        if (newNode.subs[j] != null) {
            newNode.subs[j].parent = newNode;
        }
        j++;
    }
    // 多拷贝一次子节点指针
    newNode.subs[j] = cur.subs[i];
    if (newNode.subs[j] != null) {
        newNode.subs[j].parent = newNode;
    }
    
    newNode.usedSize = j;
    cur.usedSize = cur.usedSize - j - 1;
    
    // 特殊处理根节点分裂
    if (cur == root) {
        root = new BTRNode();
        root.keys[0] = cur.keys[mid];
        root.subs[0] = cur;
        root.subs[1] = newNode;
        root.usedSize = 1;
        cur.parent = root;
        newNode.parent = root;
        return;
    }
    
    // 更新父节点
    newNode.parent = parent;
    int endT = parent.usedSize - 1;
    int midVal = cur.keys[mid];
    
    for (; endT >= 0; endT--) {
        if (parent.keys[endT] >= midVal) {
            parent.keys[endT + 1] = parent.keys[endT];
            parent.subs[endT + 2] = parent.subs[endT + 1];
        } else {
            break;
        }
    }
    parent.keys[endT + 1] = midVal;
    parent.subs[endT + 2] = newNode;
    parent.usedSize++;
    
    // 递归检查父节点是否也需要分裂
    if (parent.usedSize >= M) {
        split(parent);
    }
}

4. 中序遍历

验证数据有序性的关键方法。

private void inorder(BTRNode root) {
    if (root == null) return;
    for (int i = 0; i < root.usedSize; ++i) {
        inorder(root.subs[i]);
        System.out.println(root.keys[i]);
    }
    inorder(root.subs[root.usedSize]);
}

完整测试

public static void main(String[] args) {
    MyBtree myBtree = new MyBtree();
    int[] array = {53, 139, 75, 49, 145, 36, 101};
    for (int i = 0; i < array.length; i++) {
        myBtree.insert(array[i]);
    }
    myBtree.inorder(myBtree.root);
}

运行结果将输出有序序列，证明插入逻辑正确。

文章配图

B-树的删除

删除操作相对复杂，主要涉及借位和合并。

非叶子节点删除：用后继关键字覆盖待删关键字，然后在后继所在子树中递归删除。
叶子节点不足：如果删除后节点关键字少于下限，尝试向兄弟借位。
合并：如果兄弟也无多余关键字，则将父节点关键字下移，与当前节点及兄弟节点合并。

文章配图

变种对比：B+ 树与 B* 树

B+ 树

B+ 树是 B-树的改进版，广泛用于数据库索引。

优势：所有数据都在叶子节点，形成有序链表，适合范围查询；非叶子节点只存索引，能容纳更多键值，树更矮。
场景：MySQL、PostgreSQL 等关系型数据库。

文章配图

B* 树

B* 树进一步提高了空间利用率。

特点：非叶子节点关键字下限提高至 2/3M，分裂时尽量分配给兄弟而非新建节点。
优势：更高的空间利用率和更好的平衡性。

B-树原理与 Java 模拟实现详解

B-树基础概念

核心特性

插入操作分析

插入流程

Java 模拟实现

数据结构定义

核心逻辑

1. 查找插入位置

2. 插入与分裂

3. 节点分裂

4. 中序遍历

完整测试

B-树的删除

变种对比：B+ 树与 B* 树

B+ 树

B* 树

总结

B-树原理与 Java 模拟实现详解

B-树基础概念

核心特性

插入操作分析

插入流程

Java 模拟实现

数据结构定义

核心逻辑

1. 查找插入位置

2. 插入与分裂

3. 节点分裂

4. 中序遍历

完整测试

B-树的删除

变种对比：B+ 树与 B* 树

B+ 树

B* 树

总结

更多推荐文章

相关免费在线工具

更多推荐文章

相关免费在线工具

B-树原理与 Java 模拟实现详解

B-树基础概念

核心特性

插入操作分析

插入流程

Java 模拟实现

数据结构定义

核心逻辑

1. 查找插入位置

2. 插入与分裂

3. 节点分裂

4. 中序遍历

完整测试

B-树的删除

变种对比：B+ 树与 B* 树

B+ 树

B* 树

总结

B-树原理与 Java 模拟实现详解

B-树基础概念

核心特性

插入操作分析

插入流程

Java 模拟实现

数据结构定义

核心逻辑

1. 查找插入位置

2. 插入与分裂

3. 节点分裂

4. 中序遍历

完整测试

B-树的删除

变种对比：B+ 树与 B* 树

B+ 树

B* 树

总结

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具