B-trees are
tree data structures that are most commonly found in
databases and
filesystem implementations. B-trees keep data sorted and allow amortized constant time insertion and deletions. Conceptually speaking B-trees grow from the bottom up as elements are inserted, whereas most
binary trees generally grow down.
The idea behind B-trees is that inner nodes can have a variable number of child nodes within some pre-defined range. This causes B-trees to not need re-balancing frequently, unlike AVL trees. The lower and upper bounds on the number of child nodes are fixed for a particluar implementation. For example, in a 2-3 B-tree (often simply 2-3 tree), each node may have only 2 or 3 child nodes. A node is considered to be in an illegal state if it has an invalid number of child nodes.
See also: Red-Black tree
Inner node structures
Generally speaking, the "separation values" can simply be the values of the tree.
Each inner node has separation values which divide its sub-trees. For example, if an inner node has 3 child nodes (or sub-trees) then it must have 2 separation values a1 and a2. All values less than a1 will be in the leftmost sub-tree, values between a1 and a2 will be in the middle sub-tree, and values greater than a2 will be in the rightmost sub-tree.
Steps for Deletion
- If after removing the desired node, no inner node is in an illegal state then the process is finished.
- If some inner node is in an illegal state then there are two possible cases:
- Its sibling node (a child of the same parent node) can transfer one of its child nodes to the current node and return it to a legal state. If so, after updating the separation values in the parent and the two siblings the operation ends.
- Its sibling does not have an extra child because it is on the lower bound too. In that case both these nodes are merged into a single node and the action is transferred to the parent node, since it has had a child node removed.
The process continues until the parent node remains in a legal state or until the root node is reached.
Steps for Insertion
- If after inserting the node into the appropriate position, no inner node is in an illegal state then the process is finished.
- If some node has more than the maximum amount of child nodes then it is split into two nodes, each with the minimum amount of child nodes. This process continues action recursivly in the parent node.
The action stops when either the node is in a legal state or the root is split into two nodes and a new root is inserted.
Searching
Searching is performed very similar to a binary tree search, simply by following the separation values until the value is found or the end of the tree is reached.
Notes
- The root node of the tree is a special node which can have from 2 to upper bound sons. (I'm not sure what is meant by this note, so I can't fix it.)(Added indented notes below to clarify.)
- The root node must obey the upper bound rule i.e. all nodes shall not have more than upper bound sons.
- The root node is exempt from the lower bound rule i.e. all non-root nodes shall have no fewer than lower bound sons.
- The root node's lower bound is 2 and shall have no fewer than 2 sons.
- A root node with 1 son is invalid and redundant i.e. information provided by a root node with 1 son is the same information provided by the son.
- A root node with no son is invalid.
- An empty tree is typically represented by the absence of the root node.
- Robert Tarjan proved that the amortized number of splits/merges is 2.
External Links
All Wikipedia text
is available under the
terms of the GNU Free Documentation License