B-tree Class


Topics:

Overview
B-tree Node Order
Memory Requirements
Concurrent B-tree Operations
Tuning Parameters
Conditional Directives
Constants
Type Definitions
Enumerations
Headers
Functions


Overview

A B-tree (Balance multi-way tree) is a special type of tree designed for fast disk-based insert, delete, and search operations. B-trees comprise a collection of multi-way nodes arranged in sort order. Multi-way nodes are used to store a left child pointer and multiple entry keys, with smaller keys on the left and larger keys on the right. Each entry key stores some type of data and right child pointer. The number of entry keys per node is determined by the node order. By definition B-trees are balanced, meaning that all nodes, except the root node, must be at least half full following an insertion or deletion. A single insertion can cause a number of splits all the way to the root node. The excessive node splitting that can occur following insertions and deletions complicate non-serialized concurrent operations.

The gxBtree class is a non-recursive disk-base B-tree used by various database applications for searching and sorting structured data and indexing associated objects or records stored in a data file. The gxBtree uses the 32/64-bit gxDatabase engine to perform all disk operations, supporting large files and proprietary file system interfaces. This is a non-recursive B-tree implementation designed to index any type of data regardless of the key type. It was originally designed to index multiple data types in read-only data files with 10M to 100M entries per index. Concurrent operations can be supported directly by serializing access to the entire tree or specific nodes during key insertions and key deletions. Additionally, concurrent operations can be supported using unbalanced deletions and file synchronization functions.

The gxBtree class uses three primary classes:

DatabaseKeyB - Abstract base class used to define database key types and the methods by which database keys are sorted. The key type and comparison operators are defined in the derived class allowing the B-tree to index multiple data types.

BtreeNode - B-tree node class used to manipulate a preset order of keys values in memory and on disk. B-tree nodes are stored on disk and manipulated in memory for fast disk-based operations.

BtreeStack - Stack used in place of the processor stack to store node addresses during non-recursive insertions and deletions. Eliminating recursive insertions and deletions allow a single B-tree to grow extremely large and maintain stability.


B-tree Node Order

The B-tree node order determines the size of node and will affect the overall performance of the B-tree depending on the key type. The B-tree node memory requirements and disk read/write time per node can be calculated by multiplying the node order minus one by the entry key size. The total size of an entry key is the size of the database key type plus the size of the right child pointer (4 bytes for 32-bit offsets or 8 bytes for 64-bit offsets). The total overhead per node is equal to the key count (4 bytes) plus the size of the node's left child pointer (4 bytes for 32-bit offsets or 8 bytes for 64-bit offsets).

NodeOverhead = KeyCount + LeftChildPointer
NodeSize = (NodeOrder-1 * (KeySize + RightChildPointer)) + NodeOverhead

The gxBtree class manipulates nodes by loading each node into memory and writing nodes back to disk each time changes are made to the node. If the key size if relatively large then a big node order will increase the memory requirements for the node and the disk read/write times per node. Take for examples a B-tree with a key type consisting of a 64 byte fixed string with a node order of 7. A node order of 7 means that each node can have a maximum of 7 children and 6 entry keys. In this example all nodes other then the root node must have a minimum of 3 keys in order to meet the B-tree balancing requirements. Changing the node order will effect the node size and the height of the tree. A smaller node order will increase the height of the tree but will be more efficient during batch insertions because the of the smaller node size.

Node order 7 = 416 bytes per node (32-bit offsets)
Node order 128 = 8644 bytes per node (32-bit offsets)
Node order 256 = 17348 bytes per node (32-bit offsets)

As the size of the key type increases a smaller node order will lessen the memory requirements per node and decrease the disk read/write times but increase the highest of the tree. As you work with the B-tree class you might want to start with a node order of 7 (with larger data keys) and adjust the node order and possibly the key size according to batch insertion times. The key size is determined by the key class derived from the abstract DatabaseKeyB class. This implementation allows the B-tree to operate independently of actual key type and allows you to determine how the keys are searched and sorted.


Memory Requirements

The gxBtree memory utilization is preset based on the node order, file offset size, and key size. The gxBtree class is a non-recursive B-tree implementation designed to support multi-threaded applications. Some of the original designs included a recursive B-tree, a cached recursive B-tree and a cached non-recursive B-tree. Although recursion is very fast it proved to be unstable when the B-tree grew extremely large and resulted in data loss that was very difficult to troubleshoot. Disk caching would seem the best method to increase B-tree disk performance but the performance increase was negligible compared to the amount of overhead your application would have to absorb. Disk caching also complicates concurrent operations that are by nature extremely difficult to synchronize in any B-tree implementation. The end result was the non-recursive B-tree that does not use disk caching by default. Since disk caching requires the B-tree nodes to be stored in memory caching works best on systems with several megabytes of available memory. On low memory systems disk caching degrades performance severely if virtual memory or swap space is required. In non-cached implementations the amount of RAM used depends on the tree height and file offset size (4 bytes for 32-bit offsets or 8 bytes for 64-bit offsets). Since the B-tree is non-recursive it uses an explicit stack to store file offsets. The number of file offsets stored during any insert or delete operation depends on the height of tree.

Since the actual nodes are not stored in memory nodes are read from disk as the stack unwinds. Although this decreases B-tree performance during batch insertions it makes supporting multithreaded applications more achievable. Plus disk caching can degrade performance when large entry keys are used. The non-cached implementation allows you to insert millions of nodes using both high and low end hardware without any data loss and provides extremely fast search times and a multithreaded infrastructure.


Concurrent B-tree Operations

Concurrent B-tree operations are extremely difficult to synchronize compared to an non-indexed data file. Locking individual records in an non-indexed data file is simple. The locking protocol allows a single thread to write lock a record and multiple threads to read lock a record. If a record is write locked then it cannot be read or modified until the thread holding the record releases it. If a record is read locked then up to 4 billion threads can read the record but the record cannot be modified until all the threads reading the record have released it. Applying a record locking protocol to a B-tree algorithm is far more complicated.

By definition a B-tree is blanched meaning all the nodes, except for the root node, must be at least half full following an insertion or deletion. A single insertion can result in several node splits and merges that propagate all the way to the root node, which may be re-grown. If you ease the balancing requires to support record locking the B-tree perform degrades to the point where is it unusable. The safest and most obvious solution is to serialize access to the entire B-tree during an insert or delete operation. A B-tree node-locking scheme involves finding the insertion or deletion point and calculating how many splits, merges, or rotations would occur and if any of those nodes are currently locked. If multiple nodes will be affected they could be write locked provided that another thread has not locked any of the nodes in the path of a balanced insertion or deletion. This may also result in every node in the B-tree being locked, which essentially locks the entire file. In the end this results in a tremendous amount of addition overhead and pointer manipulation. If you add node caching to the B-tree algorithm concurrent operations may not be obtainable without serializing access to the entire tree.

The non-recursive gxBtree implementation was originally designed to index very large data sets where the database was built by a single process and read by multiple threads. In order to optimize concurrent B-tree operations a lazy delete function and several built-in synchronization functions were added. The lazy delete function allows multiple threads to delete keys without balancing the tree and should only be used if the number of insertions heavily outweigh the number of deletions. The synchronization functions work by flushing any open disk buffers following an insert or delete and testing the B-tree header prior to performing an insert, delete, or find operation. The synchronization functions work best for multiple instances of single user applications. For example, an address book program where the user has launch two or three instances of the same application. To increase performance, speed wise, all the synchronization functions can be disabled.

A truly "thread safe" B-tree may ultimately be application dependent or bound to a specific data type. The gxBtree has an underlying infrastructure (file locking, node locking, lazy deletes, and synchronization functions) that allows it to support concurrent operations but it is not an out-of-the-box thread safe B-tree. The performance costs of serializing access to the entire B-tree can be offset by placing all threads in priority read and write queues. A more elaborate strategy would be to record lock the data file and file lock the indexing subsystem.


Tuning Parameters

You can increase the gxBtree's performance by modifying the 32/64-bit database engine and B-tree class. By default there are several flushing operations implemented to support multiple file access and reduce the possibility of data loss.

You can turn off the flushing operations and bit testing by changing the following default parameters to zero or setting the "flush" variable to zero before calling the function:

gxDatabaseError gxDatabase::Write(const void *buf, __ULWORD__ bytes, FAU file_address = gxCurrAddress, int flush = 1, int bit_test = 1);

int gxBtree::Insert(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1);
int gxBtree::Delete(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1);
int gxBtree::LazyDelete(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1);

You can also eliminate some additional overhead by changing revision level:

gxDatabaseError gxDatabase::Create(const char *fname, FAU static_size =(FAU_t)0, __SBYTE__ RevisionLetter = gxDatabaseRevisionLetter);

The version letter will determine the amount of overhead per database block. The 32/64-bit database engine allows you to mix revision letters as needed. This feature was added specifically for the B-tree algorithm because the data sets can be so large that even the slightest bit of overhead could increase the file size tremendously. Revision A, B, and C add features to maintain optimum data protection. Revision 0 offers the least amount of overhead per block.

The B-tree search functions can be optimized by disabling tree testing when a search function is called:

int gxBtree::Find(DatabaseKeyB &key, DatabaseKeyB &compare_key, int test_tree = 1)
int gxBtree::FindFirst(DatabaseKeyB &key, int test_tree = 1);
int gxBtree::FindLast(DatabaseKeyB &key, int test_tree = 1) 
int gxBtree::FindNext(DatabaseKeyB &key, DatabaseKeyB &compare_key, int test_tree = 1)
int gxBtree::FindPrev(DatabaseKeyB &key, DatabaseKeyB &compare_key,int test_tree = 1)

The "test_tree" variable defaults to true to ensure data integrity during multiple file access.


Type Definitions

gxClassID - Persistent class ID type

gxClassID_t - Native class ID type

gxObjectID - Persistent object ID type

gxObjectID_t - Native object ID type

BtreeNodeOrder_t - Type used for the B-tree node order

BtreeSize_t - Type used to store entry key sizes

BtreeKeyCount_t - Type used for the B-tree node key count

BtreeKeyLocation_t - Type used for B-tree node key locations

Database engine type definitions


B-tree Headers

The B-tree header is a persistent data structure required to store tree parameters from one invocation of the program to the next. Each header is stored in a pre-allocated static storage area within the gxDatabase file. Multiple B-trees can be stored in a single file provided that a separate header exists for every tree.

struct gxBtreeHeader
{ 
  BtreeNodeOrder_t node_order; // B-tree node order
  BtreeSize_t key_size;        // Database key size
  BtreeSize_t n_keys;          // Total number of entry keys
  BtreeSize_t n_nodes;         // Total number of nodes
  BtreeSize_t btree_height;    // Height of this B-tree
  FAU root;                    // B-tree root address  
  gxClassID class_id;          // Optional class ID for B-tree object keys
};


Functions

gxBtree::gxBtree()
gxBtree::~gxBtree()
gxBtree::BtreeHeader()
gxBtree::BtreeHeight()
gxBtree::ClassID()
gxBtree::Close()
gxBtree::CollapseRoot()
gxBtree::Create()
gxBtree::DatabaseExceptionMessage()
gxBtree::Delete()
gxBtree::DeleteDatabaseKey()
gxBtree::Find()
gxBtree::FindFirst()
gxBtree::FindLast()
gxBtree::FindNext()
gxBtree::FindPrev()
gxBtree::Flush()
gxBtree::GetDatabaseError()
gxBtree::GrowNewRoot()
gxBtree::HeaderAddress()
gxBtree::HeaderInSync()
gxBtree::InitBtree()
gxBtree::Insert()
gxBtree::InsertBalance()
gxBtree::InsertDatabaseKey()
gxBtree::InsertLeftRotation()
gxBtree::InsertRightRotation()
gxBtree::IsEmpty()
gxBtree::KeySize()
gxBtree::LazyDelete()
gxBtree::LazyDeleteDatabaseKey()
gxBtree::Merge()
gxBtree::NodeOrder()
gxBtree::NumKeys()
gxBtree::NumNodes()
gxBtree::Open()
gxBtree::ReInit()
gxBtree::ReadBtreeHeader()
gxBtree::ReadNode()
gxBtree::Release()
gxBtree::ResetDatabaseError()
gxBtree::Root()
gxBtree::SetDatabaseError()
gxBtree::TestTree()
gxBtree::TotalNodeSize()
gxBtree::WriteBtreeHeader()
gxBtree::WriteNode()
gxBtree::gxDatabasePtr()

gxBtree::gxBtree(DatabaseKeyB &key_type, BtreeNodeOrder_t order,gxClassID cid = (gxClassID_t)-1) - Class constructor used to set the database entry key type and node order. The class ID is an optional parameter used by applications that require unique class IDs when indexing associated objects.

virtual gxBtree::~gxBtree() - Virtual class destructor responsible for closing the file and flushing any open disk buffers.

gxBtreeHeader *gxBtree::BtreeHeader() - Public member function that returns a pointer to B-tree header.

BtreeSize_t gxBtree::BtreeHeight() - Public member function that returns the height of the B-tree.

gxClassID gxBtree::ClassID() - Public member function that returns the class ID of this B-tree.

gxDatabaseError gxBtree::Close() - Public member function used to close the open B-tree file. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::CollapseRoot() - Protected member function used to collapse the current root node and shorten the tree. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::CollapseRoot(BtreeNode &root_node) - Protected member function used to collapse the specified root node after a read operation and shorten the tree. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::Create(const char *fname, int num_trees = 1) - Public member function used to create a new B-tree file, specifying the number of trees this index will contain. NOTE: This create function only initialize the first tree header and will always connect the tree to the first tree header. Returns zero if successful or a non-zero value to indicate a failure.

const char *gxBtree::DatabaseExceptionMessage() - Public member function that returns a null terminated string, which can be used to log or print the last reported database engine exception.

int gxBtree::Delete(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1) - Public member function used to delete the specified key and balance the B-tree following the deletion. Returns true if successful or false if the key could not be deleted. By default all disk buffers will be flushed following a deletion to maintain synchronization during multiple file access. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class.

gxDatabaseError gxBtree::DeleteDatabaseKey(DatabaseKeyB &key, DatabaseKeyB &compare_key) -Protected non-recursive balanced B-tree delete function used to ensure that all nodes, with the exception of the root node, are at least half full following a deletion. Returns zero if successful or a non-zero value to indicate a failure.

int gxBtree::Find(DatabaseKeyB &key, DatabaseKeyB &compare_key, int test_tree = 1) - Public member function used to find the specified key in the B-tree. Returns true if successful or false if the key could not be found. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class.

int gxBtree::FindFirst(DatabaseKeyB &key, int test_tree = 1) - Public member function used to find the first key in the B-tree. If the "test_tree" variable is true a test is performed to ensure that the file data stays in sync during multiple file access. Returns true if successful or false if the key could not be found.

int gxBtree::FindLast(DatabaseKeyB &key, int test_tree = 1) - Public member function used to find the last key in the B-tree. If the "test_tree" variable is true a test is performed to ensure that the file data stays in sync during multiple file access. Returns true if successful or false if the key could not be found.

int gxBtree::FindNext(DatabaseKeyB &key, DatabaseKeyB &compare_key,int test_tree = 1) - Public member function used to find the next key after the specified key, passing back the key in the "key" variable. If the "test_tree" variable is true a test is performed to ensure that the file data stays in sync during multiple file access. Returns true if successful or false if the key could not be found. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class.

int gxBtree::FindPrev(DatabaseKeyB &key, DatabaseKeyB &compare_key,int test_tree = 1) - Public member function used to find the previous key before the specified key, passing back the key in the "key" variable. If the "test_tree" variable is true a test is performed to ensure that the file data stays in sync during multiple file access. Returns true if successful or false if the key could not be found. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class.

gxDatabaseError gxBtree::Flush() - Public member function used to write the B-tree header and flush any open disk buffers. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::GetDatabaseError() - Public member function the returns an error code representing the last reported database error.

gxDatabaseError gxBtree::GrowNewRoot(DatabaseKeyB &key) - Protected member function used to grow a new root node. Returns zero if successful or a non-zero value to indicate a failure.

FAU gxBtree::HeaderAddress() - Public member function that returns the file address of the B-tree header for this tree.

int gxBtree::HeaderInSync() - Public member function that returns true if the in-memory copy of the B-tree header is in sync with the disk copy.

gxDatabaseError gxBtree::InitBtree(gxDatabase *fptr, int create, FAU header_address) - Public member function used to connect the B-tree to the database engine and write a new B-tree header if the "create" variable is true. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::InitBtree(int create, FAU header_address) - Public member function used connect the B-tree to a header in the open database engine or write a new B-tree header if the "create" variable is true. Returns zero if successful or a non-zero value to indicate a failure.

int gxBtree::Insert(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1) - Public member function used to insert the specified key. By default all disk buffers will be flushed following an insertion to maintain synchronization during multiple file access. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class. Returns true if successful or false if the key could not be inserted.

gxDatabaseError gxBtree::InsertBalance(BtreeNode &parent_node, BtreeKeyLocation_t key_location, DatabaseKeyB &compare_key) - Protected member function used to balance this parent's sibling node after performing a deletion. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::InsertDatabaseKey(DatabaseKeyB &key, DatabaseKeyB &compare_key) - Protected non-recursive insertion function used to insert a key into the B-tree. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::InsertLeftRotation(BtreeNode &parent_node, BtreeKeyLocation_t key_location,DatabaseKeyB &compare_key) - Protected member function used by the gxBtree::InsertBalance() function to perform a left rotation. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::InsertRightRotation(BtreeNode &parent_node, BtreeKeyLocation_t key_location,DatabaseKeyB &compare_key) - Protected member function used by the gxBtree::InsertBalance() function to perform a right rotation. Returns zero if successful or a non-zero value to indicate a failure.

int gxBtree::IsEmpty() - Public member function that returns true if the B-tree is empty.

BtreeSize_t gxBtree::KeySize() - Public member function that returns the current key size.

int gxBtree::LazyDelete(DatabaseKeyB &key, DatabaseKeyB &compare_key, int flush = 1) - Public member function used to delete the specified key without balancing the tree following the deletion. Returns true if successful or false if the key could not be deleted. By default all disk buffers will be flushed following a deletion to maintain synchronization during multiple file access. NOTE: The "DatabaseKeyB& compare_key" is a type cast from the derived key class back to a DatabaseKeyB base type and is required to perform comparisons because the comparison operators are not defined in the DatabaseKeyB base class.

gxDatabaseError gxBtree::LazyDeleteDatabaseKey(DatabaseKeyB &key,DatabaseKeyB &compare_key) - Protected lazy (unbalanced) deletion method implemented specifically to support concurrent database operations that are effected by the excessive node splitting and merging that occurs during tree rotations. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::Merge(BtreeNode &parent_node, BtreeKeyLocation_t key_location,DatabaseKeyB &compare_key) - Protected member function used by the gxBtree::InsertBalance() function to merge the parents left and right siblings and remove the right sibling. Returns zero if successful or a non-zero value to indicate a failure.

BtreeNodeOrder_t gxBtree::NodeOrder() - Public member function that returns the node order for this B-tree

BtreeSize_t gxBtree::NumKeys() - Public member function that returns the total number entry keys stored in this B-tree.

BtreeSize_t gxBtree::NumNodes() - Public member function that returns the total number of B-tree nodes.

gxDatabaseError gxBtree::Open(const char *fname, gxDatabaseAccessMode mode = gxDBASE_READWRITE) - Public member function used to open an existing B-tree file. Returns zero if successful or a non-zero value to indicate a failure. NOTE: This function assumes that there is only one B-tree in this file.

gxDatabaseError gxBtree::Open(const char *fname, FAU header_address,gxDatabaseAccessMode mode = gxDBASE_READWRITE) - Public member function used to open an existing B-tree file, specifying the file address of the B-tree header. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::ReInit(int flush = 0) - Public member function used to reconnect the B-tree header to ensure that the file data stays in sync during multiple file access and multiple instances of the same application. If the "flush" variable is true, the file buffers will be flushed to disk before reconnecting the B-tree. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::ReInit(FAU header_address, int flush = 0) - Public member function used to reconnect the B-tree to a new tree header. If the "flush" variable is true, the file buffers will be flushed to disk before reconnecting the B-tree. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::ReadBtreeHeader() - Public member function used to Read the B-tree header from disk. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::ReadNode(BtreeNode &node, FAUnode_address = gxCurrAddress) - Public member function used to read a B-tree node from the specified file address. Returns zero if successful or a non-zero value to indicate a failure.

void gxBtree::Release() - Public member function used to reset the database file pointer to zero without closing the file or performing any other action. NOTE: This function is used to reset the file pointer when more then one object is referencing this file pointer and the file has been closed or the pointer has been deleted. You can also use the a Release() class to signal that this object is finished with the file and can be deleted without affecting other objects currently using this file.

gxDatabaseError gxBtree::ResetDatabaseError() - Public member function Public member function used to reset the last reported database engine error.

FAU gxBtree::Root() - Public member function that returns the file address of this B-tree's root node.

gxDatabaseError gxBtree::SetDatabaseError(gxDatabaseError err) - Public member function used to set the last reported database engine error. This function is used to inform the database engine of a fatal error condition. Redundantly returns the "err" value to allow this function to be used as a return value.

int gxBtree::TestTree(int reinit = 1) - Public member function used to ensure that the file data stays in sync during multiple file access. If the "reinit" variable is true this function will reconnect the B-tree header. Returns zero if no errors were encountered or an integer value corresponding to one of the following values:

1 = Node order error
2 = Key size error
3 = Number of keys error
4 = Number of nodes error
5 = B-tree height error
6 = B-tree root address error
7 = B-tree class ID error
8 = Error reading the B-tree header

size_t gxBtree::TotalNodeSize() - Public member function that returns the total size of a single Btree node, including overhead.

gxDatabaseError gxBtree::WriteBtreeHeader() - Public member function used to write the B-tree header to disk. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabaseError gxBtree::WriteNode(const BtreeNode &node,FAU node_address = gxCurrAddress) - Public member function used to write a B-tree node to the specified file address. Returns zero if successful or a non-zero value to indicate a failure.

gxDatabase *gxBtree::gxDatabasePtr() - Public member function that returns a pointer to the database engine.


End Of Document