LightDB: B-tree based simple and fast database C/C++ API library
The LightDB is a simple and farst database C/C++ API library based on the B-Tree(not B+-Tree) indexing.
The B-Tree is a self-balancing search tree having multiple nodes that contain multiple records.
All records in B-Tree are stored in sorted order.
For more informations about B-Tree,refer to any other appropriate documentations.
LightDB file can be more than 2GB in size on both 32-bit and 64-bit operating systems.
This software has been tested on Windows-10(32-bit&64-bit) and Linux(32-bit CentOS-5 & 64-bit CentOS-7).
Overview: Record(=Key+Data)
To construct LightDB database file,records must be added to the file.
A record consists of key and data fields.
Record:
The key field is used for sorting and searching.
Every key in the database must be unique. Any record having the same key in the database can not be added.
The data field is for the user,and never used by LightDB in search mechanithm.
Array length and types of key and data are defined by the user.
To keep B-Tree be balanced,restructuring of the tree is sometimes performed after record addition or deletion that is somewhat time-consuming. For this reason,data-length should be as short as possible.
It is sometimes good idea to store the data iteself to other place if it is substantially long,and set it's location to the data field instead of the real data itself with corresponding key,and add them to LightDB database.
Contents:
API Reference
All LightDB API functions return integer result code.
LdbOpen()
int LdbOpen( // opens database file
LDB_HANDLE *ph, // pointer to the handle necessary for all I/O operations done from now on
const char *szPath, // database file path to be opened
char chMode, // open mode
LDB_TYPE keyType, // key type
int keyArraySize, // key array size
KEY_COMP_FUNCTION *pKeyCompFunc, // function pointer for user defined key-compare function
LDB_TYPE dataType, // data type
int dataArraySize, // data array size
int nItem, // maximum number of items in a page(can be zero)
int nCache // maximum number of pages kept in computer memories(can be zero)
)
Notes:
chMode is an open mode character,and must be one of the following characters.
'R' ... opens existing file for read only mode.
'W' ... opens exisitng file for read and write mode.
'N' ... opens/creates non-existing file for read and write mode.
If the file is there,then the open fails.
'T' ... opens non-existing or existing file for read and write mode.
existing file is truncated.
r,w,n,t ... Lower case character is accepted.
nItem is a maximum number of items(keys) can be kept in a page.It must be even and be greater or equal to 8,otherwise it is computed by LightDB.
nCache is a maximum number of pages cached in memory.
nCache or nItem can be zero,in that case,LightDB will decide proper value.
pKeyCompFunc is an address of the key compare function provided by the user.
This is ignored if keyType!=T_UNDEFINED which means to use built-in key compare logic of LightDB.
To use user own key compare function,set keyType==T_UNDEFINED and pKeyCompFunc must be provided.
When keyType==T_UNDEFINED, then keyArraySize must be in bytes.
Note:
Key and Data are arrays of keyType and dataType respectively.
keyArraySize and dataArraySize are array (not byte) sizes of Key and Data respectively.
keyType,keySize,dataType,dataByteSize,and nItem are ignored when existing file is opened.
(values in the file are used instead).
LDB_HANDLE is the structure used in every LightDB API functions at the first argument defined as follows.
typedef struct _LDB_HANDLE
{
U_INT64 LdbID; /* After successful open,the content will be set to "LightDB" */
U_INT64 LdbObject; /* After successful open,C++ LightDB main class pointer is set. */
int LdbStatus; /* LDB API return value. */
int c_errno; /* C/C++ API error code if provided. */
} LDB_HANDLE;
For readability,some integer types are defined as follows.
/* 64 bit integer mainly used for BLOCK number or handles used in LightDB. */
#define S_INT16 short
#define U_INT16 unsigned short
#define S_INT32 int
#define U_INT32 unsigned int
#define S_INT64 long long int
#define U_INT64 unsigned long long int
To inform key and data types to LightDB,use following types
typedef enum
{
T_UBYTE8 = 1, /* 1 ... Unsigned char ( 8-bit) */
T_SBYTE8 = 2, /* 2 ... Signed char ( 8-bit) */
T_USHORT16 = 3, /* 3 ... Unsigned short (16-bit) */
T_SSHORT16 = 4, /* 4 ... Signed short (16-bit) */
T_UINT32 = 5, /* 5 ... Unsigned int (32-bit) */
T_SINT32 = 6, /* 6 ... Signed int (32-bit) */
T_UINT64 = 7, /* 7 ... Unsigned int (64-bit) */
T_SINT64 = 8, /* 8 ... Signed int (64-bit) */
T_FLOAT32 = 9, /* 9 ... Float (32-bit) */
T_DOUBLE64 = 10, /* 10 ... Double (64-bit) */
T_UNDEFINED= 11, /* 11 ... Undefined.
The user must provide key compare function
if this is specified for key type,
the size of T_UNDEFINED key/data must be specified in bytes.
*/
} LDB_TYPE;
All records(keys) are stored in the LightDB database in sorted order according to the key compare logic.
Built-in key compare logic of LightDB is just like the code bellow
int KeyCompare(void *pk1,void *pk2,int c,LDB_HANDLE *ph)
{
Key register *p1 = (Key *)pk1;
Key register *p2 = (Key *)pk2;
for(int register i=0;i<c;++i) {
if (p1[i]>p2[i]) return 1; /* pk1 > pk2 */
else if(p1[i]<p2[i]) return -1; /* pk1 < pk2 */
}
return 0; /* pk1 == pk2 */
}
Key and c is the key type and key array size respectively specified by the user.
Pointers to key and data are passed to LightDB API as void*,and they are casted to proper types when they are actually used.
The user can provide user specific key compare function.
To provide user specific key compare function,use following declaration,create simillar function listed above,and provide it to LdbOpen().
/* Key compare function called in the B-tree search process. */
typedef int (KEY_COMP_FUNCTION)(void *pk1,void *pk2,int cb,LDB_HANDLE *ph);
LDB_HANDLE *ph is the one specified at LdbOpen().
If you need some more info in the user defined key-compare function you can wrap LDB_HANDLE and use it like
typedef struct _MY_HANDLE { // User specific LDB_HANDLE
LDB_HANDLE ldb_handle; // Place the LDB_HANDLE here.
.......... // Place user specific informations from here.
} MY_HANDLE;
.....
int MyKeyCompFunc(void *p1,void *p2,int ck,LDB_HANDLE ph)
{
MY_HANDLE *mh = (MY_HANDLE*)ph;
.......
}
.......
MY_HANDLE mh;
LDB_HANDLE *ph = (LDB_HANDLE *)&mh;
nt e = LdbOpen(ph,"db-file path",'W',
T_UNDEFINED,keyByteSize,(KEY_COMP_FUNCTION *)MyKeyCompFunc,
dataType,dataArraySize,0,0);
LdbClose()
int LdbClose( // close LightDb file
LDB_HANDLE *ph // pointer to the LDB_HANDLE.
)
Notes:
Once opened,LightDB file must be closed.
Before closing the file,LightDB flushes all modified caches to the file.
LdbAddRecord()
int LdbAddRecord( // adds new record (pKey+pData) to the database.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key added.
void *pData // pointer to the data added.
);
Notes:
Add new record to the data base.
If the same key is already in the database,then this function fails.
Array size of pKey and pData must be greater than or equal to the size specified at LdbOpen().
(This array size assumption applies to other I/O functions except stated otherwise.)
To keep B-Tree be balanced,tree restructuring may happen.
LdbGetData()
int LdbGetData( // retrieves the data(pData) paired with the specified key(pKey).
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // user specified key to obtain corresponding data.
void *pData // pointer to the data area retrieved.
);
Notes:
Find the record having the key(pKey) and copy it's data to pData.
LdbChangeData()
int LdbChangeData( // changes the data field having the key(pKey) to pData.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key searched.
void *pData // pointer to the data changed.
);
Notes:
Find the record having the key(pKey) and change it's data to pData.
LdbWriteRecord()
int LdbWriteRecord( // finds the key and change paired data,or add key and data if not found.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key searched.
void *pData // pointer to the data changed or added.
);
Notes:
Find the record having the key(pKey) and change the data to pData.
If the key is not found,then the record specified is added to the database.
LdbDeleteRecord()
int LdbDeleteRecord( // deletes the record having the key specified by pKey.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey // pointer to the key to identify the record deleted.
);
Notes:
Delete the record having the key(pKey) from the data base.
The record pointer is undefined after successful deletion.
To keep B-Tree be balanced,tree restructuring may happen.
LdbGetMinRecord()
int LdbGetMinRecord( // retrieves the record having the minimum key.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // the minimum key retrieved.
void *pData // the data paired with the minimum key.
);
Notes:
Retrieve minimum key record.
LdbGetMaxRecord()
int LdbGetMaxRecord( // retrieves the record having the maximum key.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // the maximum key retrieved.
void *pData // the data paired with the maximum key.
);
Notes:
Retrieve maximum key record.
LdbGetNextRecord()
int LdbGetNextRecord( // returns the next record from the current record pointer in sorted order.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // the next key retrieved.
void *pData // the data paired with the next key.
);
Notes:
Retrieve next record from current record position.
LdbGetNextMinRecord()
int LdbGetNextMinRecord( // retrieves the minimum key or next key.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key retrieved.
void *pData, // pointer to the data retrieved.
int fGetMin // flag. see Notes bellow.
);
Notes:
Retrieve minimum key record if fGetMin != 0,otherwise the same as LdbGetNextRecord().
LdbGetPrevRecord()
int LdbGetPrevRecord( // returns the previous record from the current record pointer in sorted order.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void * pKey, // pointer to the key retrieved.
void *pData // pointer to the data retrieved.
);
Notes:
Retrieve previous record from current record position.
LdbGetPrevMaxRecord()
int LdbGetPrevMaxRecord( // retrieves the maximum key or previous key.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key retrieved.
void *pData, // pointer to the data retrieved.
int fGetMax // flag. see Notes bellow.
);
Notes:
Retrieve maximum key record if fGetMax != 0,otherwise the same as LdbGetPrevRecord().
LdbGetCurRecord()
int LdbGetCurRecord( // retieves the record at the current record pointer.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pKey, // pointer to the key retrieved.
void *pData // pointer to the data retrieved.
);
Notes:
Get record at the current record position.
LdbChangeCurData()
int LdbChangeCurData( // changes data at the current record pointer.
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
void *pData // pointer to the data to change.
);
Notes:
Change the current data.
LdbCompareKeys()
int LdbCompareKeys( // cpmpares two keys and returns the result(0 if pKey1==pKey2,1 if pKey1 > pKey2,-1 if pKey1 < pKey2).
LDB_HANDLE *ph, // pointer to the LDB_HANDLE.
int *pf, // pointer to the return value.
void *pKey1, // pointer to the first key.
void *pKey2 // pointer to the second key.
);
Notes:
Compares 2 keys and returns result to *pf.
For key-compare logic,see Key-compare logic.
LdbFlush()
int LdbFlush( // flushes every modified block back to the LightDb file opened.
LDB_HANDLE *ph // LightDB handle.
);
Notes:
Number of blocks kept in memories is specified at LdbOpen().
LdbVerifyContents()
int LdbVerifyContents( // checks the validity of the database(but does not repair anything bad).
LDB_HANDLE *ph, // LightDB handle.
int detail // output level.
);
Notes:
Check the validity of the LightDb file.
detail ... 0: No output except for Warning or Error messages.
1: Print header informations.
>1: More informations.
LdbUserAreaIO()
int LdbUserAreaIO( // reads or writes user information from or to the control area.
LDB_HANDLE *ph, // LightDB handle.
void *pData, // pointer to the user information area.
int cb, // number of bytes to read or write.
char chIo // I/O flag.
);
Notes:
User control area I/O function.
chIo ... 'W' or 'w' : write cb bytes of *pData to user area.
'R' or 'r' : read cb byte to *pData from user area.
The maximum value of cb available can be obtained by calling LdbGetInfo().
LdbGetInfo()
int LdbGetInfo( // obtains LightDB header informations.
LDB_HANDLE *ph, // LightDB handle.
LDB_INFO *pInfo // pointer to the area to receive informations.
);
Notes:
Get database informations to the struct LDB_INFO listed bellow.
If pInfo is NULL,then infomations are printed to the console.
typedef struct _LDB_INFO {
U_INT64 Version; /* Version number */
U_INT64 TotalPages; /* Total pages in the data base. */
U_INT64 RootPage; /* Root page's block number */
U_INT64 DeletedPage; /* Top of the deleted page chain */
unsigned int PageByteSize; /* Page size in bytes */
unsigned int KeyByteSize; /* Key size in bytes */
unsigned int DataByteSize; /* Data size in bytes */
unsigned int UserAreaByteSize; /* User area size in bytes */
unsigned int MaxItems; /* Max numer of items in a page. */
unsigned int CachedPages; /* Number of pages(blocks) cached */
LDB_TYPE KeyType; /* Key type. */
LDB_TYPE DataType; /* Data type. */
} LDB_INFO;
LdbGetMsg()
char *LdbGetMsg( // returns the pointer to the string describing the error/warning code e.
int e // e must be one of the code listed bellow
);
Notes:
Returns pointer to the string describing the meaning of return code of e,
where:
Negative values mean fatal errors.
Positive values(warnings) are not always fatal.
Most LightDB API returns value 0 on success(normal case), or returns one of the value listed bellow. The returned value is also stored in LDB_HANDLE member(LdbStatus).
c_error member of LDB_HANDLE struct may have more information(through strerror(c_error))
Error/Warning codes
error code | value | description |
ERROR_BAD_BLOCK_NUMBER | -1 | Error: bad block number specified. |
ERROR_BAD_ITEM_NUMBER | -2 | Error: bad item number specified. |
ERROR_DELETED_BLOCK | -3 | Error: deleted block number specified. |
ERROR_MEMORY_ALLOC | -10 | Error: Memory allocation error. |
ERROR_FILE_DUPLICATED_OPEN | -20 | Error: Already opened(Close() before Open()). |
ERROR_FILE_OPEN | -30 | Error: Failed to open specified file. |
ERROR_FILE_OPEN_MODE | -40 | Error: Invalid open mode. |
ERROR_FILE_BAD_BLOCKSIZE | -50 | Error: Invalid block size. |
ERROR_FILE_SEEK | -60 | Error: File seek failed(System I/O error). |
ERROR_FILE_READ | -70 | Error: File read failed(System I/O error). |
ERROR_FILE_WRITE | -80 | Error: File write failed(System I/O error). |
ERROR_FILE_ACCESS_DENIED | -90 | Error: I/O error or write on read only file. |
ERROR_CACHE_TOO_SMALL | -100 | Error: Cache size specified too small. |
ERROR_BUFFER_OVERRUN | -110 | Error: Buffer overrun. |
ERROR_KEY_SIZE | -120 | Error: Bad key size. |
ERROR_KEY_TYPE | -130 | Error: Bad key type. |
ERROR_DATA_SIZE | -140 | Error: Bad data size. |
ERROR_DATA_TYPE | -150 | Error: Bad data type. |
ERROR_ITEMS_NOT_EVEN | -160 | Error: Number of items in a page must be even. |
ERROR_BAD_PAGE_NO | -170 | Error: Page number must not be zero. |
ERROR_BAD_ARGUMENT | -180 | Error: Bad or unacceptable argument specified. |
ERROR_BROKEN_FILE | -200 | Error: File is broken or non lightdb file. |
ERROR_BAD_PTR | -210 | Error: Bad pointer(including NULL). |
ERROR_SYSTEM | -220 | Error: System mulfunction. |
warning code | value | description |
SAME_KEY_EXISTS | 10 | Warning: Same key alredy in the data base. |
KEY_NOT_FOUND | 20 | Warning: Specified key not in the data base. |
NO_MORE_KEY | 30 | Warning: No more keys specified in the data base. |
NO_KEYS | 40 | Warning: Specified keys not in the data base. |
NO_KEY_SELECTED | 50 | Warning: No key selected. |
Record pointer
As every record is stored in certain page at certain position in sorted order,
any record can directly be accessed by specifying the proper position and the page number holding it.
LightDB internally keeps both the record position and the page number called as 'record pointer' which is the location at the key-lookup ends.
After successful key search,the record pointer is located at the position of the key searched.
After unsuccessful key search,the record pointer is located at the position where it is inserted if addition is performed.
The record pointer consists of both the page number(64-bit integer) and the record position(32-bit integer).
Note that successful record deletion makes the record pointer be undefined(the page number is zero).
Also,care must be paid that successful addition or deletion of a record affects other records' position due to keep the B-Tree balanced by restructuring.
Low level I/O functions
LightDB offers the record pointer,and low level I/O functions to access any record without searching by specifying the record pointer.
LdbGetCurrentPTR()
int LdbGetCurrentPTR( // returns current record pointer.
LDB_HANDLE *ph, // pointer to the LightDB handle necessary for all I/O operations.
U_INT64 *piPage, // pointer to the current page number returned.
int *piRecord // pointer to the record position returned.
);
Notes:
*piPage==0 means that the current record pointer is undefined.
The record pointer is undefined(Page number of it is 0) after successful record deletion or just after open.
LdbGetRecord()
int LdbGetRecord( // directly reads record by specifying record pointer(iPage,iRecord).
LDB_HANDLE *ph, // pointer to the LightDB handle necessary for all I/O operations.
U_INT64 iPage, // page number to read record.
int iRecord, // record position in the page iPage.
void *pKey, // pointer to the key field read.
void *pData // pointer to the data field read.
);
Notes:
The record pointer(iPage,iRecord) can be obtained by LdbGetCurrentPTR().
LdbSetData()
int LdbSetData( // sets/changes the data field to pData at the iRecord of the page iPage.
LDB_HANDLE *ph, // LightDB handle.
U_INT64 iPage, // the page number to change the data.
int iRecord, // the record position of which data is changed in the page iPage.
void *pData // pointer to the data to change.
);
Notes:
For further details about the record pointer,see About record pointer.
LdbGetRootPage()
int LdbGetRootPage( // returns root page's page number.
LDB_HANDLE *ph, // pointer to the LightDB handle.
U_INT64 *piPage // pointer to the page number returned.
);
Notes:
If returned value (*piPage) is zero,then this means the database empty.
LdbGetRecordCount()
int LdbGetRecordCount( // returns number of records stored in the page specified by iPage.
LDB_HANDLE *ph, // LightDB handle.
U_INT64 iPage, // page number to obtain record count.
int *pc // pointer to the record count returned.
);
Notes:
The maximum number of record(=mr) is specified or decided by LightDB at LbbOpen().
Record count returned(=nr) must be mr/2 <= nr <=mr.
LdbGetRightPage()
int LdbGetChildPage( // returns the right child page of the iRecord in iPage.
LDB_HANDLE *ph, // LightDB handle.
U_INT64 iPage, // page number having iRecord.
int iRecord, // record position having right child page.
U_INT64 *pChildPage // the right page's page number received.
);
Notes:
iRecord can be -1 for the left child page.
*pChildPage==0 if the iPage is leaf.
LightDB database file structure
LightDB file consists of equally sized blocks.
LightDB file:
block[0] | block[1] | .......... | block[nb-1] |
LightDB file is devided into equally sized blocks.
LightDB file size = (block size)*nb.
LightDB I/O is done per each block.
Blocks are cached in memory for performance reason.
where:
- nb is the total number of blocks in the LightDB file.
- Every block(page) is identified by 64-bit long integer called block number(0 - nb-1)
- block[0] contains database and user specified informations.
- block[1] - block[nb-1] are B-Tree nodes(active) or deleted block(node).
- B-Tree nodes(pages) consist of one root node,non leaf nodes,and leaf nodes.
- B-Tree has only one root node.
- Any key search begins fron the root page.
- Pages(including root page) having child pages are called non-leaf.
- Pages having no children are called leaf.
In this documentation,block,node and page are virtually the same.
Block[0]
Block[0]:
LightDB information area | User specific area |
Block 0,the top block of the database file, is the block for database informations.
Informations such as key type,key length,data type,data length,block number of the root page,...etc.
User specific informations can also be saved in this block.
LightDB information area: //
// The first part of the block 0 is the BLOCK_HEADER followed by the PAGE_HEADER.
//
typedef struct _BLOCK_HEADER {
U_INT64 ID; // == "LightDB\0",ASCII string.
U_INT64 Version; // Currently 2.
U_INT64 TotalBocks; // Total number of blocks in the data base.
U_INT64 TopDeleted; // Top of the deleted page link. 0 is the final block deleted.
// The highest bit of the top 8-byte of any deleted block is set to 1
// (Page number | 0x8000000000000000LL).
U_INT64 BlockByteSize; // The size of a block in bytes.
U_INT64 DeletedBlocks; // Total number of the deleted blocks.
} BLOCK_HEADER;
typedef struct _PAGE_HEADER {
U_INT64 RootPage; // Root page's block number.
U_INT64 TotalRecords; // Total number of records in the data base.
S_INT32 KeyType; // Key type.
U_INT32 KeyByteSize; // Key length in bytes.
S_INT32 DataType; // Data type.
U_INT32 DataByteSize; // Data length in bytes.
S_INT32 MaxItems; // Max numer of items(records) in a page.
U_INT32 Dummy32; // reserved for future use(from here to the user specific area is reserved.)
} PAGE_HEADER;
User specific area:
User specific area ends at the end of the block 0,but the size is defined at LdbOpen() stage.
Refer to LdbGetInfo() for size and LdbUserAreaIO() for I/O respectibly.
Block[1] - [nb-1]
From block[1] to block[nb-1] are B-Tree pages having multiple records(keys+data) or empty pages all of which records are deleted.
All active pages are linked each other to construct B-Tree structure.
All deleted(empty) pages are also linked as deleted block chain.
Page
Every active page in the B-Tree has the pointer(=block number) to its parent.
But the root page has no parent(=parent block number is 0).
All records associated with their right child pointer(=Item) in a page are stored in sorted order.
A Record with it's right child pointer is called an item.
A page can have mi items at most defined at LdbOpen().
Active page:
Page number of this page |
Parent page number |
left child page number |
ni(64-bit) |
Item[0] |
Item[1] |
...... |
Item[ni-1] |
- mi/2 <= ni <= mi, where ni is the number of keys the page currently has(the root page can have items less than mi/2).
- mi,specified by the user or defined by LightDB, must be an even number greater than or eqaul to 8.
- The key of item[i] < All right child's keys < the key of item[i+1].
- All keys in the left child page < the key in item[0].
- Any page can have ni+1(left child + ni right children) children.
- All child page numbers in any leaf page are zero.
Deleted(empty) page:
Block number to the next deleted page |
undefined ...... |
- Deleted block is reused before extending the file size when new record is added.
- The highest bit of the deleted block number is set to 1 (Page number | 0x8000000000000000LL).
- The value of the next deleted page at the final block chain is zero.
- The top block of the deleted block chain is in BLOCK_HEADER.
Item[i]:
right child page number | record(key[i]+data) |
- key[i] < right child page's keys < key[i+1]
: where key[i] is the key in the item[i].
Shigeo Kobayashi 2018-10-25