Tkrzw
File database manager implementation based on B+ tree. More...
#include <tkrzw_dbm_tree.h>
Classes | |
class | Iterator |
Iterator for each record. More... | |
struct | TuningParameters |
Tuning parameters for the database. More... | |
Public Types | |
enum | PageUpdateMode : int32_t { PAGE_UPDATE_DEFAULT = 0 , PAGE_UPDATE_NONE = 1 , PAGE_UPDATE_WRITE = 2 } |
Enumeration for page update modes. More... | |
Public Types inherited from tkrzw::DBM | |
typedef std::function< std::string_view(std::string_view, std::string_view)> | RecordLambdaType |
Lambda function type to process a record. More... | |
Public Member Functions | |
TreeDBM () | |
Default constructor. More... | |
TreeDBM (std::unique_ptr< File > file) | |
Constructor with a file object. More... | |
~TreeDBM () | |
Destructor. More... | |
TreeDBM (const TreeDBM &rhs)=delete | |
Copy and assignment are disabled. More... | |
TreeDBM & | operator= (const TreeDBM &rhs)=delete |
Status | Open (const std::string &path, bool writable, int32_t options=File::OPEN_DEFAULT) override |
Opens a database file. More... | |
Status | OpenAdvanced (const std::string &path, bool writable, int32_t options=File::OPEN_DEFAULT, const TuningParameters &tuning_params=TuningParameters()) |
Opens a database file, in an advanced way. More... | |
Status | Close () override |
Closes the database file. More... | |
Status | Process (std::string_view key, RecordProcessor *proc, bool writable) override |
Processes a record with a processor. More... | |
Status | ProcessMulti (const std::vector< std::pair< std::string_view, DBM::RecordProcessor * >> &key_proc_pairs, bool writable) override |
Processes multiple records with processors. More... | |
Status | ProcessFirst (RecordProcessor *proc, bool writable) override |
Processes the first record with a processor. More... | |
Status | ProcessEach (RecordProcessor *proc, bool writable) override |
Processes each and every record in the database with a processor. More... | |
Status | Count (int64_t *count) override |
Gets the number of records. More... | |
Status | GetFileSize (int64_t *size) override |
Gets the current file size of the database. More... | |
Status | GetTimestamp (double *timestamp) override |
Gets the timestamp in seconds of the last modified time. More... | |
Status | GetFilePath (std::string *path) override |
Gets the path of the database file. More... | |
Status | Clear () override |
Removes all records. More... | |
Status | Rebuild () override |
Rebuilds the entire database. More... | |
Status | RebuildAdvanced (const TuningParameters &tuning_params=TuningParameters(), bool skip_broken_records=false, bool sync_hard=false) |
Rebuilds the entire database, in an advanced way. More... | |
Status | ShouldBeRebuilt (bool *tobe) override |
Checks whether the database should be rebuilt. More... | |
Status | Synchronize (bool hard, FileProcessor *proc=nullptr) override |
Synchronizes the content of the database to the file system. More... | |
std::vector< std::pair< std::string, std::string > > | Inspect () override |
Inspects the database. More... | |
bool | IsOpen () const override |
Checks whether the database is open. More... | |
bool | IsWritable () const override |
Checks whether the database is writable. More... | |
bool | IsHealthy () const override |
Checks whether the database condition is healthy. More... | |
bool | IsAutoRestored () const |
Checks whether the database has been restored automatically. More... | |
bool | IsOrdered () const override |
Checks whether ordered operations are supported. More... | |
std::unique_ptr< DBM::Iterator > | MakeIterator () override |
Makes an iterator for each record. More... | |
std::unique_ptr< DBM > | MakeDBM () const override |
Makes a new DBM object of the same concrete class. More... | |
UpdateLogger * | GetUpdateLogger () const override |
Gets the logger to write all update operations. More... | |
void | SetUpdateLogger (UpdateLogger *update_logger) override |
Sets the logger to write all update operations. More... | |
File * | GetInternalFile () const |
Gets the pointer to the internal file object. More... | |
int64_t | GetEffectiveDataSize () |
Gets the effective data size. More... | |
int32_t | GetDatabaseType () |
Gets the database type. More... | |
Status | SetDatabaseType (uint32_t db_type) |
Sets the database type. More... | |
std::string | GetOpaqueMetadata () |
Gets the opaque metadata. More... | |
Status | SetOpaqueMetadata (const std::string &opaque) |
Sets the opaque metadata. More... | |
KeyComparator | GetKeyComparator () const |
Gets the comparator of record keys. More... | |
Status | ValidateHashBuckets () |
Validates all buckets in the hash table. More... | |
Status | ValidateRecords (int64_t record_base, int64_t end_offset) |
Validates records in a region. More... | |
Public Member Functions inherited from tkrzw::DBM | |
virtual | ~DBM ()=default |
Destructor. More... | |
virtual Status | Process (std::string_view key, RecordLambdaType rec_lambda, bool writable) |
Processes a record with a lambda function. More... | |
virtual Status | Get (std::string_view key, std::string *value=nullptr) |
Gets the value of a record of a key. More... | |
virtual std::string | GetSimple (std::string_view key, std::string_view default_value="") |
Gets the value of a record of a key, in a simple way. More... | |
virtual Status | GetMulti (const std::vector< std::string_view > &keys, std::map< std::string, std::string > *records) |
Gets the values of multiple records of keys, with a string view vector. More... | |
virtual Status | GetMulti (const std::initializer_list< std::string_view > &keys, std::map< std::string, std::string > *records) |
Gets the values of multiple records of keys, with an initializer list. More... | |
virtual Status | GetMulti (const std::vector< std::string > &keys, std::map< std::string, std::string > *records) |
Gets the values of multiple records of keys, with a string vector. More... | |
virtual Status | Set (std::string_view key, std::string_view value, bool overwrite=true, std::string *old_value=nullptr) |
Sets a record of a key and a value. More... | |
virtual Status | SetMulti (const std::map< std::string_view, std::string_view > &records, bool overwrite=true) |
Sets multiple records, with a map of string views. More... | |
virtual Status | SetMulti (const std::initializer_list< std::pair< std::string_view, std::string_view >> &records, bool overwrite=true) |
Sets multiple records, with an initializer list. More... | |
virtual Status | SetMulti (const std::map< std::string, std::string > &records, bool overwrite=true) |
Sets multiple records, with a map of strings. More... | |
virtual Status | Remove (std::string_view key, std::string *old_value=nullptr) |
Removes a record of a key. More... | |
virtual Status | RemoveMulti (const std::vector< std::string_view > &keys) |
Removes records of keys, with a string view vector. More... | |
virtual Status | RemoveMulti (const std::initializer_list< std::string_view > &keys) |
Removes records of keys, with an initializer list. More... | |
virtual Status | RemoveMulti (const std::vector< std::string > &keys) |
Removes records of keys, with a string vector. More... | |
virtual Status | Append (std::string_view key, std::string_view value, std::string_view delim="") |
Appends data at the end of a record of a key. More... | |
virtual Status | AppendMulti (const std::map< std::string_view, std::string_view > &records, std::string_view delim="") |
Appends data to multiple records, with a map of string views. More... | |
virtual Status | AppendMulti (const std::initializer_list< std::pair< std::string_view, std::string_view >> &records, std::string_view delim="") |
Appends data to multiple records, with an initializer list. More... | |
virtual Status | AppendMulti (const std::map< std::string, std::string > &records, std::string_view delim="") |
Appends data to multiple records, with a map of strings. More... | |
virtual Status | CompareExchange (std::string_view key, std::string_view expected, std::string_view desired, std::string *actual=nullptr, bool *found=nullptr) |
Compares the value of a record and exchanges it if the condition is met. More... | |
virtual Status | Increment (std::string_view key, int64_t increment=1, int64_t *current=nullptr, int64_t initial=0) |
Increments the numeric value of a record. More... | |
virtual int64_t | IncrementSimple (std::string_view key, int64_t increment=1, int64_t initial=0) |
Increments the numeric value of a record, in a simple way. More... | |
virtual Status | ProcessMulti (const std::vector< std::pair< std::string_view, RecordProcessor * >> &key_proc_pairs, bool writable)=0 |
Processes multiple records with processors. More... | |
virtual Status | ProcessMulti (const std::vector< std::pair< std::string_view, RecordLambdaType >> &key_lambda_pairs, bool writable) |
Processes multiple records with lambda functions. More... | |
virtual Status | CompareExchangeMulti (const std::vector< std::pair< std::string_view, std::string_view >> &expected, const std::vector< std::pair< std::string_view, std::string_view >> &desired) |
Compares the values of records and exchanges them if the condition is met. More... | |
virtual Status | Rekey (std::string_view old_key, std::string_view new_key, bool overwrite=true, bool copying=false, std::string *value=nullptr) |
Changes the key of a record. More... | |
virtual Status | ProcessFirst (RecordLambdaType rec_lambda, bool writable) |
Processes the first record with a lambda function. More... | |
virtual Status | PopFirst (std::string *key=nullptr, std::string *value=nullptr) |
Gets the first record and removes it. More... | |
virtual Status | PushLast (std::string_view value, double wtime=-1, std::string *key=nullptr) |
Adds a record with a key of the current timestamp. More... | |
virtual Status | ProcessEach (RecordLambdaType rec_lambda, bool writable) |
Processes each and every record in the database with a lambda function. More... | |
virtual int64_t | CountSimple () |
Gets the number of records, in a simple way. More... | |
virtual int64_t | GetFileSizeSimple () |
Gets the current file size of the database, in a simple way. More... | |
virtual std::string | GetFilePathSimple () |
Gets the path of the database file, in a simple way. More... | |
virtual double | GetTimestampSimple () |
Gets the timestamp of the last modified time, in a simple way. More... | |
virtual bool | ShouldBeRebuiltSimple () |
Checks whether the database should be rebuilt, in a simple way. More... | |
virtual Status | CopyFileData (const std::string &dest_path, bool sync_hard=false) |
Copies the content of the database file to another file. More... | |
virtual Status | Export (DBM *dest_dbm) |
Exports all records to another database. More... | |
const std::type_info & | GetType () const |
Gets the type information of the actual class. More... | |
Static Public Member Functions | |
static Status | ParseMetadata (std::string_view opaque, int64_t *num_records, int64_t *eff_data_size, int64_t *root_id, int64_t *first_id, int64_t *last_id, int64_t *num_leaf_nodes, int64_t *num_inner_nodes, int32_t *max_page_size, int32_t *max_branches, int32_t *tree_level, int32_t *key_comp_type, std::string *mini_opaque) |
Parses metadata on an opaque data sequence. More... | |
static Status | RestoreDatabase (const std::string &old_file_path, const std::string &new_file_path, int64_t end_offset, std::string_view cipher_key="") |
Restores a broken database as a new healthy database. More... | |
Static Public Attributes | |
static constexpr int32_t | DEFAULT_OFFSET_WIDTH = 4 |
The default value of the offset width. More... | |
static constexpr int32_t | DEFAULT_ALIGN_POW = 10 |
The default value of the alignment power. More... | |
static constexpr int64_t | DEFAULT_NUM_BUCKETS = 131101 |
The default value of the number of buckets. More... | |
static constexpr int32_t | DEFAULT_FBP_CAPACITY = 2048 |
The default value of the capacity of the free block pool. More... | |
static constexpr int32_t | DEFAULT_MAX_PAGE_SIZE = 8130 |
The default value of the max page size. More... | |
static constexpr int32_t | DEFAULT_MAX_BRANCHES = 256 |
The default value of the max branches. More... | |
static constexpr int32_t | DEFAULT_MAX_CACHED_PAGES = 10000 |
The default value of the maximum number of cached pages. More... | |
static constexpr int32_t | OPAQUE_METADATA_SIZE = 10 |
The size of the opaque metadata. More... | |
Static Public Attributes inherited from tkrzw::DBM | |
static const std::string_view | ANY_DATA |
The special string_view value to represent any data. More... | |
File database manager implementation based on B+ tree.
All operations are thread-safe; multiple threads can access the same database concurrently. Every opened database must be closed explicitly to avoid data corruption.
enum tkrzw::TreeDBM::PageUpdateMode : int32_t |
tkrzw::TreeDBM::TreeDBM | ( | ) |
Default constructor.
MemoryMapParallelFile is used to handle the data.
tkrzw::TreeDBM::TreeDBM | ( | std::unique_ptr< File > | file | ) |
explicit
Constructor with a file object.
file | The file object to handle the data. The ownership is taken. |
tkrzw::TreeDBM::~TreeDBM | ( | ) |
Destructor.
tkrzw::TreeDBM::TreeDBM | ( | const TreeDBM & | rhs | ) |
explicit delete
Copy and assignment are disabled.
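Status tkrzw::TreeDBM::Open | ( | const std::string & | path, |
bool | writable, | ||
int32_t | options = File::OPEN_DEFAULT |
||
) |
override virtual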
Opens a database file.
path | A path of the file. |
writable | If true, the file is writable. If false, it is read-only. |
options | Bit-sum options of File::OpenOption enums for opening the file. |
Precondition: The database is not opened.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::OpenAdvanced | ( | const std::string & | path, |
bool | writable, | ||
int32_t | options = File::OPEN_DEFAULT , |
||
const TuningParameters & | tuning_params = TuningParameters() |
||
) |
Opens a database file, in an advanced way.
path | A path of the file. |
writable | If true, the file is writable. If false, it is read-only. |
options | Bit-sum options for opening the file. |
tuning_params | A structure for tuning parameters. |
Precondition: The database is not opened.
Status tkrzw::TreeDBM::Close | ( | ) |
override virtual
Closes the database file.
Precondition: The database is opened.
Implements tkrzw::DBM.
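Status tkrzw::TreeDBM::Process | ( | std::string_view | key, |
RecordProcessor * | proc, | ||
bool | writable | ||
) |
override virtual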
Processes a record with a processor.
key | The key of the record. |
proc | The pointer to the processor object. |
writable | True if the processor can edit the record. |
Precondition: The database is opened. The writable parameter should be consistent with the open mode.
If the specified record exists, the ProcessFull of the processor is called. Otherwise, the ProcessEmpty of the processor is called.
Implements tkrzw::DBM.
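The dispatch rule described for Process can be sketched with a small stand-in model. Everything in the `sketch` namespace below is hypothetical and written only for illustration: it uses a `std::map` in place of the database and a simplified processor interface, and it is not the Tkrzw implementation or its RecordProcessor class.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

namespace sketch {

// Stand-in processor interface (hypothetical; not tkrzw::DBM::RecordProcessor).
class RecordProcessor {
 public:
  virtual ~RecordProcessor() = default;
  // Called when the key has a record.
  virtual void ProcessFull(std::string_view key, std::string_view value) = 0;
  // Called when the key has no record.
  virtual void ProcessEmpty(std::string_view key) = 0;
};

// A std::map stands in for the database; std::less<> allows string_view lookups.
using Records = std::map<std::string, std::string, std::less<>>;

// Calls exactly one callback, depending on whether the key exists.
inline void Process(const Records& records, std::string_view key,
                    RecordProcessor* proc) {
  const auto it = records.find(key);
  if (it != records.end()) {
    proc->ProcessFull(it->first, it->second);
  } else {
    proc->ProcessEmpty(key);
  }
}

// Example processor that logs which callback ran.
class Tracer : public RecordProcessor {
 public:
  void ProcessFull(std::string_view key, std::string_view value) override {
    log += "full:" + std::string(key) + "=" + std::string(value) + ";";
  }
  void ProcessEmpty(std::string_view key) override {
    log += "empty:" + std::string(key) + ";";
  }
  std::string log;
};

}  // namespace sketch
```

Calling `sketch::Process` once with an existing key and once with a missing key leaves one "full" entry and one "empty" entry in the tracer's log, in that order.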
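Status tkrzw::TreeDBM::ProcessMulti | ( | const std::vector< std::pair< std::string_view, DBM::RecordProcessor * >> & | key_proc_pairs, |
bool | writable | ||
) |
override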
Processes multiple records with processors.
key_proc_pairs | Pairs of the keys and their processor objects. |
writable | True if the processors can edit the records. |
Precondition: The database is opened. The writable parameter should be consistent with the open mode.
If the specified record exists, the ProcessFull of the processor is called. Otherwise, the ProcessEmpty of the processor is called.
Status tkrzw::TreeDBM::ProcessFirst | ( | RecordProcessor * | proc, |
bool | writable | ||
) |
override virtual
Processes the first record with a processor.
proc | The pointer to the processor object. |
writable | True if the processor can edit the record. |
Precondition: The database is opened. The writable parameter should be consistent with the open mode.
If the first record exists, the ProcessFull of the processor is called. Otherwise, this method fails and no method of the processor is called. The first record has the lowest key of all.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::ProcessEach | ( | RecordProcessor * | proc, |
bool | writable | ||
) |
override virtual
Processes each and every record in the database with a processor.
proc | The pointer to the processor object. |
writable | True if the processor can edit the record. |
Precondition: The database is opened. The writable parameter should be consistent with the open mode.
The ProcessFull of the processor is called repeatedly for each record. The ProcessEmpty of the processor is called once before the iteration and once after the iteration.
Implements tkrzw::DBM.
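The call order described for ProcessEach (one ProcessEmpty before the iteration, one ProcessFull per record, one ProcessEmpty after) can be sketched as follows. The `sketch` namespace, the simplified processor interface, and the empty key passed to ProcessEmpty are all assumptions made for illustration; this is a model over `std::map`, not the Tkrzw code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <string_view>
#include <vector>

namespace sketch {

// Stand-in processor interface (hypothetical; not tkrzw::DBM::RecordProcessor).
class RecordProcessor {
 public:
  virtual ~RecordProcessor() = default;
  virtual void ProcessFull(std::string_view key, std::string_view value) = 0;
  virtual void ProcessEmpty(std::string_view key) = 0;
};

// Visits every record in key order, bracketed by two ProcessEmpty calls.
// Assumption: the bracket calls receive an empty key.
inline void ProcessEach(const std::map<std::string, std::string>& records,
                        RecordProcessor* proc) {
  proc->ProcessEmpty("");  // once before the iteration
  for (const auto& [key, value] : records) {
    proc->ProcessFull(key, value);  // once per record
  }
  proc->ProcessEmpty("");  // once after the iteration
}

// Example processor that records the sequence of callbacks.
class EventRecorder : public RecordProcessor {
 public:
  void ProcessFull(std::string_view key, std::string_view) override {
    events.push_back("full:" + std::string(key));
  }
  void ProcessEmpty(std::string_view) override { events.push_back("empty"); }
  std::vector<std::string> events;
};

}  // namespace sketch
```

A processor can use the two bracket calls to set up and tear down per-scan state, for example opening and flushing an output buffer.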
Status tkrzw::TreeDBM::Count | ( | int64_t * | count | ) |
override virtual
Gets the number of records.
count | The pointer to an integer object to contain the result count. |
Precondition: The database is opened.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::GetFileSize | ( | int64_t * | size | ) |
override virtual
Gets the current file size of the database.
size | The pointer to an integer object to contain the result size. |
Precondition: The database is opened.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::GetTimestamp | ( | double * | timestamp | ) |
override virtual
Gets the timestamp in seconds of the last modified time.
timestamp | The pointer to a double object to contain the timestamp. |
Precondition: The database is opened.
The timestamp is updated when a database opened in the writable mode is closed or synchronized, even if no update operation has been done.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::GetFilePath | ( | std::string * | path | ) |
override virtual
Gets the path of the database file.
path | The pointer to a string object to contain the result path. |
Precondition: The database is opened.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::Clear | ( | ) |
override virtual
Removes all records.
Precondition: The database is opened as writable.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::Rebuild | ( | ) |
override virtual
Rebuilds the entire database.
Precondition: The database is opened as writable.
Rebuilding a database is useful to reduce the size of the file by resolving fragmentation. All tuning parameters are inherited or calculated implicitly.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::RebuildAdvanced | ( | const TuningParameters & | tuning_params = TuningParameters() , |
bool | skip_broken_records = false , |
||
bool | sync_hard = false |
||
) |
Rebuilds the entire database, in an advanced way.
tuning_params | A structure for tuning parameters. The default value of each parameter means that the current setting is inherited or calculated implicitly. |
skip_broken_records | If true, the operation continues even if there are broken records which can be skipped. |
sync_hard | True to do physical synchronization with the hardware before finishing the rebuilt file. |
Precondition: The database is opened as writable.
Rebuilding a database is useful to reduce the size of the file by resolving fragmentation. Tuning parameters for the underlying hash database are applied to the rebuilt file immediately. Tuning parameters for the B+ tree take effect gradually as the database is updated afterward. The comparator of record keys cannot be changed.
Status tkrzw::TreeDBM::ShouldBeRebuilt | ( | bool * | tobe | ) |
override virtual
Checks whether the database should be rebuilt.
tobe | The pointer to a boolean object to contain the result decision. |
Precondition: The database is opened.
Implements tkrzw::DBM.
Status tkrzw::TreeDBM::Synchronize | ( | bool | hard, |
FileProcessor * | proc = nullptr |
||
) |
override virtual
Synchronizes the content of the database to the file system.
hard | True to do physical synchronization with the hardware or false to do only logical synchronization with the file system. |
proc | The pointer to the file processor object, whose Process method is called while the content of the file is synchronized. If it is nullptr, it is ignored. |
Precondition: The database is opened as writable.
Implements tkrzw::DBM.
std::vector< std::pair< std::string, std::string > > tkrzw::TreeDBM::Inspect | ( | ) |
override virtual
Inspects the database.
Implements tkrzw::DBM.
bool tkrzw::TreeDBM::IsOpen | ( | ) | const
override virtual
Checks whether the database is open.
Implements tkrzw::DBM.
bool tkrzw::TreeDBM::IsWritable | ( | ) | const
override virtual
Checks whether the database is writable.
Implements tkrzw::DBM.
bool tkrzw::TreeDBM::IsHealthy | ( | ) | const
override virtual
Checks whether the database condition is healthy.
Precondition: The database is opened.
Implements tkrzw::DBM.
bool tkrzw::TreeDBM::IsAutoRestored | ( | ) | const |
Checks whether the database has been restored automatically.
Precondition: The database is opened.
bool tkrzw::TreeDBM::IsOrdered | ( | ) | const
override virtual
Checks whether ordered operations are supported.
Implements tkrzw::DBM.
std::unique_ptr< DBM::Iterator > tkrzw::TreeDBM::MakeIterator | ( | ) |
override virtual
Makes an iterator for each record.
Precondition: The database is opened.
Implements tkrzw::DBM.
std::unique_ptr< DBM > tkrzw::TreeDBM::MakeDBM | ( | ) | const
override virtual
Makes a new DBM object of the same concrete class.
Implements tkrzw::DBM.
UpdateLogger * tkrzw::TreeDBM::GetUpdateLogger | ( | ) | const
override virtual
Gets the logger to write all update operations.
Implements tkrzw::DBM.
void tkrzw::TreeDBM::SetUpdateLogger | ( | UpdateLogger * | update_logger | ) |
override virtual
Sets the logger to write all update operations.
update_logger | The pointer to the update logger object. Ownership is not taken. If it is nullptr, no logger is used. |
Precondition: The database is not opened.
Implements tkrzw::DBM.
File* tkrzw::TreeDBM::GetInternalFile | ( | ) | const |
Gets the pointer to the internal file object.
Accessing the internal file violates the encapsulation policy. This should be used only for testing and debugging.
int64_t tkrzw::TreeDBM::GetEffectiveDataSize | ( | ) |
Gets the effective data size.
Precondition: The database is opened.
The effective data size means the total size of the keys and the values. This figure might deviate if auto restore happens.
int32_t tkrzw::TreeDBM::GetDatabaseType | ( | ) |
Gets the database type.
Precondition: The database is opened.
Status tkrzw::TreeDBM::SetDatabaseType | ( | uint32_t | db_type | ) |
Sets the database type.
db_type | The database type. |
Precondition: The database is opened as writable.
This data is just for applications and not used by the database implementation.
std::string tkrzw::TreeDBM::GetOpaqueMetadata | ( | ) |
Gets the opaque metadata.
Precondition: The database is opened.
Status tkrzw::TreeDBM::SetOpaqueMetadata | ( | const std::string & | opaque | ) |
Sets the opaque metadata.
opaque | The opaque metadata, of which leading 16 bytes are stored in the file. |
Precondition: The database is opened as writable.
This data is just for applications and not used by the database implementation.
KeyComparator tkrzw::TreeDBM::GetKeyComparator | ( | ) | const |
Gets the comparator of record keys.
Precondition: The database is opened.
Status tkrzw::TreeDBM::ValidateHashBuckets | ( | ) |
Validates all buckets in the hash table.
Status tkrzw::TreeDBM::ValidateRecords | ( | int64_t | record_base, |
int64_t | end_offset | ||
) |
Validates records in a region.
record_base | The beginning offset of records to check. Negative means the beginning of the record section. |
end_offset | The exclusive end offset of records to check. Negative means unlimited. 0 means the size when the database is synched or closed properly. |
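One way to read the sentinel values of record_base and end_offset is sketched below. The `FileLayout` structure, its field names, and the choice to treat "unlimited" as "up to the actual file size" are assumptions made for illustration; the sketch does not reflect how Tkrzw resolves these values internally.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

namespace sketch {

// Hypothetical description of a database file, for illustration only.
struct FileLayout {
  int64_t record_section_start;  // offset where the record section begins
  int64_t synced_size;           // file size at the last proper sync or close
  int64_t actual_size;           // current physical file size
};

// Resolves (record_base, end_offset) into a concrete [begin, end) byte range.
inline std::pair<int64_t, int64_t> ResolveRegion(const FileLayout& file,
                                                 int64_t record_base,
                                                 int64_t end_offset) {
  // Negative record_base: start at the beginning of the record section.
  const int64_t begin =
      record_base < 0 ? file.record_section_start : record_base;
  int64_t end = end_offset;
  if (end_offset < 0) {
    end = file.actual_size;  // assumption: "unlimited" scans to the end of file
  } else if (end_offset == 0) {
    end = file.synced_size;  // size when last synchronized or closed properly
  }
  return {begin, end};
}

}  // namespace sketch
```

Under this reading, passing 0 restricts validation to data that was known to be durable, while a negative value also covers records written after the last synchronization.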
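Status tkrzw::TreeDBM::ParseMetadata | ( | std::string_view | opaque, |
int64_t * | num_records, | ||
int64_t * | eff_data_size, | ||
int64_t * | root_id, | ||
int64_t * | first_id, | ||
int64_t * | last_id, | ||
int64_t * | num_leaf_nodes, | ||
int64_t * | num_inner_nodes, | ||
int32_t * | max_page_size, | ||
int32_t * | max_branches, | ||
int32_t * | tree_level, | ||
int32_t * | key_comp_type, | ||
std::string * | mini_opaque | ||
) |
static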
Parses metadata on an opaque data sequence.
opaque | The opaque data from the underlying database. |
num_records | The pointer to a variable to store the number of records. |
eff_data_size | The pointer to a variable to store the effective data size. |
root_id | The pointer to a variable to store the ID of the root node. |
first_id | The pointer to a variable to store the ID of the first node. |
last_id | The pointer to a variable to store the ID of the last node. |
num_leaf_nodes | The pointer to a variable to store the number of leaf nodes. |
num_inner_nodes | The pointer to a variable to store the number of inner nodes. |
max_page_size | The pointer to a variable to store the max page size. |
max_branches | The pointer to a variable to store the max branches. |
tree_level | The pointer to a variable to store the tree level. |
key_comp_type | The pointer to a variable to store the key comparator type. |
mini_opaque | The pointer to a variable to store the mini opaque data. |
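Status tkrzw::TreeDBM::RestoreDatabase | ( | const std::string & | old_file_path, |
const std::string & | new_file_path, | ||
int64_t | end_offset, | ||
std::string_view | cipher_key = "" |
||
) |
static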
Restores a broken database as a new healthy database.
old_file_path | The path of the broken database. |
new_file_path | The path of the new database to be created. |
end_offset | The exclusive end offset of records to read. Negative means unlimited. 0 means the size when the database is synched or closed properly. INT64MIN and INT64MAX mean to omit restore of the underlying hash database. Then, INT64MIN is unlimited and INT64MAX means synched restoration. |
cipher_key | The encryption key for cipher compressors. |
constexpr int32_t tkrzw::TreeDBM::DEFAULT_OFFSET_WIDTH = 4
static constexpr
The default value of the offset width.
constexpr int32_t tkrzw::TreeDBM::DEFAULT_ALIGN_POW = 10
static constexpr
The default value of the alignment power.
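As a rough worked example of what the two defaults above imply, the sketch below computes the record alignment and an upper bound on the addressable file size. It assumes that the offset width is a byte count and that stored offsets are expressed in units of the alignment, which is how these parameters are commonly described for the underlying hash database; the helper names are hypothetical and the page itself does not state this formula.

```cpp
#include <cstdint>

namespace sketch {

constexpr int32_t kOffsetWidth = 4;  // mirrors DEFAULT_OFFSET_WIDTH
constexpr int32_t kAlignPow = 10;    // mirrors DEFAULT_ALIGN_POW

// Alignment in bytes: 2 to the power of align_pow.
constexpr int64_t AlignmentBytes(int32_t align_pow) {
  return int64_t{1} << align_pow;
}

// Rounds an offset up to the next multiple of the alignment.
constexpr int64_t AlignUp(int64_t offset, int32_t align_pow) {
  const int64_t unit = AlignmentBytes(align_pow);
  return (offset + unit - 1) / unit * unit;
}

// Assumption: an offset occupies offset_width bytes and counts alignment
// units, so the addressable range is 2^(offset_width * 8 + align_pow) bytes.
constexpr int64_t MaxAddressableBytes(int32_t offset_width, int32_t align_pow) {
  return int64_t{1} << (offset_width * 8 + align_pow);
}

static_assert(AlignmentBytes(kAlignPow) == 1024);
static_assert(AlignUp(1500, kAlignPow) == 2048);
static_assert(MaxAddressableBytes(kOffsetWidth, kAlignPow) == (int64_t{1} << 42));

}  // namespace sketch
```

Under these assumptions the defaults give 1 KiB alignment and a 4 TiB addressable range; a larger alignment power trades wasted padding for a larger maximum file size.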
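constexpr int64_t tkrzw::TreeDBM::DEFAULT_NUM_BUCKETS = 131101
static constexpr
The default value of the number of buckets.
constexpr int32_t tkrzw::TreeDBM::DEFAULT_FBP_CAPACITY = 2048
static constexpr
The default value of the capacity of the free block pool.
constexpr int32_t tkrzw::TreeDBM::DEFAULT_MAX_PAGE_SIZE = 8130
static constexpr
The default value of the max page size.
constexpr int32_t tkrzw::TreeDBM::DEFAULT_MAX_BRANCHES = 256
static constexpr
The default value of the max branches.
constexpr int32_t tkrzw::TreeDBM::DEFAULT_MAX_CACHED_PAGES = 10000
static constexpr
The default value of the maximum number of cached pages.
constexpr int32_t tkrzw::TreeDBM::OPAQUE_METADATA_SIZE = 10
static constexpr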
The size of the opaque metadata.