KyotoCabinet - a straightforward implementation of DBM
use KyotoCabinet;
Kyoto Cabinet is a library of routines for managing a database. The database is a simple data file containing records, each is a pair of a key and a value. Every key and value is serial bytes with variable length. Both binary data and character string can be used as a key and a value. Each key must be unique within a database. There is neither concept of data tables nor data types. Records are organized in hash table or B+ tree.
The following access methods are provided to the database: storing a record with a key and a value, deleting a record by a key, retrieving a record by a key. Moreover, traversal access to every key are provided. These access methods are similar to ones of the original DBM (and its followers: NDBM and GDBM) library defined in the UNIX standard. Kyoto Cabinet is an alternative for the DBM because of its higher performance.
Each operation of the hash database has the time complexity of "O(1)". Therefore, in theory, the performance is constant regardless of the scale of the database. In practice, the performance is determined by the speed of the main memory or the storage device. If the size of the database is less than the capacity of the main memory, the performance will seem on-memory speed, which is faster than std::map of STL. Of course, the database size can be greater than the capacity of the main memory and the upper limit is 8 exabytes. Even in that case, each operation needs only one or two seeking of the storage device.
Each operation of the B+ tree database has the time complexity of "O(log N)". Therefore, in theory, the performance is logarithmic to the scale of the database. Although the performance of random access of the B+ tree database is slower than that of the hash database, the B+ tree database supports sequential access in order of the keys, which realizes forward matching search for strings and range search for integers. The performance of sequential access is much faster than that of random access.
This library wraps the polymorphic database of the C++ API. So, you can select the internal data structure by specifying the database name in runtime. This library is NOT thread-safe for Perl ithread.
Install the latest version of Kyoto Cabinet beforehand and get the package of the Perl binding of Kyoto Cabinet.
Enter the directory of the extracted package then perform installation.
perl Makefile.PL make su make install
The package `KyotoCabinet' should be loaded in each source file of application programs.
use KyotoCabinet;
An instance of the class `DB' is used in order to handle a database. You can store, delete, and retrieve records with the instance.
The following code is a typical example to use a database.
use KyotoCabinet; # create the database object my $db = new KyotoCabinet::DB; # open the database if (!$db->open('casket.kch', $db->OWRITER | $db->OCREATE)) { printf STDERR ("open error: %s\n", $db->error); } # store records if (!$db->set('foo', 'hop') || !$db->set('bar', 'step') || !$db->set('baz', 'jump')) { printf STDERR ("set error: %s\n", $db->error); } # retrieve records my $value = $db->get('foo'); if (defined($value)) { printf("%s\n", $value); } else { printf STDERR ("get error: %s\n", $db->error); } # traverse records my $cur = $db->cursor; $cur->jump; while (my ($key, $value) = $cur->get(1)) { printf("%s:%s\n", $key, $value); } $cur->disable; if (!$db->close) { printf STDERR ("close error: %s\n", $db->error); }
The following code is a more complex example, which uses the Visitor pattern.
use KyotoCabinet; # create the database object my $db = new KyotoCabinet::DB; # open the database if (!$db->open('casket.kch', $db->OREADER)) { printf STDERR ("open error: %s\n", $db->error); } # define the visitor { package VisitorImpl; use base qw(KyotoCabinet::Visitor); # constructor sub new { my $self = new KyotoCabinet::Visitor; bless $self; return $self; } # call back function for an existing record sub visit_full { my ($self, $key, $value) = @_; printf("%s:%s\n", $key, $value); return $self->NOP; } # call back function for an empty record space sub visit_empty { my ($self, $key) = @_; printf STDERR ("%s is missing\n", $key); return $self->NOP; } } my $visitor = new VisitorImpl; # retrieve a record with visitor if (!$db->accept("foo", $visitor, 0) || !$db->accept("dummy", $visitor, 0)) { printf STDERR ("accept error: %s\n", $db->error); } # traverse records with visitor if (!$db->iterate($visitor, 0)) { printf STDERR ("iterate error: %s\n", $db->error); } # close the database if (!$db->close) { printf STDERR ("close error: %s\n", $db->error); }
The following code is also a complex example, which is suited to the Perl style.
use KyotoCabinet; # tie a hash variable to the database my $db = tie(my %db, 'KyotoCabinet::DB', 'casket.kch'); # store records $db{'foo'} = 'hop'; # string is fundamental $db{bar} = 'step'; # omitting quotation is ok $db{3} = 'jump'; # number is also ok # retrieve a record value printf("%s\n", $db{'foo'}); # update records in transaction $db->transaction(sub { $db{'foo'} = 2.71828; 1; }); # multiply a record value $db->accept('foo', sub { my ($key, $value) = @_; $value * 2; }); # traverse records by iterator while (my ($key, $value) = each(%db)) { printf("%s:%s\n", $key, $value); } # upcase values by iterator $db->iterate(sub { my ($key, $value) = @_; uc($value); }); # traverse records by cursor $db->cursor_process(sub { my ($cur) = @_; $cur->jump; while ($cur->accept(sub { my ($key, $value) = @_; printf("%s:%s\n", $key, $value); KyotoCabinet::Visitor::NOP; })) { $cur->step; } }); # untie the hash variable undef($db); untie(%db);
Namespace of Kyoto Cabinet.
The version information.
Convert a string to an integer.
@param str specifies the string.
@return the integer. If the string does not contain numeric expression, 0 is returned.
Convert a string with a metric prefix to an integer.
@param str the string, which can be trailed by a binary metric prefix. "K", "M", "G", "T", "P", and "E" are supported. They are case-insensitive.
@return the integer. If the string does not contain numeric expression, 0 is returned. If the integer overflows the domain, INT64_MAX or INT64_MIN is returned according to the sign.
Convert a string to a real number.
@param str specifies the string.
@return the real number. If the string does not contain numeric expression, 0.0 is returned.
Get the hash value of a string by MurMur hashing.
@param str the string.
@return the hash value.
Get the hash value of a string by FNV hashing.
@param str the string.
@return the hash value.
Calculate the levenshtein distance of two strings.
@param a one string.
@param b the other string.
@param utf flag to treat keys as UTF-8 strings. If it is omitted, false is specified.
@return the levenshtein distance.
This class expresses error data.
error code: success
error code: not implemented
error code: invalid operation
error code: no repository
error code: no permission
error code: broken file
error code: record duplication
error code: no record
error code: logical inconsistency
error code: system error
error code: miscellaneous error
Create an error object.
@param code the error code.
@param message the supplement message.
@return the error object.
Set the error information.
@param code the error code.
@param message the supplement message.
@return always undef.
Get the error code.
@return the error code.
Get the readable string of the code.
@return the readable string of the code.
Get the supplement message.
@return the supplement message.
Get the string expression.
@return the string expression.
@note This method overrides the stringification operator.
Compare itself with another error data.
@param right an error object or an error code.
@return true for the both operands are equal, or false if not.
@note This method overrides the comparison operator.
This class expresses the interface to access a record.
magic data: no operation
magic data: remove the record
Create a visitor object.
@return the visitor object.
Visit a record.
@param key the key.
@param value the value.
@return If it is a string, the value is replaced by the content. If it is KyotoCabinet::Visitor::NOP, nothing is modified. If it is KyotoCabinet::Visitor::REMOVE, the record is removed.
Visit a empty record space.
@param key the key.
@return If it is a string, the value is replaced by the content. If it is KyotoCabinet::Visitor::NOP or KyotoCabinet::Visitor::REMOVE, nothing is modified.
This class expresses the interface to process the database file.
Create a file processor object.
@return the file processor object.
Process the database file.
@param path the path of the database file.
@param count the number of records.
@param size the size of the available region.
@return true on success, or false on failure.
This class expresses the interface of cursor to indicate a record.
Create a cursor object.
@return the cursor object.
Disable the cursor.
@return always undef.
@note This method should be called explicitly when the cursor is no longer in use.
Accept a visitor to the current record.
@param visitor a visitor object which implements the Visitor interface. It can be the reference to a function.
@param writable true for writable operation, or false for read-only operation.
@param step true to move the cursor to the next record, or false for no move.
@return true on success, or false on failure.
@note To avoid deadlock, any explicit database operation must not be performed in this method.
Set the value of the current record.
@param value the value.
@param step true to move the cursor to the next record, or false for no move.
@return true on success, or false on failure.
Remove the current record.
@return true on success, or false on failure.
@note If no record corresponds to the key, false is returned. The cursor is moved to the next record implicitly.
Get the key of the current record.
@param step true to move the cursor to the next record, or false for no move.
@return the key of the current record, or undef on failure.
@note If the cursor is invalidated, undef is returned.
Get the value of the current record.
@param step true to move the cursor to the next record, or false for no move.
@return the value of the current record, or undef on failure.
@note If the cursor is invalidated, undef is returned.
Get a pair of the key and the value of the current record.
@param step true to move the cursor to the next record, or false for no move.
@return a pair of the key and the value of the current record, or undef on failure.
@note If the cursor is invalidated, undef is returned.
Get a pair of the key and the value of the current record and remove it atomically.
@return a pair of the key and the value of the current record, or undef on failure.
@note If the cursor is invalidated, undef is returned. The cursor is moved to the next record implicitly.
Jump the cursor to a record for forward scan.
@param key the key of the destination record. If it is undef, the destination is the first record.
@return true on success, or false on failure.
Jump the cursor to a record for backward scan.
@param key the key of the destination record. If it is undef, the destination is the last record.
@return true on success, or false on failure.
@note This method is dedicated to tree databases. Some database types, especially hash databases, will provide a dummy implementation.
Step the cursor to the next record.
@return true on success, or false on failure.
Step the cursor to the previous record.
@return true on success, or false on failure.
@note This method is dedicated to tree databases. Some database types, especially hash databases, will provide a dummy implementation.
Get the database object.
@return the database object.
Get the last happened error.
@return the last happened error.
Get the string expression.
@return the string expression.
@note This method overrides the stringification operator.
This class expresses the interface of database abstraction.
open mode: open as a reader
open mode: open as a writer
open mode: writer creating
open mode: writer truncating
open mode: auto transaction
open mode: auto synchronization
open mode: open without locking
open mode: lock without blocking
open mode: open without auto repair
merge mode: overwrite the existing value
merge mode: keep the existing value
merge mode: modify the existing record only
merge mode: append the new value
Create a database object.
@return the database object.
Get the last happened error.
@return the last happened error.
Open a database file.
@param path the path of a database file. If it is "-", the database will be a prototype hash database. If it is "+", the database will be a prototype tree database. If it is ":", the database will be a stash database. If it is "*", the database will be a cache hash database. If it is "%", the database will be a cache tree database. If its suffix is ".kch", the database will be a file hash database. If its suffix is ".kct", the database will be a file tree database. If its suffix is ".kcd", the database will be a directory hash database. If its suffix is ".kcf", the database will be a directory tree database. If its suffix is ".kcx", the database will be a plain text database. Otherwise, this function fails. Tuning parameters can trail the name, separated by "#". Each parameter is composed of the name and the value, separated by "=". If the "type" parameter is specified, the database type is determined by the value in "-", "+", ":", "*", "%", "kch", "kct", "kcd", kcf", and "kcx". All database types support the logging parameters of "log", "logkinds", and "logpx". The prototype hash database and the prototype tree database do not support any other tuning parameter. The stash database supports "bnum". The cache hash database supports "opts", "bnum", "zcomp", "capcnt", "capsiz", and "zkey". The cache tree database supports all parameters of the cache hash database except for capacity limitation, and supports "psiz", "rcomp", "pccap" in addition. The file hash database supports "apow", "fpow", "opts", "bnum", "msiz", "dfunit", "zcomp", and "zkey". The file tree database supports all parameters of the file hash database and "psiz", "rcomp", "pccap" in addition. The directory hash database supports "opts", "zcomp", and "zkey". The directory tree database supports all parameters of the directory hash database and "psiz", "rcomp", "pccap" in addition. The plain text database does not support any other tuning parameter.
@param mode the connection mode. KyotoCabinet::DB::OWRITER as a writer, KyotoCabinet::DB::OREADER as a reader. The following may be added to the writer mode by bitwise-or: KyotoCabinet::DB::OCREATE, which means it creates a new database if the file does not exist, KyotoCabinet::DB::OTRUNCATE, which means it creates a new database regardless if the file exists, KyotoCabinet::DB::OAUTOTRAN, which means each updating operation is performed in implicit transaction, KyotoCabinet::DB::OAUTOSYNC, which means each updating operation is followed by implicit synchronization with the file system. The following may be added to both of the reader mode and the writer mode by bitwise-or: KyotoCabinet::DB::ONOLOCK, which means it opens the database file without file locking, KyotoCabinet::DB::OTRYLOCK, which means locking is performed without blocking, KyotoCabinet::DB::ONOREPAIR, which means the database file is not repaired implicitly even if file destruction is detected.
@return true on success, or false on failure.
@note The tuning parameter "log" is for the original "tune_logger" and the value specifies the path of the log file, or "-" for the standard output, or "+" for the standard error. "logkinds" specifies kinds of logged messages and the value can be "debug", "info", "warn", or "error". "logpx" specifies the prefix of each log message. "opts" is for "tune_options" and the value can contain "s" for the small option, "l" for the linear option, and "c" for the compress option. "bnum" corresponds to "tune_bucket". "zcomp" is for "tune_compressor" and the value can be "zlib" for the ZLIB raw compressor, "def" for the ZLIB deflate compressor, "gz" for the ZLIB gzip compressor, "lzo" for the LZO compressor, "lzma" for the LZMA compressor, or "arc" for the Arcfour cipher. "zkey" specifies the cipher key of the compressor. "capcnt" is for "cap_count". "capsiz" is for "cap_size". "psiz" is for "tune_page". "rcomp" is for "tune_comparator" and the value can be "lex" for the lexical comparator, "dec" for the decimal comparator, "lexdesc" for the lexical descending comparator, or "decdesc" for the decimal descending comparator. "pccap" is for "tune_page_cache". "apow" is for "tune_alignment". "fpow" is for "tune_fbp". "msiz" is for "tune_map". "dfunit" is for "tune_defrag". Every opened database must be closed by the PolyDB::close method when it is no longer in use. It is not allowed for two or more database objects in the same process to keep their connections to the same database file at the same time.
Close the database file.
@return true on success, or false on failure.
Accept a visitor to a record.
@param key the key.
@param visitor a visitor object which implements the Visitor interface. It can be the reference to a function.
@param writable true for writable operation, or false for read-only operation.
@return true on success, or false on failure.
@note To avoid deadlock, any explicit database operation must not be performed in this method.
Accept a visitor to a record.
@param keys the reference to an array of the keys.
@param visitor a visitor object which implements the Visitor interface. It can be the reference to a function.
@param writable true for writable operation, or false for read-only operation.
@return true on success, or false on failure.
@note To avoid deadlock, any explicit database operation must not be performed in this method.
Iterate to accept a visitor for each record.
@param visitor a visitor object which implements the Visitor interface. It can be the reference to a function.
@param writable true for writable operation, or false for read-only operation.
@return true on success, or false on failure.
@note To avoid deadlock, any explicit database operation must not be performed in this method.
Set the value of a record.
@param key the key.
@param value the value.
@return true on success, or false on failure.
@note If no record corresponds to the key, a new record is created. If the corresponding record exists, the value is overwritten.
Add a record.
@param key the key.
@param value the value.
@return true on success, or false on failure.
@note If no record corresponds to the key, a new record is created. If the corresponding record exists, the record is not modified and false is returned.
Replace the value of a record.
@param key the key.
@param value the value.
@return true on success, or false on failure.
@note If no record corresponds to the key, no new record is created and false is returned. If the corresponding record exists, the value is modified.
Append the value of a record.
@param key the key.
@param value the value.
@return true on success, or false on failure.
@note If no record corresponds to the key, a new record is created. If the corresponding record exists, the given value is appended at the end of the existing value.
Add a number to the numeric integer value of a record.
@param key the key.
@param num the additional number.
@param orig the origin number if no record corresponds to the key. If it is negative infinity and no record corresponds, this method fails. If it is positive infinity, the value is set as the additional number regardless of the current value.
@return the result value, or undef on failure.
@note The value is serialized as an 8-byte binary integer in big-endian order, not a decimal string. If existing value is not 8-byte, this method fails.
Add a number to the numeric double value of a record.
@param key the key.
@param num the additional number.
@param orig the origin number if no record corresponds to the key. If it is negative infinity and no record corresponds, this method fails. If it is positive infinity, the value is set as the additional number regardless of the current value.
@return the result value, or undef on failure.
@note The value is serialized as an 16-byte binary fixed-point number in big-endian order, not a decimal string. If existing value is not 16-byte, this method fails.
Perform compare-and-swap.
@param key the key.
@param oval the old value. undef means that no record corresponds.
@param nval the new value. undef means that the record is removed.
@return true on success, or false on failure.
Remove a record.
@param key the key.
@return true on success, or false on failure.
@note If no record corresponds to the key, false is returned.
Retrieve the value of a record.
@param key the key.
@return the value of the corresponding record, or undef on failure.
Check the existence of a record.
@param key the key.
@return the size of the value, or -1 on failure.
Retrieve the value of a record and remove it atomically.
@param key the key.
@return the value of the corresponding record, or undef on failure.
Store records at once.
@param recs the reference to a hash of the records to store.
@return the number of stored records, or -1 on failure.
Remove records at once.
@param keys the reference to an array of the keys of the records to remove.
@return the number of removed records, or -1 on failure.
Retrieve records at once.
@param keys the reference to an array of the keys of the records to retrieve.
@return the reference to a hash of retrieved records, or undef on failure.
Remove all records.
@return true on success, or false on failure.
Synchronize updated contents with the file and the device.
@param hard true for physical synchronization with the device, or false for logical synchronization with the file system.
@param proc a postprocessor object which implements the FileProcessor interface. It can be the reference to a function.
@return true on success, or false on failure.
@note The operation of the processor is performed atomically and other threads accessing the same record are blocked. To avoid deadlock, any explicit database operation must not be performed in this method.
Occupy database by locking and do something meanwhile.
@param writable true to use writer lock, or false to use reader lock.
@param proc a processor object which implements the FileProcessor interface. It can be the reference to a function.
@return true on success, or false on failure.
@note The operation of the processor is performed atomically and other threads accessing the same record are blocked. To avoid deadlock, any explicit database operation must not be performed in this method.
Create a copy of the database file.
@param dest the path of the destination file.
@return true on success, or false on failure.
Begin transaction.
@param hard true for physical synchronization with the device, or false for logical synchronization with the file system.
@return true on success, or false on failure.
End transaction.
@param commit true to commit the transaction, or false to abort the transaction.
@return true on success, or false on failure.
Perform entire transaction by a functor.
@param proc the functor of operations during transaction. If the function returns true, the transaction is committed. If the function returns false, the transaction is aborted.
@param hard true for physical synchronization with the device, or false for logical synchronization with the file system.
@return true on success, or false on failure.
Dump records into a snapshot file.
@param dest the name of the destination file.
@return true on success, or false on failure.
Load records from a snapshot file.
@param src the name of the source file.
@return true on success, or false on failure.
Get the number of records.
@return the number of records, or -1 on failure.
Get the size of the database file.
@return the size of the database file in bytes, or -1 on failure.
Get the path of the database file.
@return the path of the database file, or undef on failure.
Get the miscellaneous status information.
@return the reference to a hash object of the status information, or undef on failure.
Get keys matching a prefix string.
@param prefix the prefix string.
@param max the maximum number to retrieve. If it is negative, no limit is specified.
@return the reference to an array of matching keys, or undef on failure.
Get keys matching a regular expression string.
@param regex the regular expression string.
@param max the maximum number to retrieve. If it is negative, no limit is specified.
@return the reference to an array of matching keys, or undef on failure.
Get keys similar to a string in terms of the levenshtein distance.
@param origin the origin string.
@param range the maximum distance of keys to adopt.
@param utf flag to treat keys as UTF-8 strings.
@param max the maximum number to retrieve. If it is negative, no limit is specified.
@return the reference to an array of matching keys, or undef on failure.
Merge records from other databases.
@param srcary the reference to an array of the source detabase objects.
@param mode the merge mode. KyotoCabinet::DB::MSET to overwrite the existing value, KyotoCabinet::DB::MADD to keep the existing value, KyotoCabinet::DB::MAPPEND to append the new value.
@return true on success, or false on failure.
Create a cursor object.
@return the return value is the created cursor object. Each cursor should be disabled with the Cursor::disable method when it is no longer in use.
Process a cursor by a functor.
@param proc the functor of operations for the cursor. The cursor is disabled implicitly after the block.
@return always undef.
Get the string expression.
@return the string expression.
@note This method overrides the stringification operator.
Process a database by a functor.
@param proc the functor to process the database, whose object is passd as the parameter.
@param path the same to the one of the open method.
@param mode the same to the one of the open method.
@return undef on success, or an error object on failure.
Tie a hash variable to a database file.
@param path the path of a database file.
@param mode the connection mode.
@return the inner database object, or undef on failure.
@note The database file is opened implicitly with the given parameters.
Untie a hash variable from the database file.
@return always undef.
@note The database file is closed implicitly.
Retrieve the value of a record.
@param key the key.
@return the value of the corresponding record, or undef on failure.
Store a record.
@param key the key.
@param value the value.
@return true on success, or false on failure.
Remove a record.
@param key the key.
@return true on success, or false on failure.
Remove all records.
@return true on success, or false on failure.
Check whether a record corrsponding a key exists.
@return true if the key exists, or false if not.
The inner methods `FIRSTKEY' and `NEXTKEY' are also implemented so that you can use the tying functions `each', `keys', and so on.
Copyright (C) 2009-2010 FAL Labs All rights reserved.
Kyoto Cabinet is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
Kyoto Cabinet is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.