2021-01-29 Theppitak Karoonboonyanan * NEWS: === Version 0.2.13 === 2021-01-24 Theppitak Karoonboonyanan Update library versioning * configure.ac: Bump library versioning to reflect API addition. 2021-01-24 Theppitak Karoonboonyanan Rename trie_byte_strlen() to trie_char_strsize(). The object of the function is TrieChar string. Let's keep that semantics in the name. * datrie/trie-string.h, datrie/trie-string.c (trie_byte_strlen -> trie_char_strsize): - Rename the function. * datrie/tail.c (tail_get_serialized_size, tail_serialize): - Replace trie_byte_strlen() calls with the new name. 2021-01-24 Theppitak Karoonboonyanan Revise test_serialization. * tests/test_serialization.c (-trie_enum_mark_rec): - Drop unused callback function. * tests/test_serialization (main): - Drop unused 'is_failed' variable. - "%Ilu" -> "%lu" printf format. - Rearrange error handling in stack unwinding style. - Add '\n' to printf messages. - Free 'trieSerializedData'. We're not moving to C99 yet, but the declaration amid code is too useful to remove. And it's just in the test code, not in the main source. So we still allow it. 2021-01-24 Theppitak Karoonboonyanan Get rid of include in the new test. * tests/test_serialization.c (main): - Replace unlink() calls with remove() from and drop include. 2021-01-24 Theppitak Karoonboonyanan Adjust file-internal function declarations. * datrie/fileutils.c (parse_int16_be, serialize_int32_be, serialize_int16_be): - Re-declare functions as static. * datrie/fileutils.c (parse_int16_be): - Make the 'buff' arg const pointer. * datrie/fileutils.c: - Remove some blank lines in source. 2021-01-24 Theppitak Karoonboonyanan Fix documentation. * datrie/tail.h (Tail typedef): - Fix comment (Double-array -> Tail). 2021-01-24 Theppitak Karoonboonyanan Cosmetic changes. 
* datrie/fileutils.c: * datrie/trie-string.c: * datrie/alpha-map.c: * datrie/darray.h: * datrie/darray.c: * datrie/tail.h: * datrie/tail.c: * datrie/trie.c: * tools/trie-tool.c: * tests/test_serialization.c: - Use space before left parenthesis. - Use old-style C comments. - Remove trailing spaces. - Re-wrap lines. 2021-01-23 KOLANICH Added serialization of the trie into a memory buffer. * datrie/fileutils.c (file_write_int32, +serialize_int32, file_write_int16, +serialize_int16, file_read_int32, +parse_int32_be, file_read_int16, +parse_int16_be): - Split binary read/write operations into separate functions. * datrie/fileutils.h, datrie/fileutils.c (+serialize_int32_be_incr, +serialize_int16_be_incr): - Add serialization utility functions with pointer advancement. * datrie/trie-string.h, datrie/trie-string.c (+trie_byte_strlen): - Add utility method for calculating TrieChar string size in bytes. * datrie/alpha-map-private.h, datrie/alpha-map.c (+alpha_map_get_serialized_size, +alpha_map_serialize_bin): - Add AlphaMap serialization methods. * datrie/darray.h, datrie/darray.c (+da_get_serialized_size, +da_serialize): - Add DArray serialization methods. * datrie/tail.h, datrie/tail.c (+tail_get_serialized_size, +tail_serialize): - Add Tail serialization methods. * datrie/trie.h, datrie/trie.c (+trie_get_serialized_size, +trie_serialize): - Add Trie serialization methods. * datrie/libdatrie.map, datrie/libdatrie.def: - Add export symbols for Trie serialization. * tests/Makefile.am, +tests/test_serialization.c: - Add serialization test. Pull Request #12. 2021-01-22 Theppitak Karoonboonyanan Get rid of include. * tests/test_file.c (main): - Replace unlink() calls with remove() from and drop include, fixing build issue on Windows. Addressing Windows build issue differently from what proposed by @fanc999 in pull request #15. Thanks @fanc999 for first raising this. 2021-01-15 Theppitak Karoonboonyanan Use TRIE_CHAR_TERM in TAIL I/O methods. 
* datrie/tail.c (tail_fwrite): - Replace strlen() with trie_char_strlen() on suffix, which is TrieChar string. * datrie/tail.c (tail_fread): - Append TRIE_CHAR_TERM, rather than literal zero, as suffix terminator. 2021-01-15 Theppitak Karoonboonyanan Use TRIE_CHAR_TERM in TrieIterator methods. * datrie/trie.c (trie_iterator_get_key): - Replace strlen() with trie_char_strlen() on tail_str, which is TrieChar string. - Check tail_str termination against TRIE_CHAR_TERM, not zero. 2021-01-14 Theppitak Karoonboonyanan Fix wrong TRIE_CHAR_TERM semantics. * datrie/trie.h (trie_state_is_terminal): - Test for terminal state using zero AlphaChar. TRIE_CHAR_TERM is a TrieChar, although it's also accidentally zero. 2021-01-14 Theppitak Karoonboonyanan Reduce loops in alpha_map_recalc_work_area(). * datrie/alpha-map.c (alpha_map_recalc_work_area): - Instead of pre-filling trie-to-alpha map with errors and setting valid cells afterward, just fill error cells left after valid cells are done. Then, finally set TRIE_CHAR_TERM cell. 2021-01-10 Theppitak Karoonboonyanan Rewrite AlphaMap recalc. * datrie/alpha-map.c (alpha_map_recalc_work_area): Rewrite alpha-to-trie & trie-to-alpha maps recalculation - For clearer relation between the two maps - To allow other TRIE_CHAR_TERM values than zero 2021-01-09 Theppitak Karoonboonyanan Use TRIE_CHAR_TERM macro instead of zero. * datrie/alpha-map.c (alpha_map_char_to_trie, alpha_map_char_to_trie_str): * datrie/trie.c (trie_branch_in_branch, trie_branch_in_tail): * datrie/tail.c (tail_walk_str, tail_walk_char): - Use TRIE_CHAR_TERM instead of hard-wired zero when working with raw trie string termination. * datrie/trie-string.c (trie_string_terminate): - Append TRIE_CHAR_TERM instead of simply delegating to dstring_terminate(). 2021-01-09 Theppitak Karoonboonyanan Share static trie string functions internally. 
* datrie/trie-string.h, datrie/trie-string.c, datrie/tail.c (tc_strlen -> trie_char_strlen, tc_strdup -> trie_char_strdup): - Move private functions in tail.c to trie-string.[ch], with new full names under new "static trie string" section. - Check/assign string terminator using TRIE_CHAR_TERM instead of zero. * datrie/tail.c (tail_set_suffix): - Call the new trie_char_strdup() instead of tc_strdup(). * datrie/alpha-map.c (alpha_map_trie_to_char_str): - Call trie_char_strlen() instead of strlen(). 2021-01-06 Theppitak Karoonboonyanan Get rid of char semantics from TrieChar * datrie/trie.c (trie_branch_in_branch, trie_branch_in_tail): - Check null TrieChar with int zero instead of char. 2021-01-05 Theppitak Karoonboonyanan Use proper #include form in installed header. * datrie/alpha-map.h: - Use angle quotes form instead of double quotes in #include. 2021-01-05 Theppitak Karoonboonyanan Fix some documentations. * datrie/trie.c (trie_is_dirty): - Adjust wording to make clear the file is out of sync, not the other way around. * datrie/trie.c (trie_store, trie_store_if_absent): - Fix typo in the description of 'key' parameter. * datrie/trie.c (trie_store_if_absent): - Minor wording adjustment (inserted, not appended). 2020-12-30 Theppitak Karoonboonyanan Fix isspace() arg problem on NetBSD. * tools/trietool.c (command_add_list, string_trim): - Cast char to unsigned char before passing to isspace(). Thanks Sean for the report via a personal mail. 2019-12-20 Theppitak Karoonboonyanan Use GitHub issue tracker as bug report address. * configure.ac: - Replace bug report e-mail address with GitHub issue tracker URL. 2019-08-05 Theppitak Karoonboonyanan Stop installing README.migration It's supposed to be internal document now. * Makefile.am: - Remove README.migration from doc_DATA. 
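Note on the 2020-12-30 isspace() entry above: a minimal sketch of the cast it describes, assuming a platform where plain char may be signed. The helper trim_in_place() is hypothetical and is not the code in tools/trietool.c; it only illustrates why each char is cast to unsigned char before it reaches isspace().

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not the trietool.c implementation): trims leading
 * and trailing whitespace in place and returns the start of the trimmed
 * text inside the same buffer. */
char *
trim_in_place (char *s)
{
    char *end;

    /* isspace() takes an int that must be representable as unsigned char
     * (or be EOF).  A negative plain char, e.g. a UTF-8 byte on a
     * signed-char platform, is undefined behavior, hence the cast. */
    while (*s && isspace ((unsigned char) *s))
        ++s;

    end = s + strlen (s);
    while (end > s && isspace ((unsigned char) end[-1]))
        --end;
    *end = '\0';

    return s;
}
```

The same cast applies to the other <ctype.h> classification functions whenever the input may contain bytes above 0x7f.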
2019-01-21 Theppitak Karoonboonyanan Fix cross-compiling issue caused by AC_FUNC_MALLOC * configure.ac: - Replace AC_FUNC_MALLOC with AC_CHECK_FUNCS([malloc]), as we don't rely on GNU's malloc(0) behavior. Thanks Vanessa McHale for the report. Closes: #11 2018-11-23 Theppitak Karoonboonyanan Fix wrong key listing in byte trie * tests/Makefile.am, +tests/test_byte_list.c: - Add test case * datrie/alpha-map.c (alpha_map_recalc_work_area): - Index trie_to_alpha_map[] using TrieIndex, not TrieChar type, to prevent overflow upon incrementing over 0xff. - Drop tc variable and just reuse trie_last. Thanks @legale for the report. Closes: #9 https://github.com/tlwg/libdatrie/issues/9 2018-06-19 Theppitak Karoonboonyanan * configure.ac: - Bump library revision to reflect code changes. * NEWS: === Version 0.2.12 === 2018-06-19 Theppitak Karoonboonyanan Use HTTPS in URL * README: - Update document URL to HTTPS 2018-06-14 Theppitak Karoonboonyanan Cast (wchar_t *) to fix warnings in tests "%ls" printf() format requires (wchar_t *) [aka int *] arg. So, let's cast (AlphaChar *) [aka unsigned int *] to satisfy it. * tests/test_walk.c: * tests/test_iterator.c: * tests/test_store-retrieve.c: * tests/test_file.c: * tests/test_nonalpha.c: * tests/test_null_trie.c: - Add include, for wchar_t type - Cast "%ls" args from (AlphaChar *) to (wchar_t *) 2018-06-06 Theppitak Karoonboonyanan Avoid non-ANSI C snprintf() * tools/trietool.c (+full_path, prepare_trie, close_trie): - Instead of preparing full path name with snprintf(), which is non-ANSI, and still risks path name trimming, do it with size-calculated malloc(). - free() it as needed. 2018-06-04 Theppitak Karoonboonyanan Fix sscanf() string format * tools/trietool.c (prepare_trie): - Define b, e as unsigned int, as required by "%x" format. Fixing warning from '-Wformat=' gcc option. 
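Note on the sscanf() entry above: a minimal sketch of the "%x" rule it refers to. The helper parse_hex_range() and the "[0x0e01,0x0e5b]" line format are assumptions made for illustration, not the actual prepare_trie() code; the point is only that "%x" stores through an (unsigned int *), so the receiving variables must be unsigned int.

```c
#include <stdio.h>

/* Hypothetical helper: parses a range line such as "[0x0e01,0x0e5b]".
 * Returns 1 and fills *begin and *end on success, 0 otherwise. */
int
parse_hex_range (const char *line, unsigned int *begin, unsigned int *end)
{
    unsigned int b, e;   /* "%x" requires unsigned int *, not int * */

    /* Whitespace in the format matches any amount of input whitespace,
     * including none.  sscanf() returns the number of converted items. */
    if (sscanf (line, " [ %x , %x ] ", &b, &e) != 2)
        return 0;
    if (b > e)
        return 0;

    *begin = b;
    *end = e;
    return 1;
}
```

Declaring b and e as plain int would still compile, but gcc's '-Wformat=' check flags the mismatch between the format and the argument types.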
2018-06-04 Theppitak Karoonboonyanan Fix compiler warnings in tests * tests/test_byte_alpha.c (main): * tests/test_file.c (main): * tests/test_iterator.c (main): * tests/test_nonalpha.c (main): * tests/test_null_trie.c (main): * tests/test_store-retrieve.c (main): * tests/test_term_state.c (main): * tests/test_walk.c (main): - Declare main function with 'main (void)'. Fixing warning from '-Wstrict-prototypes' gcc option. * tests/test_walk.c (main): - Split long string, which exceeded the length C90 compilers are required to support. Fixing warning from '-Woverlength-strings' gcc option. 2018-06-04 Theppitak Karoonboonyanan Duplicate TrieChar string in more portable manner * datrie/tail.c (tail_set_suffix, +tc_strdup, +tc_strlen): - Replace cast strdup() with crafted implementation, allowing TrieChar to be of larger size than char. Fixing warning from '-Wint-to-pointer-cast' gcc option. 2018-06-04 Theppitak Karoonboonyanan Split long string * tools/trietool.c (usage): - Split help message, which exceeded the length C90 compilers are required to support. Caught by '-Woverlength-strings' gcc option. 2018-06-04 Theppitak Karoonboonyanan Remove unused byte, word, dword typedefs These are likely to conflict with other uses. * datrie/typedefs.h (-byte, -word, -dword): - Remove the unused typedefs Thanks Peter Moulder for the patch. 2018-06-04 Theppitak Karoonboonyanan Rename TRUE/FALSE in Bool enum to avoid clash Some other header file may have already defined TRUE/FALSE. * datrie/typedefs.h (Bool): - Rename FALSE, TRUE to DA_FALSE, DA_TRUE respectively, and define FALSE, TRUE macros only if they haven't been defined. Thanks Peter Moulder for the patch. 2018-06-04 Theppitak Karoonboonyanan Declare argument-less functions with "(void)" "f()" declaration form is K&R style, specifying that no information about the number or types of parameters is supplied. This caused warnings on '-Wstrict-prototypes' gcc option. 
* datrie/alpha-map.h, datrie/alpha-map.c (alpha_map_new): * datrie/darray.h, datrie/darray.c (da_new, symbols_new): * datrie/tail.h, datrie/tail.c (tail_new): * tests/utils.h, tests/utils.c (en_alpha_map_new, en_trie_new): - Use "(void)" form in declaration - Also use "(void)" form in definition, for consistency Thanks Peter Moulder for the initial patch. 2018-05-24 Theppitak Karoonboonyanan Remove duplicate include * tools/trietool.c: - Remove duplicate include 2018-04-23 Theppitak Karoonboonyanan Add missing include in test * tests/test_byte_alpha.c: - Add missing include for utils.h 2018-04-23 Theppitak Karoonboonyanan * configure.ac: - Bump library revision to reflect code changes. * NEWS: === Version 0.2.11 === 2018-04-21 Theppitak Karoonboonyanan Fix reported segfault on full-range alpha map * tests/Makefile.am, +tests/test_byte_alpha.c: - Add test case * datrie/alpha-map.c (alpha_map_recalc_work_area()): - Redeclare trie_last as TrieIndex, to prevent overflow. Thanks Xiao Wang for the report, and @nevermatch for the analysis. Closes: #6 https://github.com/tlwg/libdatrie/issues/6 2018-03-29 Theppitak Karoonboonyanan Fix trie_state_get_data() at a prefix key When getting data from a state which terminates a key that is a prefix of another key, a terminator should be tried, so it jumps from DA to TAIL, where we can get the data. * tests/Makefile.am, +tests/test_term_state.c: - Add a test case with {'ab', 'abc'} dictionary, which fails previous code when retrieving data for key 'ab'. * datrie/trie.c (trie_state_get_data()): - Instead of simply checking for leaf state, which only caught a state in TAIL, also try walking with a terminator when still in DA. - Replace 'leaf state' with 'terminal state' in documentation, for more clarity. - Also return error on null state pointer. Thanks Filip Pytloun from the pytries project for the initial patch. 
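Note on the 2018-04-21 overflow entry above: a minimal sketch of why a counter that must pass 0xff cannot be an unsigned char. The function names are hypothetical and the code is not taken from datrie/alpha-map.c; it assumes only what the log states, namely that the trie character type is a byte with maximum 255 and that a wider index type avoids the wraparound.

```c
/* Hypothetical illustration: one step of a byte-sized counter.
 * Incrementing 255 wraps to 0, so a loop of the form
 * "for (c = 0; c <= 255; ++c)" never ends when c is unsigned char. */
unsigned char
next_narrow (unsigned char c)
{
    return (unsigned char) (c + 1);
}

/* Counts the values in [0, last] with a counter wide enough to hold
 * last + 1, so the loop also terminates when last is 255. */
int
count_upto_wide (unsigned char last)
{
    int c;
    int n = 0;

    for (c = 0; c <= last; ++c)
        ++n;

    return n;
}
```

With a full-range alphabet the last valid code is 255, which is the case the wider counter has to handle.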
2017-09-06 Theppitak Karoonboonyanan Revise description about search time complexity * README: - Clarify that search time is O(m), where m is the key length, instead of O(1), while still claiming that it's independent of database size. This closes #4. https://github.com/tlwg/libdatrie/issues/4 2016-12-14 Theppitak Karoonboonyanan Include git-version-gen in tarball * Makefile.am: - Add build-aux/git-version-gen to EXTRA_DIST. 2016-09-21 Theppitak Karoonboonyanan Fix iconv() return value checking. * tools/trietool.c (conv_to_alpha): - Check iconv() return value against (size_t) -1, rather than for its negativity, as size_t can be unsigned. Thanks Daniel Macks for the report on Issue #3. https://github.com/tlwg/libdatrie/issues/3 2016-09-21 Theppitak Karoonboonyanan Use versioning based on Git snapshot. * Makefile.am: - Add dist-hook to generate VERSION file on tarball generation. * +build-aux/git-version-gen: - Add script to generate version based on 'git describe' if in git tree, or using VERSION file if in release tarball. * configure.ac: - Call git-version-gen to get package version. 2015-10-20 Theppitak Karoonboonyanan * configure.ac: - Bump library revision to reflect code changes. * NEWS, configure.ac: === Version 0.2.10 === 2015-10-13 Theppitak Karoonboonyanan Optimize AlphaMap mapping. alpha_map_char_to_trie() is called everywhere before trie state transition. It's an important bottleneck. We won't change the persistent AlphaMap structure, but will add pre-calculated lookup tables for use at run-time. * datrie/alpha-map.c (struct _AlphaMap): - Add members for alpha-to-trie and trie-to-alpha lookup tables. * datrie/alpha-map.c (alpha_map_new, alpha_map_free): - Initialize & free the tables properly. * datrie/alpha-map.c (alpha_map_add_range -> alpha_map_add_range_only + alpha_map_recalc_work_area): - Split alpha_map_add_range() API into two parts: adding the range as usual and recalculate the lookup tables. 
* datrie/alpha-map.c (alpha_map_clone, alpha_map_fread_bin): - Call alpha_map_add_range_only() repeatedly before calling alpha_map_recalc_work_area() once. * datrie/alpha-map.c (alpha_map_char_to_trie, alpha_map_trie_to_char): - Look up the pre-calculated tables instead of calculating on every call. This appears to cut the total time spent in alpha_char_to_trie() calls by 14.6% and lowers its bottleneck rank by one on a libthai test case. It reduces the total run time of the libthai test case by 0.2%. Note that the time saved would be even greater with multiple non-contiguous alphabet ranges, at the expense of more memory used. 2015-08-18 Theppitak Karoonboonyanan Fix doxygen version checking. * configure.ac: - Correctly compare doxygen versions. Simple expr comparison didn't work with version 1.8.10. Thanks Petr Gajdos for the patch. 2015-06-24 Theppitak Karoonboonyanan * datrie/tail.c (tail_set_suffix): - Catch strdup() failure. 2015-06-24 Theppitak Karoonboonyanan * configure.ac: Post-release version suffix added. 2015-05-03 Theppitak Karoonboonyanan * NEWS, configure.ac: === Version 0.2.9 === 2015-05-03 Theppitak Karoonboonyanan Use relative paths for symlinks. * tools/Makefile.am, man/Makefile.am: - Use relative paths for symlinks to avoid confusion in installation with DESTDIR. 2015-05-03 Theppitak Karoonboonyanan Also install symlink for old trietool. * man/Makefile.am: - Add hooks to install/uninstall symlink for old man page. 2015-05-02 Theppitak Karoonboonyanan * configure.ac: - Bump library revision to reflect code changes. 2015-04-29 Theppitak Karoonboonyanan Bump doxygen required version. * configure.ac: - Bump doxygen required version to 1.8.8, according to recent Doxyfile update. 2015-04-21 Theppitak Karoonboonyanan Fix infinite loop on empty trie iteration. * tests/Makefile.am, +tests/test_null_trie.c: - Add test case for empty trie iteration. * datrie/darray.c (da_first_separate): - Fix error condition after loop ending. 
Thanks Sergei Lebedev for the report via personal mail. Original report: https://github.com/kmike/datrie/issues/17 2015-04-12 Theppitak Karoonboonyanan Document about alphabet size. * datrie/trie.h: - Add to doc comment a description on the alphabet size limit and the mapped raw codes. Thanks edgehogapp for the suggestion. https://groups.google.com/forum/#!topic/thai-linux-foss-devel/U-O__IfviQ0 2015-04-11 Theppitak Karoonboonyanan Clarify Symbols' struct & methods. * datrie/darray.c (struct _Symbols): - Use TRIE_CHAR_MAX + 1 instead of hard-coded value for symbols[] array size. Thanks edgehogapp for the suggestion. https://groups.google.com/forum/#!topic/thai-linux-foss-devel/U-O__IfviQ0 * datrie/darray.h, datrie/darray.c (symbols_new, symbols_add): - Hide symbols_new() and symbols_add() for internal use. 2015-03-06 Theppitak Karoonboonyanan Update Doxyfile. * doc/Doxyfile.in: - Updated for doxygen 1.8.8 with 'doxygen -u'. 2015-03-02 Theppitak Karoonboonyanan Catch realloc failure. * datrie/tail.c (tail_alloc_block): - Check realloc() result on t->tails reallocation and return failure code if failed. * datrie/tail.c (tail_add_suffix): - Check return value from tail_alloc_block() and return failure code if failed. - Update documentation. 2015-03-02 Theppitak Karoonboonyanan Catch realloc failure. * datrie/darray.c (da_extend_pool): - Check realloc() result on d->cells reallocation and handle failure properly. 2015-02-27 Theppitak Karoonboonyanan Catch malloc failure. * datrie/tail.c (tail_fread): - Check malloc() result on suffix string and exit properly. 2015-02-26 Theppitak Karoonboonyanan More micro-optimization with LIKELY/UNLIKELY. * datrie/alpha-map.c (alpha_map_char_to_trie, alpha_map_trie_to_char): - Use UNLIKELY() when checking for NUL character. 2015-02-10 Theppitak Karoonboonyanan Fix 'make distcheck' failure. * doc/Makefile.am: - Remove doxygen db file on clean. 2015-02-10 Theppitak Karoonboonyanan More update of my e-mail address. 
* man/trietool.1: - Update my e-mail address. 2015-02-10 Theppitak Karoonboonyanan Rename trietool-0.2 utility to trietool. * configure.ac: - Check for ln -s * tools/Makefile.am: - Rename bin target from trietool-0.2 to trietool. - Add hooks to install/uninstall symlink with old name. * man/Makefile.am, man/trietool-0.2.1 -> man/trietool.1: - Rename & update manpage accordingly. 2015-02-06 Theppitak Karoonboonyanan Micro-optimize with likely/unlikely hints. * datrie/trie-private.h: - Add LIKELY() and UNLIKELY() macros based on compiler extension. * datrie/alpha-map.c (alpha_map_new, alpha_map_clone, alpha_map_fread_bin, alpha_map_add_range, alpha_map_char_to_trie_str, alpha_map_trie_to_char_str): * datrie/darray.c (symbols_new, da_new, da_fread, da_get_base, da_get_check, da_set_base, da_set_check, da_insert_branch, da_find_free_base, da_extend_pool): * datrie/dstring.c (dstring_new, dstring_ensure_space): * datrie/tail.c (tail_new, tail_fread, tail_get_suffix, tail_set_suffix, tail_get_data, tail_set_data, tail_walk_str, tail_walk_char): * datrie/trie.c (trie_new, trie_fread, trie_enumerate, trie_state_new, trie_state_walk, trie_state_is_walkable, trie_iterator_new): - Use LIKELY() and UNLIKELY() where it is known to be so, mostly for one-time initialization and failure handling. * datrie/alpha-map.c, datrie/tail.c, datrie/tail.c: - These are the files that need to include trie-private.h because of this. Callgrind says it does help speed up a little bit. 2015-02-05 Theppitak Karoonboonyanan Disable timestamp in Doxygen-generated doc. * doc/Doxyfile.in: - Set HTML_TIMESTAMP to NO to make the document reproducible. (reported by Debian Reproducible) 2015-02-01 Theppitak Karoonboonyanan * configure.ac: [Belated] post-release version suffix added. 2015-02-01 Theppitak Karoonboonyanan Update my e-mail address everywhere. * AUTHORS, configure.ac, datrie/*.[ch], tests/*.[ch], tools/trietool.c: - Replace all mentions of my e-mail address with the gmail one. 
2015-02-01 Theppitak Karoonboonyanan Fix binary file opening on Windows. * datrie/trie.c (trie_new_from_file, trie_save): - Add "b" to fopen() modes, so the binary file is opened properly on Windows. Thanks phongphan.p for the report and initial patch. 2014-01-10 Theppitak Karoonboonyanan * configure.ac: - Bump library revision to reflect code changes. * NEWS, configure.ac: === Version 0.2.8 === 2014-01-09 Theppitak Karoonboonyanan Improve documentation. * datrie/triedefs.h: - Refine descriptions of data types. * datrie/trie.c (trie_iterator_new): - Fix typo on trie_root() mentioning. * datrie/trie.c (trie_store, trie_store_if_absent): - Adjust wording. * datrie/alpha-map.h, datrie/trie.h: - Add detailed description of AlphaMap and Trie types. 2014-01-08 Theppitak Karoonboonyanan Clarify message in test_nonalpha. * tests/test_nonalpha.c (main): - Clarify message on false key duplication. 2014-01-08 Theppitak Karoonboonyanan Add test on keys with non-alphabet input chars. * tests/Makefile.am, +tests/test_nonalpha.c: - Add test to ensure that operations on keys with non-alphabet input chars fail. 2014-01-08 Theppitak Karoonboonyanan Fail trie operations on non-alphabet inputs. alpha_map_char_to_trie() tried to return TRIE_CHAR_MAX to indicate out-of-range error. But this value is indeed valid in trie operations. Doing so could allow false key duplication when different non-alphabet chars and TRIE_CHAR_MAX itself were all mapped to TRIE_CHAR_MAX. So, let's fail all trie operations on non-alphabet input chars. * datrie/alpha-map-private.h, datrie/alpha-map.c (alpha_map_char_to_trie): - Make alpha_map_char_to_trie return TrieIndex type, using TRIE_INDEX_MAX to indicate out-of-range error. This allows TRIE_CHAR_MAX to be returned as a valid output. * datrie/alpha-map.c (alpha_map_char_to_trie_str): - Fail if alpha_map_char_to_trie() returns error code. 
* datrie/trie.c (trie_retrieve, trie_store_conditionally, trie_delete, trie_state_walk, trie_state_is_walkable): - Check return value from alpha_map_char_to_trie() and return failure status on error. - Also cast TrieIndex return values to TrieChar on function calls. Thanks Naoki Youshinaga for the suggestion. 2014-01-07 Theppitak Karoonboonyanan Check for NULL result from AlphaMap string funcs. * datrie/trie.c (trie_store_conditionally): - Return failure on NULL alpha_map_char_to_trie_str(). 2014-01-07 Theppitak Karoonboonyanan Return NULL on allocation errors in AlphaMap funcs. * datrie/alpha-map.c (alpha_map_char_to_trie_str, alpha_map_trie_to_char_str): - Return NULL on malloc() error. 2014-01-03 Theppitak Karoonboonyanan Fix edge case with TRIE_CHAR_MAX as TrieChar. The trie input char with value TRIE_CHAR_MAX (255), was always skipped by double-array algorithms. Let's include it. * datrie/darray.c (da_has_children, da_output_symbols, da_relocate_base, da_first_separate, da_next_separate): - Include the last char in trie char iterations. * datrie/darray.c (da_first_separate, da_next_separate): - Declare characters as TrieIndex type instead of TrieChar, to prevent infinite loop due to unsigned char overflow. Thanks Naoki Youshinaga for the report, test case, and analysis. 2013-10-25 Theppitak Karoonboonyanan Fix compiler warnings in tests. * tests/test_walk.c (main): - Remove unused var i; - Remove extra printf() args. * tests/test_iterator.c: - Add missing #include for free(). * tests/test_walk.c (walk_dict), tests/utils.c (dict_src): - Cast string literals to (AlphaChar *) to fix signedness differences. 2013-10-25 Theppitak Karoonboonyanan * configure.ac: Post-release version suffix added. 2013-10-22 Theppitak Karoonboonyanan * NEWS, configure.ac: === Version 0.2.7.1 === 2013-10-21 Theppitak Karoonboonyanan * configure.ac: Bump library versioning to reflect API addition. 
(Change missing in previous release) 2013-10-21 Theppitak Karoonboonyanan * configure.ac: Post-release version suffix added. 2013-10-21 Theppitak Karoonboonyanan * NEWS, configure.ac: === Version 0.2.7 === 2013-10-21 Theppitak Karoonboonyanan Add missing distributed file. * tests/Makefile.am: - Add utils.h to distribution. 2013-10-20 Theppitak Karoonboonyanan Reorder tests from primitive to applied. * tests/Makefile.am: - Test walk & iterator before store-retrieve & file. 2013-10-20 Theppitak Karoonboonyanan Write a test suite for trie walk. * tests/test_walk.c: - Write test code. 2013-10-18 Theppitak Karoonboonyanan Write a test suite for trie store/retrieval. * tests/utils.h, tests/utils.c (+dict_src_n_entries): - Add function to get total entries in dict_src[]. * tests/test_store-retrieve.c (main): - Write test code. 2013-10-18 Theppitak Karoonboonyanan Fix messages in test_iterator. * tests/test_iterator.c (main): - s/file/trie/. No file is written or read in this test. 2013-10-18 Theppitak Karoonboonyanan Skip further iteration tests if key is NULL. * tests/test_iterator.c (main): - Insert 'continue' if trie_iterator_get_key() returns NULL. 2013-10-17 Theppitak Karoonboonyanan Document availability of alpha_char_strcmp() * datrie/alpha-map.c (alpha_char_strcmp): - Document that it's available since 0.2.7. 2013-10-17 Theppitak Karoonboonyanan Write a test for trie iterator. * tests/test_iterator.c: - Write test suite for trie iterator. 2013-10-17 Theppitak Karoonboonyanan Add skeleton test suites & a test for file I/O. * configure.ac, Makefile.am, +tests/, +tests/Makefile.am: - Add tests/ dir to the build system. * +tests/utils.h, +tests/utils.c: * +tests/test_file.c: * +tests/test_iterator.c: * +tests/test_store-retrieve.c: * +tests/test_walk.c: - Add skeleton for test suites. * tests/utils.h, tests/utils.c, tests/test_file.c: - Write test suite for file I/O. 2013-10-17 Theppitak Karoonboonyanan Add alpha_char_strcmp() API. 
* datrie/alpha-map.h, datrie/alpha-map.c (+alpha_char_strcmp): - Add alpha_char_strcmp() declaration & body. * datrie/libdatrie.def, datrie/libdatrie.map: - Add alpha_char_strcmp symbols. 2013-10-16 Theppitak Karoonboonyanan Add missing info in alpha_map_add_range() doc. * datrie/alpha-map.c (alpha_map_add_range): - Add documentation on return value. 2013-09-23 Theppitak Karoonboonyanan Fix build for Visual Studio on Windows. * datrie/dstring.c (dstring_append, dstring_append_string, dstring_append_char, dstring_terminate): - Cast (void *) pointers to (char *) before calculating offsets, for portability. Thanks Gabi Davar for the report and fix (via Mikhail Korobov ). 2013-09-23 Theppitak Karoonboonyanan Check for doxygen required version. * configure.ac: - When doxygen-doc is enabled, also check doxygen version. 2013-09-23 Theppitak Karoonboonyanan Fix doxygen warning. * doc/Doxyfile.in: - doxygen no longer ships with the FreeSans font. Just drop it. 2013-09-23 Theppitak Karoonboonyanan Update Doxyfile. * doc/Doxyfile.in: - Updated with 'doxygen -u'. 2013-09-23 Theppitak Karoonboonyanan Fix compiler warnings. * datrie/trie-string.c (trie_string_append_string): * datrie/trie.c (trie_iterator_get_key): - Cast strlen() args from (const TrieChar *) to (const char *), to fix signedness mismatch warnings. 2013-09-23 Theppitak Karoonboonyanan Fix automake warnings. * datrie/Makefile.am, tools/Makefile.am: - Replace deprecated INCLUDES with AM_CPPFLAGS. 2013-09-23 Theppitak Karoonboonyanan * configure.ac: Post-release version suffix added. 2013-01-23 Theppitak Karoonboonyanan * NEWS, configure.ac: === Version 0.2.6 === 2013-01-22 Theppitak Karoonboonyanan Use xz compression for release tarball. * configure.ac: - Specify "dist-xz no-dist-gzip" options to AM_INIT_AUTOMAKE. 2012-08-06 Theppitak Karoonboonyanan Improve AlphaMap range merging. * datrie/alpha-map.c (alpha_map_add_range): - Also try to merge adjacent ranges. 
Otherwise, adding one by one character will result in linear search on alphabet set. Thanks Mikhail Korobov for the report. 2012-08-05 Theppitak Karoonboonyanan Migrate trie_enumerate() to the new TrieIterator. This improves performance by 10%, and recursion stacks are also eliminated. * datrie/trie.c (trie_enumerate): - Replace da_enumerate() call with TrieIterator loop. * datrie/trie.c (-trie_da_enum_func, -_TrieEnumData): * datrie/darray.h, datrie/darray.c (-da_enumerate, -da_get_transition_key, -DAEnumFunc, -da_enumerate_recursive): - Drop now-unused codes. 2012-08-05 Theppitak Karoonboonyanan Optimize key calculation for TrieIterator. * datrie/Makefile.am +datrie/dstring.h +datrie/dstring-private.h +datrie/dstring.c +datrie/trie-string.h +datrie/trie-string.c: - Add dynamic string classes DString and TrieString. * datrie/trie.c: - (TrieIterator): Add "key" dynamic trie string member for incrementally gathering the key while iterating. - Initialize "key" member to NULL on construction. Only allocate and free it on branching root states. * datrie/darray.h, datrie/darray.c (da_first_separate, da_next_separate): - Instead of allocating memory for return string, accept a dynamic string object and update it while traversing. * datrie/trie.c (trie_iterator_next): - For branching state, create the "key" dynamic string object on first iteration and pass it to da_first_separate() and da_next_separate() along the iterations. * datrie/trie.c (trie_iterator_get_key): - Replace the total key reconstruction by da_get_transition_key() with simple string catenation of the dynamic "key" and the suffix string. * datrie/darray.h, datrie/darray.c (da_get_transition_key): - Adjust to use dynamic string instead of total allocation. * datrie/darray.c (da_enumerate_recursive): - Adjust the da_get_transition_key() call accordingly. Thanks Mikhail Korobov for the suggested approach and for profiling check. 2012-07-31 Theppitak Karoonboonyanan Add TrieIterator and its operations. 
* datrie/trie.h, datrie/trie.c (+trie_iterator_new, +trie_iterator_free, +trie_iterator_next, +trie_iterator_get_key, +trie_iterator_get_data): - Add TrieIterator class and its methods. * datrie/libdatrie.def, datrie/libdatrie.map: - Add the new export symbols. * datrie/darray.h, datrie/darray.c (da_get_state_key -> da_get_transition_key): - Adjust da_get_state_key() from getting transition key from root to getting from arbitrary ancestor. * datrie/darray.h, datrie/darray.c (+da_first_separate, +da_next_separate): - Add internal functions for iterating from one separate node to another in double-array structure. Thanks Mikhail Korobov for the use case suggestion. 2012-07-30 Theppitak Karoonboonyanan * doc/Doxyfile.in: Upgrade to doxygen 1.8.1.2 format. 2012-07-29 Theppitak Karoonboonyanan * datrie/trie.c: Reformat source. 2012-07-29 Theppitak Karoonboonyanan * datrie/trie.h, datrie/trie.c (trie_state_walkable_chars): Reformat source. 2012-07-27 Theppitak Karoonboonyanan Add new API trie_state_walkable_chars() to allow breadth-first traversal. * datrie/darray.h, datrie/darray.c: - Move da_output_symbols() to darray.h, to be called by the new function. - Move the Symbols class to darray.h, as required by da_output_symbols(). * datrie/trie.h, datrie/trie.c (+trie_state_walkable_chars): - Add the new public function. * datrie/libdatrie.map, datrie/libdatrie.def: - Add the new symbol to export maps. * configure.ac: - Update library versioning to reflect API addition. 2012-07-26 Theppitak Karoonboonyanan * darray/darray.c (da_has_children): Accept (const DArray *) arg. 2012-07-26 Theppitak Karoonboonyanan * datrie/darray.c (da_has_children, da_output_symbols, da_relocate_base): Calculate max_c candidate using num_cells - base instead of TRIE_INDEX_MAX - base, to prevent more unnecessary loops. 2012-07-26 Theppitak Karoonboonyanan * datrie/trie.c (trie_state_get_data): Check if the state is leaf, not just suffix, before getting data. Thanks Mikhail Korobov for the report. 

2012-07-25  Theppitak Karoonboonyanan

	* datrie/tail.h, datrie/tail.c:
	* datrie/alpha-map.c:
	* datrie/trie.h, datrie/trie.c:
	* datrie/darray.c:
	Remove trailing spaces.

2012-07-13  Theppitak Karoonboonyanan

	* datrie/tail.c (tail_get_suffix): Fix function documentation.
	Thanks Mikhail Korobov for the report.

2012-07-13  Theppitak Karoonboonyanan

	* datrie/darray.c, datrie/tail.c:
	  - Don't include when compiled with MSVC, as the header is missing
	    there, and SIZE_MAX is provided in some other header.
	Thanks Mikhail Korobov for the report.

2012-07-13  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.

2011-11-04  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.2.5 ===

2011-08-04  Theppitak Karoonboonyanan

	* datrie/alpha-map.h: Add missing 'extern "C"' for export functions
	to fix problem with C++ compiler.
	Thanks Aurimas Černius for the patch.

2011-03-13  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.
	* datrie/trie.h: Add missing documentation for "user_data" parameter
	in TrieEnumFunc.

2010-06-30  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.2.4 ===

2010-06-28  Theppitak Karoonboonyanan

	* datrie/alpha-map-private.h, datrie/alpha-map.c:
	* datrie/darray.h, datrie/darray.c:
	* datrie/tail.h, datrie/tail.c:
	* datrie/trie.c (trie_fread, trie_fwrite):
	Rename *_read() and *_write() functions to *_fread() and *_fwrite(),
	for consistency with the trie_f{read,write}() function.

2010-06-28  Theppitak Karoonboonyanan

	Add trie_fread() and trie_fwrite() interfaces for reading/writing
	trie data from an open file.
	Thanks NIIBE Yutaka for the suggestion.

	* datrie/trie.h (trie_fread, trie_fwrite): Add new API declarations.
	* datrie/trie.c (trie_new_from_file, trie_fread, trie_save,
	trie_fwrite): Refactor open file handling of trie_new_from_file()
	and trie_save() into trie_fread() and trie_fwrite(), and make the
	old functions do file opening/closing as wrappers to them.
	* datrie/libdatrie.def, datrie/libdatrie.map: Add symbol exports.

2010-06-27  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_store_if_absent): Document that it's available
	since 0.2.4.
	* datrie/libdatrie.def, datrie/libdatrie.map: Add symbol exports for
	trie_store_if_absent().
	* ChangeLog: Fix file locations in previous log.

2010-06-24  Theppitak Karoonboonyanan

	Add trie_store_if_absent() interface to avoid race condition in
	multi-thread applications.
	Thanks Dan Searle for the suggestion.

	* datrie/trie.h (trie_store_if_absent): Add new API declaration.
	* datrie/trie.c (trie_store_if_absent, trie_store,
	trie_store_conditionally): Refactor trie_store() into
	trie_store_conditionally() with extra arg is_overwrite, and make the
	two public functions mere wrappers to it.
	* configure.ac: Bump library version according to the added symbol.

2010-03-01  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_save): Do not return before closing file.
	Thanks to Xu Jiandong for the report.

2010-03-01  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.

2010-02-27  Theppitak Karoonboonyanan

	* configure.ac: Bump library revision.
	* NEWS, configure.ac:
	=== Version 0.2.3 ===

2010-02-25  Theppitak Karoonboonyanan

	* datrie/*.h, datrie/*.c: Add my e-mail address to license header.

2010-02-24  Theppitak Karoonboonyanan

	* datrie/*.h, datrie/*.c: Add license header to every source file.

2010-02-23  Theppitak Karoonboonyanan

	Move documentation from *.h to *.c, so libdatrie developers have the
	doc at hand. Users can still read the doxygen-generated doc BTW.

	* datrie/alpha-map.h:
	* datrie/alpha-map.c:
	* datrie/trie.h:
	* datrie/trie.c:
	* datrie/darray.h:
	* datrie/darray.c:
	* datrie/tail.h:
	* datrie/tail.c:
	Move doc comments from *.h to *.c.
	* doc/Doxyfile.in: Add alpha-map.c, trie.c to INPUT.
	* doc/Makefile.am: Add *.c to doxygen.stamp dependency.

2010-02-20  Theppitak Karoonboonyanan

	* datrie/darray.c (da_read), datrie/tail.c (tail_read): Protect
	against possible integer overflow on malicious trie file.

2010-02-20  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.

	Be more robust against corrupted trie files.

	* datrie/alpha-map.c (alpha_map_read_bin):
	* datrie/darray.c (da_read):
	* datrie/tail.c (tail_read):
	  - Check all returns from file_read_*() and clean up properly on
	    failures
	  - Adjust existing clean-up code to the new structure

2009-04-29  Theppitak Karoonboonyanan

	* configure.ac: Bump library revision.
	* NEWS, configure.ac:
	=== Version 0.2.2 ===

2009-04-28  Theppitak Karoonboonyanan

	* configure.ac: Check $datrie_cv_have_version_script against "yes",
	not "1", so symbol versioning works on GNU ld again.

2009-04-28  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_state_copy): Use bitwise copy instead of
	member-wise.

2009-04-15  Theppitak Karoonboonyanan

	* configure.ac: Adjust variable name datrie_cv_have_version_script.

2009-04-14  Theppitak Karoonboonyanan

	Support locale charset query with libcharset, for mac and mingw.
	[Thanks Beamer User for the report.]

	* configure.ac: Check for locale_charset() from libiconv and
	nl_langinfo(CODESET) from libc. If neither is present, ask user to
	install GNU libiconv.
	* tools/trietool.c (init_iconv): Use locale_charset() to query
	locale charset if possible, fall back to nl_langinfo(CODESET)
	otherwise.

2009-04-14  Theppitak Karoonboonyanan

	Support alternative iconv implementations.
	[Thanks cwt for the report.]

	* configure.ac: Check for GNU libiconv or native libiconv if system
	libc doesn't have iconv().
	* tools/Makefile.am: Add ICONV_LIBS to linker options.

2009-04-13  Theppitak Karoonboonyanan

	Fall back to libtool for linkers that do not support
	-version-script.
	[Thanks bact' for the report. Thanks cwt for the test.]

	* configure.ac: Check whether linker supports -version-script.
	* datrie/Makefile.am, +datrie/libdatrie.def: Apply -version-script
	flag only when linker supports it. Otherwise, fall back to the old
	method based on libtool -export-symbols flag.

2009-04-13  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.
	* configure.ac (AC_CONFIG_MACRO_DIR), Makefile.am (ACLOCAL_AMFLAGS):
	Add m4 dir as automake includes, as required by the new libtool.
	[Thanks cwt for the report and test.]

2009-04-05  Theppitak Karoonboonyanan

	* configure.ac: Bump library revision.
	* NEWS, configure.ac:
	=== Version 0.2.1 ===

2009-04-05  Theppitak Karoonboonyanan

	* datrie-0.2.pc.in: Remove blank Requires: line.

2009-04-03  Theppitak Karoonboonyanan

	* datrie/alpha-map.h, datrie/trie.h: Revise documentation.

2009-04-03  Theppitak Karoonboonyanan

	* datrie/fileutils.h, datrie/fileutils.c (make_full_path, file_open,
	file_length): Remove unused code.
	* datrie/triedefs.h (TrieIOMode): Remove unused typedef.

2009-04-01  Theppitak Karoonboonyanan

	* datrie/Makefile.am, datrie/libdatrie.def -> datrie/libdatrie.map:
	Replace libtool symbol exports with symbol versioning, to ease
	upgrading across SONAME.

2009-03-31  Theppitak Karoonboonyanan

	Fix gcc warnings.

	* tools/trietool.c (conv_to_alpha): Cast 'out_p' pointer before
	comparing.
	* tools/trietool.c (command_add_list, command_delete_list): Make
	sure 'saved_conv' is initialized before use.
	* datrie/trie.c (trie_new_from_file): Remove unused 'alt_map' var.

2009-03-31  Theppitak Karoonboonyanan

	* configure.ac: Post-release version suffix added.
	* datrie/trie.h (trie_save): Document parameter 'path'.
	* doc/Doxyfile.in: Update format to doxygen 1.5.8.

2009-03-24  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.2.0 ===

2009-03-24  Theppitak Karoonboonyanan

	* Makefile.am, +README.migration: Add migration documentation.

2008-12-29  Theppitak Karoonboonyanan

	* datrie/alpha-map.c (alpha_map_char_to_trie,
	alpha_map_trie_to_char): Tighten the loop for better readability,
	and eliminate one duplicated check.

2008-12-28  Theppitak Karoonboonyanan

	* datrie/darray.c (da_get_base, da_get_check, da_set_base,
	da_set_check): Revert lower bound checks. It's no use checking too
	much for internal code.

2008-12-28  Theppitak Karoonboonyanan

	* datrie/trie.h, datrie/trie.c, datrie/libdatrie.def
	(trie_state_is_leaf, +trie_state_is_single):
	  - Introduce a new state condition: single. A single state is a
	    state in a single path, with no other branch until its leaf.
	  - Redefine trie_state_is_leaf() as a macro based on it and
	    trie_state_is_terminal().

2008-12-26  Theppitak Karoonboonyanan

	* datrie/trie.h, datrie/trie.c, datrie/libdatrie.def
	(trie_state_copy): Add a new API function for trie state reuse
	support.

2008-12-15  Theppitak Karoonboonyanan

	* tools/trietool.c (conv_to_alpha): Use 'unsigned char' instead of
	'uint8'. Better not tie to datrie internal notations too much.

2008-12-15  Theppitak Karoonboonyanan

	* configure.ac: Post-release version bump.
	* tools/trietool.c (conv_to_alpha): Use uint8 to access data bytes
	instead of char, to fix char signedness bug.

2008-12-15  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.1.99.2 ===

2008-12-15  Theppitak Karoonboonyanan

	* man/Makefile.am, man/trietool.1 -> man/trietool-0.2.1:
	Rename 'trietool' man page to 'trietool-0.2', according to the
	corresponding binary.

2008-12-14  Theppitak Karoonboonyanan

	* tools/Makefile.am: Rename 'trietool' program to 'trietool-0.2', to
	allow co-existence with datrie 0.1.x.

2008-12-14  Theppitak Karoonboonyanan

	* datrie/darray.c (da_read):
	* datrie/tail.c (tail_read):
	Restore file pointer on signature check failure.

2008-12-14  Theppitak Karoonboonyanan

	* man/trietool.1: Document that no more than 255 alphabets are
	allowed.

2008-12-13  Theppitak Karoonboonyanan

	Ensure that ranges in AlphaMap are always sorted and don't overlap.

	* datrie/alpha-map.c (struct _AlphaMap, alpha_map_new):
	  - Remove 'last_range' member
	* datrie/alpha-map.c (alpha_map_add_range):
	  - Check if the new range overlaps existing ranges and merge them
	    as necessary.

2008-12-13  Theppitak Karoonboonyanan

	* configure.ac: Post-release version bump.
	* configure.ac, Makefile.am, datrie.pc.in -> datrie-0.2.pc.in:
	Rename pkg-config file, to allow co-existence with datrie 0.1.x.

2008-12-12  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.1.99.1 ===

2008-12-12  Theppitak Karoonboonyanan

	* man/trietool.1: Update document
	  - Trie is now stored in a single '*.tri' file
	  - The alphabet map is renamed from '*.sbm' to '*.abm'
	  - Mention Unicode, instead of single-byte character domain
	  - Document the options for 'add-list' and 'delete-list' commands
	  - Adjust troff formatting commands

2008-12-12  Theppitak Karoonboonyanan

	* tools/trietool.c (prepare_trie):
	  - Try to read alphabet map from '*.abm' instead of '*.sbm'

2008-12-11  Theppitak Karoonboonyanan

	Allow specifying character encoding for word list file.

	* tools/trietool.c (command_add_list, command_delete_list):
	  - Add '-e|--encoding ENC' option which temporarily overrides
	    locale's codeset for character conversion
	* tools/trietool.c (usage):
	  - Update usage message accordingly

2008-12-09  Theppitak Karoonboonyanan

	Use const where possible + general clean-ups.

	* datrie/alpha-map-private.h, datrie/alpha-map.c
	(alpha_map_write_bin):
	* datrie/alpha-map.h, datrie/alpha-map.c (alpha_map_clone):
	* datrie/alpha-map.c (alpha_map_get_total_ranges):
	  - Accept (const AlphaMap *) arg
	* datrie/alpha-map.c (alpha_map_char_to_trie_str):
	  - Rename 'alphabet_str' to 'trie_str', to be more sensible
	* datrie/darray.h, datrie/darray.c (da_write, da_walk,
	da_enumerate):
	* datrie/darray.c (da_output_symbols, da_get_state_key,
	da_enumerate_recursive):
	  - Accept (const DArray *) arg
	* datrie/darray.h, datrie/darray.c (da_free):
	  - Made void function, instead of int
	* datrie/darray.c (da_new):
	  - Set CHECK[0] = d->num_cells, to be more clear
	* datrie/darray.c (da_write, da_extend_pool):
	  - Update CHECK[0] whenever DArray::num_cells is changed, instead
	    of just setting it before writing; so da_write() can now accept
	    (const DArray *) arg
	* datrie/tail.h, datrie/tail.c (tail_write, tail_get_data,
	tail_walk_str, tail_walk_char):
	  - Accept (const Tail *) arg
	* datrie/tail.h, datrie/tail.c (tail_free):
	  - Made void function, instead of int
	* datrie/trie.c (struct _TrieState, struct _TrieEnumData):
	  - 'trie' member is now const pointer
	* datrie/trie.h, datrie/trie.c (trie_new, trie_is_dirty,
	trie_retrieve, trie_enumerate, trie_root):
	* datrie/trie.c (trie_state_new):
	  - Accept (const Trie *) arg

2008-12-09  Theppitak Karoonboonyanan

	Unicode (UCS-4) character support.

	* datrie/triedefs.h (AlphaChar):
	  - unsigned char -> uint32
	* datrie/alpha-map.h, datrie/alpha-map.c, datrie/libdatrie.def:
	  - Export alpha_char_strlen() utility routine
	* tools/trietool.c (ProgEnv, init_conv, conv_to_alpha,
	conv_from_alpha, close_conv):
	  - Add routines for converting characters between locale (LC_CTYPE)
	    codeset and UCS-4
	* tools/trietool.c (main):
	  - Initialize and close conversion routines
	* tools/trietool.c (command_add, command_add_list, command_delete,
	command_delete_list, command_query, list_enum_func, command_list):
	  - Convert character encodings between I/O and trie

2008-12-09  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_retrieve, trie_store):
	  - Use (AlphaChar *), not (TrieChar *), as key pointer type

2008-12-07  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_retrieve):
	  - Remove unused var 'len'
	  - Compare AlphaChar with integer zero rather than '\0'

2008-12-07  Theppitak Karoonboonyanan

	Adjust APIs for in-memory trie support.

	* datrie/trie.c (struct _Trie):
	  - Remove 'file' member; now trie is detached from file; file is
	    closed after loading, and reopened when saving
	* datrie/trie.h, datrie/trie.c (trie_new):
	  - Add public APIs: trie_new(), for non-file usage
	  - One can still save it to file with trie_save(), BTW
	* datrie/trie.h, datrie/trie.c (trie_open, trie_new_from_file):
	  - Rename trie_open() to trie_new_from_file() and make it accept
	    only one pathname parameter, instead of separated dir and name
	  - Alphabet map is now mandatory, rather than optional
	* datrie/trie.h, datrie/trie.c (trie_close, trie_free):
	  - Rename trie_close() to trie_free() and do not bother with saving
	    any more
	* datrie/trie.h, datrie/trie.c (trie_save):
	  - Accept file path argument and open the file for saving
	* datrie/trie.h, datrie/trie.c (trie_is_dirty):
	  - Add public API: trie_is_dirty() for client to determine whether
	    saving is needed
	* datrie/alpha-map-private.h:
	  - Separate internal APIs from public
	* datrie/alpha-map.h, datrie/alpha-map.c:
	  - Promote alpha_map_new() and alpha_map_add_range() to public;
	    they are now needed by trie_new()
	  - Remove alpha_map_open()
	  - Add public API: alpha_map_clone()
	  - Document public APIs
	* datrie/darray.h, datrie/darray.c (da_new):
	  - Add internal API: da_new() needed by trie_new()
	  - Code migrated from new-file case in da_read()
	* datrie/darray.c (da_read):
	  - No longer handle new file; read failure means an error
	* datrie/tail.h, datrie/tail.c (tail_new):
	  - Add internal API: tail_new() needed by trie_new()
	  - Code migrated from new-file case in tail_read()
	* datrie/tail.c (TAIL_SIGNATURE):
	  - Update TAIL_SIGNATURE to harmonize with other data parts
	* datrie/tail.c (tail_read):
	  - No longer handle new file; read failure means an error
	* tools/trietool.c (prepare_trie, close_trie):
	  - Add helper functions for opening and closing trie
	* tools/trietool.c (main):
	  - Open and close trie with prepare_trie() and close_trie()
	* datrie/Makefile.am:
	  - Install alpha-map.h as public header
	  - Add alpha-map-private.h to source list
	* datrie/libdatrie.def:
	  - Update exported symbols

2008-12-07  Theppitak Karoonboonyanan

	Rename AlphaMap functions to be more logical.

	* datrie/alpha-map.c, datrie/alpha-map.h: Rename functions
	  - alpha_map_char_to_alphabet -> alpha_map_char_to_trie
	  - alpha_map_alphabet_to_char -> alpha_map_trie_to_char
	  - alpha_map_char_to_alphabet_str -> alpha_map_char_to_trie_str
	  - alpha_map_alphabet_to_char_str -> alpha_map_trie_to_char_str
	* datrie/trie.c (trie_retrieve, trie_store, trie_delete,
	trie_da_enum_func, trie_state_walk, trie_state_is_walkable):
	  - Call the AlphaMap functions with the new names

2008-12-07  Theppitak Karoonboonyanan

	Merge SBTrie alphabet mapping feature into Trie.

	* datrie/triedefs.h (AlphaChar):
	  - Add AlphaChar typedef, as well as ALPHA_CHAR_ERROR macro (moved
	    from UniChar type in datrie/alpha-map.h)
	* datrie/alpha-map.c (alpha_char_strlen):
	  - Add string length function for alphabet string
	* datrie/alpha-map.c (struct _AlphaRange):
	  - Use AlphaChar type instead of UniChar for begin, end members
	* datrie/alpha-map.c (alpha_map_get_total_ranges):
	  - Add range count private method
	* datrie/alpha-map.c (alpha_map_add_range):
	  - Add private method for adding range (refactored from
	    alpha_map_open())
	* datrie/alpha-map.c (alpha_map_open):
	  - Call alpha_map_add_range() to add range, instead of doing
	    low-level code
	* datrie/alpha-map.c, datrie/alpha-map.h (alpha_map_read_bin,
	alpha_map_write_bin):
	  - Add methods for binary format I/O
	* datrie/alpha-map.c, datrie/alpha-map.h
	(alpha_map_char_to_alphabet, alpha_map_alphabet_to_char):
	  - Accept and return AlphaChar, instead of UniChar
	* datrie/alpha-map.c, datrie/alpha-map.h
	(alpha_map_char_to_alphabet_str, alpha_map_alphabet_to_char_str):
	  - Add public methods for mapping strings (migrated from
	    sb_map_char_to_alphabet_str and sb_map_alphabet_to_char_str in
	    datrie/sb-trie.c)
	* datrie/trie.c (struct _Trie):
	  - Add alpha_map member
	* datrie/trie.c (trie_open):
	  - Add code to read AlphaMap data block, and prefer text-formatted
	    *.sbm if exists
	  - Defer Trie object allocation to after file opening
	* datrie/trie.c (trie_close):
	  - Free alpha_map member
	* datrie/trie.c (trie_save):
	  - Add code to write AlphaMap data block
	* datrie/trie.c, datrie/trie.h (trie_retrieve, trie_store,
	trie_delete, trie_da_enum_func, trie_state_walk,
	trie_state_is_walkable):
	  - Adjust function prototypes to accept AlphaChar instead of
	    TrieChar
	  - Add mapping between alphabet and trie character code
	* datrie/trie.h (TrieEnumFunc):
	  - Adjust function typedef to accept AlphaChar instead of TrieChar
	* datrie/Makefile.am:
	  - Remove sb-trie.c and sb-trie.h from source/header list
	* datrie/libdatrie.def:
	  - Remove sb_trie symbols
	* tools/trietool.c:
	  - Call trie_* functions instead of sb_trie_*

2008-12-03  Theppitak Karoonboonyanan

	Adjust file format by catenating *.br and *.tl data into a single
	*.tri file.

	* datrie/trie.c (struct _Trie):
	  - Add 'file' and 'is_dirty' members
	* datrie/trie.c (trie_open):
	  - Open the file and call da_read() and tail_read() to load data
	    portions, instead of opening separate files
	* datrie/trie.c (trie_close):
	  - Do the saving and free DArray and Tail data, instead of
	    separately closing them; then finally close the file
	* datrie/trie.c (trie_save):
	  - Write file portions with da_write() and tail_write() instead of
	    saving to separate files
	  - Handle the 'is_dirty' flag
	* datrie/trie.c (trie_store, trie_branch_in_branch, trie_delete):
	  - Set the 'is_dirty' flag
	* datrie/darray.h: Change prototypes for internal APIs whose
	functionalities are to be reduced:
	  - da_open(path, name, mode) -> da_read(FILE*)
	  - da_close(DArray*) -> da_free(DArray*)
	  - da_save(DArray*) -> da_write(DArray*, FILE*)
	* datrie/darray.c (struct _DArray):
	  - Drop 'file' and 'is_dirty' members
	* datrie/darray.c (da_open -> da_read):
	  - Accept (FILE *) argument and drop file opening/closing code
	  - Store number of cells at CHECK[0], so double-array data size can
	    be determined without depending on file size
	  - Do not allocate DArray object until needed
	  - Drop 'is_dirty' handling
	* datrie/darray.c (da_close -> da_free):
	  - Remove file handling; just free memory
	* datrie/darray.c (da_save -> da_write):
	  - Accept (FILE *) argument and use it instead of DArray::file
	  - Ensure CHECK[0] stores the number of cells
	  - Drop 'is_dirty' handling
	* datrie/darray.c (da_set_base, da_set_check):
	  - Drop 'is_dirty' handling
	* datrie/tail.h: Change prototypes for internal APIs whose
	functionalities are to be reduced:
	  - tail_open(path, name, mode) -> tail_read(FILE*)
	  - tail_close(Tail*) -> tail_free(Tail*)
	  - tail_save(Tail*) -> tail_write(Tail*, FILE*)
	* datrie/tail.c (struct _Tail):
	  - Drop 'file' and 'is_dirty' members
	* datrie/tail.c (tail_open -> tail_read):
	  - Accept (FILE *) argument and drop file opening/closing code
	  - Check for new file from read failure, rather than file size
	  - Do not allocate Tail object until needed
	  - Drop 'is_dirty' handling
	* datrie/tail.c (tail_close -> tail_free):
	  - Remove file handling; just free memory
	* datrie/tail.c (tail_save -> tail_write):
	  - Accept (FILE *) argument and use it instead of Tail::file
	  - Drop 'is_dirty' handling
	* datrie/tail.c (tail_set_suffix, tail_set_data, tail_free_block):
	  - Drop 'is_dirty' handling

2008-12-01  Theppitak Karoonboonyanan

	Get rid of the weird TrieIndexInt intermediate type, by checking
	ranges instead.
	(Changes merged from r_0_1_x-branch)

	* datrie/darray.c: Remove typedef for TrieIndexInt.
	* datrie/darray.c (da_check_free_cell, da_extend_pool):
	  - Accept normal TrieIndex arg instead of TrieIndexInt
	* datrie/darray.c (da_insert_branch):
	  - Define 'base', 'next' vars as TrieIndex instead of TrieIndexInt
	  - Check overflow for 'next' before checking if the cell is free
	* datrie/darray.c (da_find_free_base):
	  - Define 's' var as TrieIndex instead of TrieIndexInt
	* datrie/darray.c (da_fit_symbols):
	  - Check overflow for (base + sym) before checking if the cell is
	    free
	* datrie/darray.c (da_get_base, da_get_check, da_set_base,
	da_set_check):
	  - Also check lower bound for index range

2008-11-27  Theppitak Karoonboonyanan

	First changes to break ABI for larger trie index.

	* datrie/triedefs.h: Redefine TrieIndex and TrieData as int32.
	Update TRIE_INDEX_MAX accordingly.
	* datrie/darray.c (da_open, da_save): Redefine DA_SIGNATURE.
	Read/write 32-bit data in headers.
	* datrie/darray.c (da_has_children, da_output_symbols,
	da_relocate_base): Declare characters as TrieIndex instead of
	uint16.
	* datrie/tail.c (tail_open, tail_save): Redefine TAIL_SIGNATURE.
	Read/write 32-bit data in headers. Use 16-bit length for each block.
	* configure.ac: Bump up library version to 1.0.0.

2008-06-21  Theppitak Karoonboonyanan

	* man/trietool.1: Use troff .in command to indent text, fixing
	warning from 'groff --warnings'. Thanks Debian's lintian.

2008-06-21  Theppitak Karoonboonyanan

	* datrie/tail.c (tail_set_suffix):
	* datrie/sb-trie.c (sb_map_char_to_alphabet_str):
	Fix GCC warnings about char signedness.

2008-01-28  Theppitak Karoonboonyanan

	* configure.in: Bump the library revision.
	* NEWS:
	=== Version 0.1.3 ===

2008-01-28  Theppitak Karoonboonyanan

	* man/trietool.1: Add documentation for the SBM file.

2008-01-28  Theppitak Karoonboonyanan

	* README: Fix my name in the reference.

2008-01-10  Theppitak Karoonboonyanan

	* datrie/tail.c (tail_set_suffix): Fix bug for the case in which
	suffix argument and tail's suffix overlap. Bug report and patch by
	shepmaster in http://linux.thai.net/node/102.

2008-01-10  Theppitak Karoonboonyanan

	* datrie/sb-trie.c (sb_trie_root): Return NULL pointer, rather than
	FALSE. Bug reported by shepmaster in http://linux.thai.net/node/101.

2007-10-17  Theppitak Karoonboonyanan

	* datrie/libdatrie.def: List only symbols in plain format, for Mac
	build. Thanks Vee Satayamas for the report.
	* datrie/Makefile.am: Add libdatrie.def as
	libdatrie_la_DEPENDENCIES.

2007-08-28  Theppitak Karoonboonyanan

	* doc/Doxyfile.in: Only generate doc for public API.

2007-08-28  Theppitak Karoonboonyanan

	* doc/Makefile.am, doc/Doxyfile.in: Revert API man pages generation
	and installation. Update Doxyfile format to doxygen 1.5.3.

2007-08-26  Theppitak Karoonboonyanan

	* man/trietool.1: Escape some minus signs. Mark a variable italic.
	Thanks debian's lintian.

2007-08-26  Theppitak Karoonboonyanan

	* configure.ac: Post-release version bump.

2007-08-25  Theppitak Karoonboonyanan

	* configure.ac: Bump lib revision.
	* NEWS:
	=== Version 0.1.2 ===

2007-08-25  Theppitak Karoonboonyanan

	* datrie/Makefile.am, +datrie/trie-private.h (MIN_VAL, MAX_VAL):
	Add utility macros.
	* datrie/darray.c (da_output_symbols): Adjust loop boundary to be
	more overflow-safe.
	* "-------------" (da_has_children, da_relocate_base): Apply the
	same loop pattern to prevent out-of-range accesses.

2007-08-24  Theppitak Karoonboonyanan

	* datrie/darray.c (da_output_symbols): Do not try to test symbols
	beyond trie index range. Fixes segfault for trietool list command.

2007-08-19  Theppitak Karoonboonyanan

	Handle double array index overflow.

	* datrie/triedefs.h (TRIE_INDEX_MAX): Define maximum index value.
	* datrie/darray.c (TrieIndexInt): Define type for immediate values,
	so overflow can be detected.
	* "-------------" (da_extend_pool): Return success/failure status.
	Accept TrieIndexInt argument for overflow detection.
	* "-------------" (da_check_free_cell): False when extending fails.
	Accept TrieIndexInt argument for overflow detection.
	* "-------------" (da_find_free_base): Return error on failure.
	* "-------------" (da_insert_branch): Return error on failure.
	* datrie/trie.c (trie_branch_in_branch, trie_branch_in_tail): Check
	for failure from da_insert_branch() and return the status.
	* datrie/darray.{c,h} (da_prune_upto, da_prune): Add
	da_prune_upto(), for rolling back partial operation in
	trie_branch_in_branch(). Redefine da_prune() in terms of
	da_prune_upto().

2007-08-16  Theppitak Karoonboonyanan

	* datrie/darray.c (da_open, da_find_free_base): Use DA_POOL_BEGIN
	macro instead of hard-coded number. Remove unused
	DA_EXTENDING_STEPS.

2007-05-12  Theppitak Karoonboonyanan

	* datrie/sb-trie.c (sb_trie_close, sb_trie_save, sb_trie_retrieve,
	sb_trie_store, sb_trie_enumerate, sb_trie_root, sb_trie_state_clone,
	sb_trie_state_free, sb_trie_state_rewind, sb_trie_state_walk,
	sb_trie_state_is_walkable, sb_trie_state_is_terminal,
	sb_trie_state_is_leaf, sb_trie_state_get_data): Guard against NULL
	pointers in functions. Thanks to Neutron Soutmun for bug report and
	initial patch.

2007-04-06  Theppitak Karoonboonyanan

	* doc/Makefile.am: Add install-man target. Also install/uninstall
	doxygen-generated man pages.

2007-03-27  Theppitak Karoonboonyanan

	* configure.ac (LT_CURRENT, LT_REVISION, LT_AGE),
	datrie/Makefile.am (libdatrie_la_LDFLAGS), +datrie/libdatrie.def:
	Add library version info. Limit exported symbols with
	-export-symbols flag. Always pass -no-undefined flag.
	* configure.ac: Add Win32 DLL building support.

2006-11-02  Theppitak Karoonboonyanan

	* datrie/fileutils.{c,h} (file_read_int32, file_write_int32): Add
	int32 read/write functions.
	* datrie/fileutils.c (file_read_int16, file_write_int16): Use
	unsigned char buffer instead of and-ing with 0xff, in accordance
	with int32 functions. (Thanks to Vee Satayamas for suggestion).

2006-10-13  Theppitak Karoonboonyanan

	* configure.ac: Post-release version bump.

2006-10-12  Theppitak Karoonboonyanan

	* NEWS:
	=== Version 0.1.1 ===

2006-10-12  Theppitak Karoonboonyanan

	* configure.ac, Makefile.am, +man/Makefile.am, +man/trietool.1: Add
	manpage for trietool (moved from debian/).

2006-10-11  Theppitak Karoonboonyanan

	Fixed compiler warnings.

	* datrie/sb-trie.c (sb_map_alphabet_to_char_str):
	* datrie/tail.c (tail_open, tail_save, tail_set_suffix):
	* datrie/trie.c (trie_da_enum_func):
	Cast pointers to get rid of compiler warnings about char signedness.
	* tools/trietool.c (list_enum_func): Return value on exit.

2006-09-18  Theppitak Karoonboonyanan

	* configure.ac: Post-release version bump.

2006-09-18  Theppitak Karoonboonyanan

	* NEWS, configure.ac:
	=== Version 0.1.0 ===

2006-09-17  Theppitak Karoonboonyanan

	* README: Filled in.

2006-09-02  Theppitak Karoonboonyanan

	* datrie/triedefs.h, datrie/trie.h, datrie/sb-trie.h: Included
	headers using system header forms in installed headers.
	* datrie/Makefile.am (INCLUDES): Added include flag to ensure it
	compiles without prior installation.

2006-09-02  Theppitak Karoonboonyanan

	* datrie/alpha-map.c (alpha_map_char_to_alphabet,
	alpha_map_alphabet_to_char): Made sure terminator is always mapped
	with character 0.

2006-09-02  Theppitak Karoonboonyanan

	* datrie/darray.{h,c} (da_is_walkable), datrie/tail.{h,c}
	(tail_is_walkable_char): Made the tiny functions inline (i.e.
	macros), for tiny performance gain.

2006-09-02  Theppitak Karoonboonyanan

	* datrie/sb-trie.{h,c} (+sb_trie_state_is_walkable): Added
	walkability test wrapper.
	* datrie/sb-trie.h (sb_trie_state_is_terminal,
	sb_trie_state_is_leaf), datrie/trie.h (trie_state_is_terminal,
	trie_state_is_leaf): Fixed typo for "\brief" doxygen tag.

2006-09-02  Theppitak Karoonboonyanan

	* datrie/trie.{h,c} (trie_state_is_terminal,
	+trie_state_is_walkable): Changed trie_state_is_terminal() into a
	generic walkability test, and made itself a specialized macro
	calling the function.

2006-08-31  Theppitak Karoonboonyanan

	* datrie/darray.{h,c} (+da_is_walkable), datrie/tail.{h,c}
	(+tail_is_walkable_char), datrie/trie.c (tail_state_is_terminal):
	Tested walkability by peeking, instead of trying with a cloned
	state.
	* datrie/tail.{h,c} (tail_walk_char): Removed redundant const in
	parameter.

2006-08-29  Theppitak Karoonboonyanan

	* datrie/trie.h, datrie/sb-trie.h: Wrapped extern "C" in public
	headers for compiling with C++ code.

2006-08-22  Theppitak Karoonboonyanan

	* tools/trietool.c (decode_command): Exited with proper return
	values.
	* tools/trietool.c (command_add_list): Removed warning on missing
	data for keys. This would be normal for data-less dictionaries.

2006-08-21  Theppitak Karoonboonyanan

	* datrie/trie.{h,c} (trie_state_rewind), datrie/sb-trie.{h,c}
	(sb_trie_state_rewind): Added API to rewind a trie state to root, so
	users do not need to reallocate to do so.

2006-08-21  Theppitak Karoonboonyanan

	* datrie/alpha-map.c (alpha_map_open, alpha_map_new): Better used a
	dedicated function to initialize the map.

2006-08-21  Theppitak Karoonboonyanan

	* datrie/alpha-map.c (alpha_map_open): Initialized map list before
	using. Also skipped malformed input lines.
	* tools/trietool.c (command_add_list): Removed duplicated return.

2006-08-21  Theppitak Karoonboonyanan

	* configure.ac, Makefile.am, +datrie.pc.in: Added pkgconfig file.

2006-08-20  Theppitak Karoonboonyanan

	* datrie/sb-trie.{h,c} (sb_trie_state_is_terminal),
	datrie/trie.{h,c} (trie_state_is_terminal, trie_state_is_leaf):
	Added API for terminal node check and distinguish it from leaf node.
	(Terminal node can be in either branch or tail, while leaf can only
	be in tail.)

2006-08-20  Theppitak Karoonboonyanan

	* datrie/Makefile.am (pkginclude_HEADERS): Installed sb-trie.h.
	* datrie/sb-trie.h: Fixed file name in doxygen tag.

2006-08-20  Theppitak Karoonboonyanan

	* datrie/Makefile.am, +datrie/alpha-map.{h,c}, +datrie/sb-trie.{h,c}:
	Added alphabet map to map between character set and trie alphabet
	codes. Also added SBTrie wrapper for 8-bit character sets.
	* datrie/triedefs.h (TRIE_CHAR_MAX): Changed to 255, to fit char
	type.
	* datrie/trie.{h,c} (trie_state_walk): Removed unnecessary const in
	character argument.
	* tools/trietool.c: Used SBTrie instead of plain Trie.

2006-08-20  Theppitak Karoonboonyanan

	* datrie/fileutils.c (file_read_int16): Fixed bitwise calculation.
	The second byte should be masked to get rid of possible sign bits
	introduced by type conversion.

2006-08-19  Theppitak Karoonboonyanan

	* datrie/fileutils.c (file_read_int16, file_write_int16): Used shift
	operations to serialize int, eliminating dependency on . Thanks Vee
	Satayamas for the suggestion.

2006-08-18  Theppitak Karoonboonyanan

	* datrie/trie.c (trie_retrieve, trie_store, trie_delete): Always
	walk the null-terminator in tail. Otherwise, comparison with shorter
	key will terminate at separate node.

2006-08-18  Theppitak Karoonboonyanan

	* datrie/darray.c (find_free_base): Extended pool before getting
	exhausted.
	* tools/trietool.c (command_add_list): Let tab and comma be field
	delimiters, rather than white spaces in general.
	* tools/trietool.c (list_enum_func): Do not pad space when printing
	key data.

2006-08-17  Theppitak Karoonboonyanan

	* configure.ac, Makefile.am, +doc/Makefile.am, +doc/Doxyfile.in:
	Generated document using doxygen.

2006-08-17  Theppitak Karoonboonyanan

	* datrie/darray.c (da_prune, da_num_children -> da_has_children):
	Just checked whether a node has at least one child, instead of
	counting children and comparing with zero, as a small optimization.

2006-08-17  Theppitak Karoonboonyanan

	* tools/trietool.c (command_add_list, command_delete_list):
	Implemented.

2006-08-17  Theppitak Karoonboonyanan

	* datrie/trie.{h,c} (trie_enumerate), datrie/darray.{h,c}
	(da_enumerate, da_enumerate_recursive, da_get_state_key): Added key
	enumeration method.
	* tools/trietool.c (command_list): Implemented.

2006-08-17  Theppitak Karoonboonyanan

	* datrie/trie.{h,c} (trie_delete), datrie/darray.{h,c} (da_prune,
	da_num_children), datrie/tail.{h,c} (tail_delete): Added key
	deletion method.
	* datrie/tail.c (tail_save): Guarded against null suffix.
	* tools/trietool.c (command_delete): Implemented.

2006-08-17  Theppitak Karoonboonyanan

	* datrie/darray.c (da_find_free_base): Made sure the free cell for
	first symbol is beyond header cells. Also repeatedly extended the
	pool until a free cell is found, in case free list is restarted.

2006-08-16  Theppitak Karoonboonyanan

	* datrie/fileutils.c (file_open): Created new file only if it does
	not exist.
	* datrie/trie.c (trie_branch_in_branch): Also set data for tail
	block.
	* datrie/trie.c (trie_branch_in_tail): Do not free the const suffix
	block, fixing double free bug.
	* datrie/darray.c (da_insert_branch): Covered the case of negative
	base, for branching from a separate node.
	* tools/trietool.c (command_add): Removed debug message.

2006-08-16  Theppitak Karoonboonyanan

	* configure.ac, Makefile.am, +tools/Makefile.am, +tools/trietool.c:
	Added trietool utility.
	* datrie/darray.c (da_get_free_list): Fixed typo in macro name.
	* datrie/darray.c (da_extend_pool): Updated num_cells immediately
	after realloc(), to let the cell accesses pass boundary checks.
	* datrie/tail.c (tail_get_suffix, tail_set_suffix, tail_alloc_block,
	tail_free_block, tail_get_data, tail_set_data, tail_walk_str,
	tail_walk_char): Started tail blocks indexing from 1 (defined as
	TAIL_START_BLOCKNO macro) rather than 0, because we use signed
	values to distinguish pointers in darray.
	* datrie/tail.{c,h} (tail_get_suffix), datrie/trie.c
	(trie_branch_in_tail): Made tail_get_suffix return const pointer.
	* datrie/darray.c (da_close, da_save), datrie/tail.c (tail_close,
	tail_save): Checked errors and returned appropriate codes.
	* datrie/trie.c (trie_open): Checked errors on file opening and
	resumed appropriately.

2006-08-15  Theppitak Karoonboonyanan

	* datrie/Makefile.am, +datrie/fileutils.c: Added fileutils.c for
	implementation of file utility functions.
	* datrie/fileutils.{c,h}, datrie/darray.c (da_open), datrie/tail.c
	(tail_open): Adjusted file_read_int{8,16} API so error can be
	checked.

2006-08-15  Theppitak Karoonboonyanan

	* datrie/Makefile.am, +datrie/tail.c, datrie/tail.h: Added tail.c
	for trie suffix implementation. Adjusted some API to not require
	size_t.
	* datrie/fileutils.h: Added more functions required by tail.c.
	* datrie/trie.c (trie_branch_in_tail): Added check for null suffix.

2006-08-14  Theppitak Karoonboonyanan

	* datrie/Makefile.am, +datrie/darray.c, +datrie/fileutils.h: Added
	darray.c for double-array structure implementation, and fileutils.h
	declarations for keeping file manipulation functions.
	* datrie/triedefs.h: Added TRIE_CHAR_MAX constant for alphabet
	enumeration. Changed TRIE_INDEX_ERROR to 0, as negative number has
	its own meaning.

2006-08-12  Theppitak Karoonboonyanan

	* === First import the project ===