29 Commits

Author SHA1 Message Date
Al
d35f97f6f1 [fix] All file_read_uint64 calls that use stack variables read into a uint64_t not a size_t so as not to smash the stack under a 32-bit arch (issue #18) 2016-02-29 22:36:00 -05:00
Al
8c019998d7 [phrases] trie_num_keys 2016-01-05 22:02:15 -05:00
Al
22668945cb [mv] Moving trie_new_from_hash to a module 2016-01-05 16:43:17 -05:00
Al
3fe2365234 [fix] signed size_t in trie_set_tail 2015-10-27 13:21:26 -04:00
Al
ff8986a287 [phrases] trie_new_from_hash compresses a {str: uint32_t} hashtable into a trie in sorted order 2015-10-04 18:28:21 -04:00
Al
e122824448 [expansion] Adding the ability to search address dictionary phrases with a NULL language, will return phrases in any language 2015-09-15 14:00:26 -04:00
Al
a8f6617294 [phrases] Adding num_keys attribute to trie 2015-08-31 21:41:34 -04:00
Al
51572d6575 [phrases] Changing prefix/suffix chars so both are control characters and neither is the NUL-byte. Modifying transliteration special characters accordingly 2015-08-10 16:01:22 -04:00
Al
9b69d1f67a [fix] Removing C++ checks from all but the main API functions 2015-08-07 17:15:39 -04:00
Al
359a1efb03 [fix] Adding stdint.h include to most of the header files for portability 2015-08-07 02:43:44 -04:00
Al
bb7688d8d1 [phrases] trie_add_prefix method and a schema for prefix keys, e.g. elisions in French and Italian, separable prefixes like Hinter in German, etc. 2015-07-24 15:56:19 -04:00
Al
7f96726e82 [phrases] Adding trie_get_data/trie_set_data + at_index methods 2015-07-20 16:39:58 -04:00
Al
d55d505329 [phrases] trie_get_data and trie_set_data interface for simpler dictionary-style trie get/set 2015-07-20 16:29:48 -04:00
Al
af1a5f6213 [trie] trie_set_data_node method 2015-07-07 03:38:17 -04:00
Al
3d95875a11 [phrases] trie_add_len 2015-06-04 02:41:48 -04:00
Al
fa784677f2 [phrases] trie_add_suffix_at_index method 2015-06-04 02:30:53 -04:00
Al
17f88c3adc [utils] using unsigned ints in file_utils, adding doubles 2015-05-27 16:03:36 -04:00
Al
26ff3292d2 [fix] new script name, prefix result 2015-05-23 21:41:11 -04:00
Al
27171e068d [phrases] constant for NULL prefix results 2015-05-22 09:08:07 -04:00
Al
91ccdf6f7b [phrases] trie_get_prefix_* methods return a struct including tail position 2015-05-21 05:38:21 -04:00
Al
395fbcb8b5 [fix] get_prefix on tries searches tail as well 2015-05-21 05:22:44 -04:00
Al
1fee0a3e35 [phrases] separating get_data_node from tail_match for tries 2015-05-20 13:51:04 -04:00
Al
fdf988cb27 [phrases] adding a public get_data_node method for tries 2015-05-19 18:02:29 -04:00
Al
eecee39904 [fix] giving constant trie node names more specificity 2015-05-18 14:24:39 -04:00
Al
4a67294fbf [phrases] adding get_prefix methods for tries, remove add_nodes_only, fixing a few things and inlining a few functions 2015-05-16 23:19:59 -04:00
Al
8fb9bacfa6 [phrases] New trie_add_nodes_only method for concatenating strings to the trie, plus boolean return values on trie_add_* APIs 2015-04-27 01:01:43 -04:00
Al
8bc77372ef [phrases] exposing trie_add_at_index and trie_get_from_index for more control in the transliteration tries 2015-04-26 22:24:02 -04:00
Al
38ec03bf2b [phrases] default constructor for a trie uses a default alphabet derived from Wikipedia character frequencies for convenience. In practice the alphabet size/ordering matters only for very small tries or specialized alphabets. Mostly just use trie_new() 2015-03-05 13:40:52 -05:00
Al
585baab0a5 [phrases] optimized implementation of a double-array trie for storing millions of phrases compactly while being extremely quick to access. Supports utf-8, stores phrase tails in a contiguous character array separated by NUL bytes and stores offsets only so the chars at that offset can be treated as a regular C string and fed to things like strncmp. Also stores suffixes (primarily for languages like German, Dutch, etc. that concatenate street names e.g. Foobarstraße, Fobarweg) by prefixing the reversed string with the NUL byte and storing it backward in the trie, so can search forward and backward with the same data structure. 2015-03-03 13:18:18 -05:00
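
The tail storage and suffix scheme described in the first commit (585baab0a5) can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the names tail_store, tail_store_add, tail_store_match, make_suffix_key and SUFFIX_MARKER are all hypothetical. The commit message says the suffix marker was originally the NUL byte, and a later commit (51572d6575) says it became a control character; the value used here is an arbitrary stand-in. The sketch reverses byte by byte, which only works as a trie key if queries are reversed the same way.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker byte for suffix keys (assumption: any control
   character other than NUL; the real value is not given in the log). */
#define SUFFIX_MARKER '\x03'

/* Hypothetical tail store: every tail lives in one contiguous buffer,
   each terminated by a NUL byte. Trie nodes would keep only an offset. */
typedef struct {
    char *data;  /* contiguous tail bytes */
    size_t len;  /* bytes in use */
    size_t cap;  /* bytes allocated */
} tail_store;

void tail_store_init(tail_store *t) {
    t->data = NULL;
    t->len = 0;
    t->cap = 0;
}

void tail_store_free(tail_store *t) {
    free(t->data);
    tail_store_init(t);
}

/* Append a tail plus its NUL terminator; report where it starts. */
bool tail_store_add(tail_store *t, const char *tail, uint32_t *offset) {
    size_t n = strlen(tail) + 1;
    if (t->len + n > UINT32_MAX) return false;
    if (t->len + n > t->cap) {
        size_t cap = t->cap ? t->cap : 64;
        while (cap < t->len + n) cap *= 2;
        char *p = realloc(t->data, cap);
        if (p == NULL) return false;
        t->data = p;
        t->cap = cap;
    }
    memcpy(t->data + t->len, tail, n);
    *offset = (uint32_t)t->len;
    t->len += n;
    return true;
}

/* Because each tail is NUL-terminated in place, the bytes at an offset
   behave like an ordinary C string and can be handed to strncmp. */
bool tail_store_match(const tail_store *t, uint32_t offset, const char *rest) {
    const char *tail = t->data + offset;
    size_t n = strlen(tail);
    /* n + 1 includes the terminator, so this is an exact match. */
    return strncmp(tail, rest, n + 1) == 0;
}

/* Build a suffix key: marker byte followed by the reversed bytes of s.
   Caller frees the result. Returns NULL on allocation failure. */
char *make_suffix_key(const char *s) {
    size_t len = strlen(s);
    char *key = malloc(len + 2);
    if (key == NULL) return NULL;
    key[0] = SUFFIX_MARKER;
    for (size_t i = 0; i < len; i++) {
        key[1 + i] = s[len - 1 - i];
    }
    key[len + 1] = '\0';
    return key;
}
```

With this layout, adding "strasse" and then "weg" yields offsets 0 and 8, and a suffix such as "weg" is stored under the marker byte followed by "gew", so the same forward trie walk can serve both prefix and suffix lookups.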