Commit Graph

9 Commits

Author SHA1 Message Date
Al fdf988cb27 [phrases] adding a public get_data_node method for tries 2015-05-19 18:02:29 -04:00
Al eecee39904 [fix] giving constant trie node names more specificity 2015-05-18 14:24:39 -04:00
Al 4a67294fbf [phrases] adding get_prefix methods for tries, remove add_nodes_only, fixing a few things and inlining a few functions 2015-05-16 23:19:59 -04:00
Al b2ba629f95 [fix] trie_get methods just return node index rather than data value 2015-04-27 01:28:05 -04:00
Al 8fb9bacfa6 [phrases] New trie_add_nodes_only method for concatenating strings to the trie, plus boolean return values on trie_add_* APIs 2015-04-27 01:01:43 -04:00
Al 8bc77372ef [phrases] exposing trie_add_at_index and trie_get_from_index for more control in the transliteration tries 2015-04-26 22:24:02 -04:00
Al 38ec03bf2b [phrases] default constructor for a trie uses a default alphabet derived from Wikipedia character frequencies for convenience. In practice the alphabet size/ordering matters only for very small tries or specialized alphabets. Mostly just use trie_new() 2015-03-05 13:40:52 -05:00
Al 10777ce973 [fix] debug logging only in trie.c 2015-03-03 13:28:43 -05:00
Al 585baab0a5 [phrases] optimized implementation of a double-array trie for storing millions of phrases compactly while being extremely quick to access. Supports utf-8, stores phrase tails in a contiguous character array separated by NUL bytes and stores offsets only so the chars at that offset can be treated as a regular C string and fed to things like strncmp. Also stores suffixes (primarily for languages like German, Dutch, etc. that concatenate street names e.g. Foobarstraße, Fobarweg) by prefixing the reversed string with the NUL byte and storing it backward in the trie, so can search forward and backward with the same data structure. 2015-03-03 13:18:18 -05:00
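
The tail storage and suffix keys described in commit 585baab0a5 can be illustrated with a minimal sketch. This is not the code from that commit: the names tail_store_t, tail_store_add, tail_store_match and make_suffix_key are hypothetical, the double-array part of the trie is left out entirely, and the suffix key is reversed byte by byte, which is only safe for ASCII input (a UTF-8 aware version would presumably reverse by code point). It shows only the two ideas the message names: tails kept in one contiguous NUL-separated buffer and addressed by offset, and suffixes turned into keys that start with a NUL byte followed by the reversed string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tail store: every tail lives in one contiguous buffer and is
 * terminated by a NUL byte, so a caller only needs to keep the start offset. */
typedef struct {
    char *data;
    size_t len; /* bytes used */
    size_t cap; /* bytes allocated */
} tail_store_t;

static bool tail_store_init(tail_store_t *ts) {
    ts->len = 0;
    ts->cap = 64;
    ts->data = malloc(ts->cap);
    return ts->data != NULL;
}

static void tail_store_free(tail_store_t *ts) {
    free(ts->data);
    ts->data = NULL;
    ts->len = ts->cap = 0;
}

/* Append a tail (including its NUL terminator) and report where it starts. */
static bool tail_store_add(tail_store_t *ts, const char *tail, uint32_t *offset) {
    size_t n = strlen(tail) + 1;
    if (ts->len + n > ts->cap) {
        size_t new_cap = ts->cap;
        while (new_cap < ts->len + n) new_cap *= 2;
        char *p = realloc(ts->data, new_cap);
        if (p == NULL) return false;
        ts->data = p;
        ts->cap = new_cap;
    }
    memcpy(ts->data + ts->len, tail, n);
    *offset = (uint32_t)ts->len;
    ts->len += n;
    return true;
}

/* Because each tail is NUL-terminated, the bytes at an offset behave like an
 * ordinary C string and can be handed to strlen/strncmp directly. */
static bool tail_store_match(const tail_store_t *ts, uint32_t offset, const char *rest) {
    assert(offset < ts->len);
    const char *tail = ts->data + offset;
    size_t n = strlen(tail);
    return strlen(rest) == n && strncmp(tail, rest, n) == 0;
}

/* Build a suffix key: a leading NUL byte followed by the suffix reversed.
 * Assumption: byte-wise reversal, so ASCII only. The key contains an embedded
 * NUL, so its length is returned separately. Caller frees the result. */
static char *make_suffix_key(const char *suffix, size_t *key_len) {
    size_t n = strlen(suffix);
    char *key = malloc(n + 1);
    if (key == NULL) return NULL;
    key[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        key[1 + i] = suffix[n - 1 - i];
    }
    *key_len = n + 1;
    return key;
}
```

In this sketch a trie node would hold only the uint32_t offset returned by tail_store_add. The leading NUL byte in a suffix key keeps reversed suffixes in a separate branch from forward phrases, which is one way the same structure could serve both forward and backward searches.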