Al
|
e9277d7339
|
[utils] vector extend method
|
2015-07-25 01:33:45 -04:00 |
|
Al
|
cdb9afddd3
|
[fix] address training data carriage returns
|
2015-07-25 00:35:27 -04:00 |
|
Al
|
9fb1eae877
|
[expansion] Regenerating address data file
|
2015-07-24 16:09:22 -04:00 |
|
Al
|
cff72a0cb3
|
[dictionaries] Adding a few versions of the phrase "centro commercial" in French, Spanish and Italian after a review of addresses in those languages
|
2015-07-24 16:07:34 -04:00 |
|
Al
|
351c7c8c2e
|
[expansion] Add concatenated suffixes to the suffix keyspace of the address dictionary trie and concatenated prefixes and elisions to the prefix keyspace
|
2015-07-24 16:02:47 -04:00 |
|
Al
|
90a91cadd0
|
[search] Modifying trie_search_prefixes to use the new key schema
|
2015-07-24 15:59:49 -04:00 |
|
Al
|
bb7688d8d1
|
[phrases] trie_add_prefix method and a schema for prefix keys, e.g. elisions in French and Italian, separable prefixes like Hinter in German, etc.
|
2015-07-24 15:56:19 -04:00 |
|
Al
|
359cd62e20
|
[numex] Adding a replace_numeric_expressions method (returns NULL if no replacements were made) and fixing lengths where two unrelated numbers are joined by a stopword: in "one and one" the "and" acts as a delimiter, whereas in "one hundred and twenty" the stopword acts as a joiner
|
2015-07-24 15:31:05 -04:00 |
|
Al
|
12959aa483
|
[numex] Re-generating numex data
|
2015-07-24 15:24:03 -04:00 |
|
Al
|
5239c365d0
|
[docs] Adding some documentation for normalize.h options
|
2015-07-24 15:23:25 -04:00 |
|
Al
|
caf714f06f
|
[fix] typo and frivolous key
|
2015-07-24 15:22:57 -04:00 |
|
Al
|
87566bb6a5
|
[numex] Adding validation checks for numex JSON
|
2015-07-24 15:22:07 -04:00 |
|
Al
|
96538469dd
|
[utils] Adding a cstring_array_foreach macro
|
2015-07-23 15:57:12 -04:00 |
|
Al
|
27af28eacf
|
[expansion] Changes to address_expansion struct to allow for multiple dictionaries per record. Only adding unique canonical strings to the string array
|
2015-07-22 20:35:29 -04:00 |
|
Al
|
454be89121
|
[expansion] generated header and data files
|
2015-07-22 20:31:54 -04:00 |
|
Al
|
b27af13f8a
|
[expansion] Adding an array of dictionaries to each (phrase, canonical) pair
|
2015-07-22 20:24:14 -04:00 |
|
Al
|
0a9e92f11f
|
[expansion] Adding both key (for membership tests) and language-prefixed key to address dictionary
|
2015-07-22 17:21:09 -04:00 |
|
Al
|
09004aa5f1
|
[expansion] Constant for the "all" dictionary
|
2015-07-22 17:18:19 -04:00 |
|
Al
|
f61d993157
|
[expansion] removing the self param from address_dictionary methods, adding search_address_dictionaries method which searches a string for phrases in a particular language
|
2015-07-22 03:51:28 -04:00 |
|
Al
|
3da4b5d8c2
|
[numex] New numex generated data file
|
2015-07-22 02:24:16 -04:00 |
|
Al
|
ba8ff2b0c6
|
[expansion] Language prefixed keys
|
2015-07-22 02:16:22 -04:00 |
|
Al
|
157727d249
|
[fix] method name, strlen and fclose
|
2015-07-22 02:15:45 -04:00 |
|
Al
|
64a63fdf51
|
[mv] Moving all repo data files to a resources dir; data is only for runtime files
|
2015-07-21 18:11:36 -04:00 |
|
Al
|
a38b924c5d
|
[fix] add_token_alternatives
|
2015-07-21 17:26:59 -04:00 |
|
Al
|
71be52275d
|
[tokenization] Adding a version of tokenize which keeps whitespace tokens
|
2015-07-21 17:26:20 -04:00 |
|
Al
|
5d21cb1604
|
[expansion] Address dictionary builder
|
2015-07-21 16:46:57 -04:00 |
|
Al
|
6eccde0df8
|
[fix] trie_set_data_at_index
|
2015-07-21 16:46:38 -04:00 |
|
Al
|
c798876b3d
|
[expansion] Address dictionary allocation, I/O, get/set
|
2015-07-21 16:46:15 -04:00 |
|
Al
|
2114b21399
|
[fix] A few anomalies in the Wikipedia/Wiktionary-generated given names
|
2015-07-21 16:07:28 -04:00 |
|
Al
|
3509b203f8
|
[gazetteers] Moving data out of the header file
|
2015-07-21 16:06:49 -04:00 |
|
Al
|
179918917a
|
[fix] header guard and include
|
2015-07-21 15:38:45 -04:00 |
|
Al
|
f99a90d64e
|
[expansion] Generated data file for address expansions
|
2015-07-21 15:38:10 -04:00 |
|
Al
|
68a6d8ee33
|
[fix] return NULL from transliterator_read on failure
|
2015-07-21 00:58:01 -04:00 |
|
Al
|
9360ff2c4b
|
[geodb] geodb_builder using new trie_get/set_data_at_index methods
|
2015-07-20 16:53:48 -04:00 |
|
Al
|
9374745140
|
[fix] var name and placement
|
2015-07-20 16:53:19 -04:00 |
|
Al
|
9f697e0256
|
[transliteration] transliterate now using the new trie_get_data_at_index API
|
2015-07-20 16:47:56 -04:00 |
|
Al
|
7f96726e82
|
[phrases] Adding trie_get_data/trie_set_data + at_index methods
|
2015-07-20 16:39:58 -04:00 |
|
Al
|
b9771921fc
|
[fix] Path joins in geodb_builder use new char_array methods
|
2015-07-20 16:31:43 -04:00 |
|
Al
|
d55d505329
|
[phrases] trie_get_data and trie_set_data interface for simpler dictionary-style trie get/set
|
2015-07-20 16:29:48 -04:00 |
|
Al
|
7f67ed7dc0
|
[fix] less ambiguous variable name in the generated expansions data file
|
2015-07-20 02:58:26 -04:00 |
|
Al
|
96d20f8693
|
[dictionaries] Removing the convention of separating ideograms with space, tokenizer can accomplish the same thing
|
2015-07-20 02:50:41 -04:00 |
|
Al
|
3ff6526392
|
[dictionaries] Azerbaijani dictionaries
|
2015-07-20 02:29:36 -04:00 |
|
Al
|
21b915f090
|
[dictionaries] Bosnian dictionaries and updates to Croatian
|
2015-07-20 02:29:23 -04:00 |
|
Al
|
c9280341b8
|
[languages] Adding Russian dictionaries to Georgia
|
2015-07-20 01:43:40 -04:00 |
|
Al
|
916465f994
|
[dictionaries] Georgian dictionaries
|
2015-07-20 01:42:45 -04:00 |
|
Al
|
b925e7b9a2
|
[dictionaries] Sinhala dictionaries
|
2015-07-20 00:51:59 -04:00 |
|
Al
|
bee9f7d5ec
|
[languages] Audit of road sign languages
|
2015-07-20 00:29:36 -04:00 |
|
Al
|
b415d79b10
|
[fix] space=>tabs
|
2015-07-19 22:26:28 -04:00 |
|
Al
|
c5c9f4db81
|
[languages] Sinhala is the primary language for Sri Lanka, English dictionaries used
|
2015-07-19 22:24:49 -04:00 |
|
Al
|
0b741a353f
|
[dictionaries] Icelandic dictionaries
|
2015-07-19 22:00:49 -04:00 |
|