Al
|
27af28eacf
|
[expansion] Changes to address_expansion struct to allow for multiple dictionaries per record. Only adding unique canonical strings to the string array
|
2015-07-22 20:35:29 -04:00 |
|
Al
|
454be89121
|
[expansion] generated header and data files
|
2015-07-22 20:31:54 -04:00 |
|
Al
|
0a9e92f11f
|
[expansion] Adding both key (for membership tests) and language-prefixed key to address dictionary
|
2015-07-22 17:21:09 -04:00 |
|
Al
|
09004aa5f1
|
[expansion] Constant for the "all" dictionary
|
2015-07-22 17:18:19 -04:00 |
|
Al
|
f61d993157
|
[expansion] removing the self param from address_dictionary methods, adding search_address_dictionaries method which searches a string for phrases in a particular language
|
2015-07-22 03:51:28 -04:00 |
|
Al
|
3da4b5d8c2
|
[numex] New numex generated data file
|
2015-07-22 02:24:16 -04:00 |
|
Al
|
ba8ff2b0c6
|
[expansion] Language prefixed keys
|
2015-07-22 02:16:22 -04:00 |
|
Al
|
157727d249
|
[fix] method name, strlen and fclose
|
2015-07-22 02:15:45 -04:00 |
|
Al
|
a38b924c5d
|
[fix] add_token_alternatives
|
2015-07-21 17:26:59 -04:00 |
|
Al
|
71be52275d
|
[tokenization] Adding a version of tokenize which keeps whitespace tokens
|
2015-07-21 17:26:20 -04:00 |
|
Al
|
5d21cb1604
|
[expansion] Address dictionary builder
|
2015-07-21 16:46:57 -04:00 |
|
Al
|
6eccde0df8
|
[fix] trie_set_data_at_index
|
2015-07-21 16:46:38 -04:00 |
|
Al
|
c798876b3d
|
[expansion] Address dictionary allocation, I/O, get/set
|
2015-07-21 16:46:15 -04:00 |
|
Al
|
3509b203f8
|
[gazetteers] Moving data out of the header file
|
2015-07-21 16:06:49 -04:00 |
|
Al
|
179918917a
|
[fix] header guard and include
|
2015-07-21 15:38:45 -04:00 |
|
Al
|
f99a90d64e
|
[expansion] Generated data file for address expansions
|
2015-07-21 15:38:10 -04:00 |
|
Al
|
68a6d8ee33
|
[fix] return NULL from transliterator_read on failure
|
2015-07-21 00:58:01 -04:00 |
|
Al
|
9360ff2c4b
|
[geodb] geodb_builder using new trie_get/set_data_at_index methods
|
2015-07-20 16:53:48 -04:00 |
|
Al
|
9374745140
|
[fix] var name and placement
|
2015-07-20 16:53:19 -04:00 |
|
Al
|
9f697e0256
|
[transliteration] transliterate now using the new trie_get_data_at_index API
|
2015-07-20 16:47:56 -04:00 |
|
Al
|
7f96726e82
|
[phrases] Adding trie_get_data/trie_set_data + at_index methods
|
2015-07-20 16:39:58 -04:00 |
|
Al
|
b9771921fc
|
[fix] Path joins in geodb_builder use new char_array methods
|
2015-07-20 16:31:43 -04:00 |
|
Al
|
d55d505329
|
[phrases] trie_get_data and trie_set_data interface for simpler dictionary-style trie get/set
|
2015-07-20 16:29:48 -04:00 |
|
Al
|
1d7247d7e1
|
[polygons] Adding Belgium regional languages
|
2015-07-17 00:53:25 -04:00 |
|
Al
|
5f2be3022b
|
[expansion] dictionary_type_t enum instead of uint64_t
|
2015-07-16 03:49:37 -04:00 |
|
Al
|
f713c53993
|
[utils] Adding an option to char_array_add_joined to strip separators for path manipulation
|
2015-07-16 03:49:00 -04:00 |
|
Al
|
f181c04e7a
|
[expansion] expansion rule structs and Python script to generate rules from the dictionaries tree. Note that a canonical_index of -1 indicates that a given phrase is itself the canonical form (saves space)
|
2015-07-16 02:49:53 -04:00 |
|
Al
|
a8b2fb5b90
|
[tokenization] Regenerating scanner file
|
2015-07-14 18:16:24 -04:00 |
|
Al
|
43293d0ae3
|
[tokenization] Fixing tokenization where mid-number characters appear in the middle of a word+numeric sequence, e.g. Zigor,2 should be 3 separate tokens. Sequences like 35,37,39 are still treated as a single token for the moment.
|
2015-07-14 18:15:58 -04:00 |
|
Al
|
a9967ec9bd
|
[numex] Regenerating numex file
|
2015-07-13 01:16:39 -04:00 |
|
Al
|
86fe289320
|
[numex] Re-generated numex data file
|
2015-07-13 00:56:48 -04:00 |
|
Al
|
fbef0a15fe
|
[geodb] Adding sparkey dependency
|
2015-07-09 15:26:11 -04:00 |
|
Al
|
4f1b4756d0
|
[geodb] Adding builder program (requires 11GB disk space and ~4GB RAM to build, but only ~300MB RAM to use after building)
|
2015-07-09 15:25:29 -04:00 |
|
Al
|
8889a5c0c3
|
[geodb] GeoDB memory allocation and I/O
|
2015-07-09 15:01:06 -04:00 |
|
Al
|
2d5641892a
|
[config] lower Bloom filter error rate
|
2015-07-09 14:59:23 -04:00 |
|
Al
|
20c6436e6d
|
[geodisambig] Return success if admin1/admin2 IDs are 0
|
2015-07-09 04:19:49 -04:00 |
|
Al
|
20303ad94f
|
[geohash] Adding bounds checks from python-geohash
|
2015-07-09 04:13:53 -04:00 |
|
Al
|
722904ce59
|
[fix] geoname_clear needs to clear feature code as well
|
2015-07-09 03:08:52 -04:00 |
|
Al
|
14500f8c7e
|
[config] Adding GeoDB default bloom filter size and error rate
|
2015-07-08 20:50:52 -04:00 |
|
Al
|
0e2a0aa56d
|
[geodisambig] adding new methods to header
|
2015-07-08 19:05:08 -04:00 |
|
Al
|
ce54a2146b
|
[fix] geo disambiguation features
|
2015-07-08 19:03:39 -04:00 |
|
Al
|
fc32a66d95
|
[fix] geonames I/O
|
2015-07-08 19:02:45 -04:00 |
|
Al
|
8c02073b54
|
[geonames] Adding country_geonames_id to both geoname and postal code structs
|
2015-07-08 18:44:21 -04:00 |
|
Al
|
9af0b0ab65
|
[geodisambig] adding a few more features to geonames disambiguation
|
2015-07-08 18:43:28 -04:00 |
|
Al
|
742079cc6a
|
[geonames] Re-generating postal/geonames fields headers
|
2015-07-08 17:02:59 -04:00 |
|
Al
|
b76f9e47d1
|
[utils] max string size for int8_t and int16_t
|
2015-07-08 16:46:12 -04:00 |
|
Al
|
c0a5607f5e
|
[fix] Adding NUM_BOUNDARY_TYPES for enumeration purposes
|
2015-07-08 16:43:57 -04:00 |
|
Al
|
24835fd088
|
[geonames] namespace specificity
|
2015-07-07 03:38:48 -04:00 |
|
Al
|
af1a5f6213
|
[trie] trie_set_data_node method
|
2015-07-07 03:38:17 -04:00 |
|
Al
|
53908ac604
|
[config] Adding geonames dir as a separate #define
|
2015-07-06 17:09:02 -04:00 |
|