Al | 0fa1c2389c | [fix] Leak in expanding strings that have a separable prefix and suffix; other than that, ran through 78 million expansions with no discernible memory issues | 2015-12-26 17:19:59 -05:00
Al | 5439f4679f | [fix] Special tokens like emails/urls/phone numbers bypass normalization | 2015-12-20 03:07:36 -05:00
Al | cf2a0efa11 | [fix] Prefixes and suffixes that are the same length as the original token should be handled as regular expansions | 2015-12-19 17:29:26 -05:00
Al | 97906c86a8 | [fix] Strip punctuation in final output in cases where there are no expansions | 2015-12-19 02:10:41 -05:00
Al | 4497c4501e | [fix] do not add a token if prefix/suffix expansions are inseparable and canonical | 2015-12-19 01:36:02 -05:00
Al | b4a8a69226 | [expansion] Fixing extra space on prefix/suffix expansions | 2015-12-18 20:28:59 -05:00
Al | b9bf5c629e | [fix] Moving address_parser_response_destroy into libpostal so caller can free | 2015-12-15 00:52:24 -05:00
Al | 406f9c533d | [api] Separating parser setup/teardown into two separate methods | 2015-12-14 18:15:57 -05:00
Al | dc03c83bb2 | [math] Adding an aligned memory allocator for vectors to help with vectorization/SIMD | 2015-12-14 14:56:38 -05:00
Al | 88836e56e1 | [api] Adding parse_address implementation to the libpostal API. GeoDB and address parser are now required. Stripping punctuation from the normalized output | 2015-12-12 12:47:44 -05:00
Al | 2fcc72ae07 | [fix] multitoken canonical strings | 2015-12-08 15:38:04 -05:00
Al | d35f519629 | [expansion] Fixing case where non-ideographic tokens like # can potentially be concatenated with surrounding tokens and should be normalized with whitespace in between | 2015-12-07 19:18:46 -05:00
Al | 0d8d396108 | [expansion] Fixing cases like ML King where a global (all languages) expansion subsumes the specific language expansion (like English) | 2015-12-07 18:09:25 -05:00
Al | 9bab70909d | [numex] Always adding a version of the string without Roman numeral expansion since many times those tokens can be ambiguous | 2015-12-07 14:29:18 -05:00
Al | 43287db90a | [normalization/phrases] Fixing a bug which occurs with an already-separated elision | 2015-12-02 16:04:39 -05:00
Al | 1a1d74785c | [fix] Compiler warnings for casts/printf | 2015-10-26 18:52:18 -04:00
Al | 3cba2e8df3 | [api] Using default setup methods for submodules in libpostal setup | 2015-09-15 14:01:33 -04:00
Al | b2f690b6f6 | [api] Error logging if modules can't be found | 2015-09-15 13:21:15 -04:00
Al | c29cf5ac9a | [api] Better handling of strings with multiple scripts and strings that use more than one transliterator. Reducing complexity/allocations | 2015-08-10 17:51:41 -04:00
Al | 78a80dd86e | [api] Add separable or inseparable non-canonical string affixes (e.g. foobg. => fooburg, foostrasse => foostraße|foo straße, l'ensemble => l' ensemble, etc.) in expand_address | 2015-08-10 16:19:03 -04:00
Al | 53f54d6454 | [fix] removing comment | 2015-08-08 20:23:49 -04:00
Al | 06d2e916a1 | [fix] includes, matters on GCC/Linux | 2015-08-07 01:51:34 -04:00
Al | f246c2ee95 | [api] Adding address component constants to libpostal.h, returning char ** instead of a cstring_array to simplify API/dependencies | 2015-08-06 17:52:54 -04:00
Al | 753c6efb1d | [api] Initial libpostal API, combining string normalization, transliteration, numex and address dictionaries | 2015-08-02 21:16:18 -06:00