Al B
|
fcbb13cad0
|
Merge pull request #391 from edding/fix_memory_leak
fix memory leak in setup when datadir is invalid or setup failed
|
2025-02-08 12:03:59 -05:00 |
|
Al
|
0540d7c7e3
|
[api/compat] PR #465 redefined the language classifier response struct in the API and was casting between incompatible pointer types. Using the exported struct throughout.
|
2025-01-30 01:45:18 -05:00 |
|
Al
|
26124ee72f
|
[near_dupes] exposing name_word_hashes directly in the API
|
2022-03-25 14:04:26 -04:00 |
|
Luiz Otavio V. B. Oliveira
|
0327150d2b
|
Exposes language classification functions
|
2019-06-14 14:31:12 +02:00 |
|
Edward Ding
|
363e83304a
|
fix memory leak in setup when datadir is invalid or setup failed
|
2018-10-26 16:07:57 -07:00 |
|
Al
|
c5bb9d8daa
|
[normalize/api] exposing normalize_string_languages and normalized_tokens_languages to the API for pre-normalizing numeric expressions at tokenization time
|
2018-02-22 18:47:36 -05:00 |
|
Al
|
86d5eca521
|
[api] checking for NULL responses in the cstring_array methods before converting them to char arrays
|
2017-12-30 02:31:25 -05:00 |
|
Al
|
6dff154a99
|
[api] adding APIs for getting default options and using a consistent naming convention
|
2017-12-29 17:48:54 -05:00 |
|
Al
|
8495cda1eb
|
[api] adding pairwise-dupe functions/structs to the public header
|
2017-12-29 13:48:54 -05:00 |
|
Al
|
1f1412c120
|
[api] adding libpostal_place_languages method to public API for classifying languages consistently from components (may need to make several calls using the same languages and don't necessarily want the language classifier to be run on house numbers when we already know the languages from e.g. the street name). This provides a simple window into the language classifier focused on the entire address/record
|
2017-12-29 03:32:41 -05:00 |
|
Al
|
f3a626463a
|
[api] adding API functions for near dupe hashes to the public header
|
2017-12-24 12:43:28 -05:00 |
|
Al
|
8b2a4d1ecf
|
[api] adding libpostal_expand_address_root to the public API. This will attempt to delete tokens that can be safely ignored. It's deterministic and rule-based, but is informed by libpostal's fairly comprehensive dictionaries, and should work relatively well across languages for deduping purposes.
|
2017-12-17 17:46:26 -05:00 |
|
Al
|
8968a6c966
|
[expand] moving expand to its own module so the internal methods can be exposed, calling from libpostal.c
|
2017-12-08 16:26:13 -05:00 |
|
Al
|
ec4d683d1b
|
Merge branch 'master' into lieu_api
|
2017-11-29 15:49:52 -05:00 |
|
AeroXuk
|
9090811826
|
Modified the libpostal API to add an extra function libpostal_parser_print_features to toggle debugging info. Updated address_parser app to use the new function.
|
2017-11-27 19:20:37 +00:00 |
|
AeroXuk
|
26ac9ab5c2
|
Removing EXPORT statements from all source files and most header files, leaving only the exports for the main API in libpostal.h. Modified Makefiles so that all the test apps build without having extra functions exported from libpostal.
|
2017-11-25 04:35:28 +00:00 |
|
AeroXuk
|
2d3b420d35
|
Merging changes from AeroXuk/libpostal_windows.
|
2017-11-19 12:44:38 +00:00 |
|
Al
|
053dca82ba
|
[expand] adding a normalization for a single non-acronym internal period where there's an expansion at the prefix/suffix (for #218 and https://github.com/openvenues/libpostal/issues/216#issuecomment-306617824). Helps in cases like "St.Michaels" or "Jln.Utara" without needing to specify concatenated prefix phrases for every possibility
|
2017-10-28 02:38:15 -04:00 |
|
Al
|
5c927e780f
|
[expand] adding ability to expand Roman numerals with ordinal suffixes like IXe in French
|
2017-10-20 02:51:26 -04:00 |
|
Al
|
448ca6a61a
|
[merge] merging commit from v1.1
|
2017-10-12 01:41:04 -04:00 |
|
Al
|
0c6af2b74c
|
[fix] normalize canonical strings (after expanding abbreviations, concatenated suffixes, etc.) with Latin-ASCII, Latin-ASCII-Simple or simple UTF-8 normalization depending on the options
|
2017-08-03 14:08:05 -06:00 |
|
Iestyn Pryce
|
ecd07b18c1
|
Fix log_* formats which expect size_t but receive uint32_t.
|
2017-05-19 22:31:56 +01:00 |
|
Iestyn Pryce
|
f34fc56fec
|
Fix log_debug formats which expect unsigned int but receive size_t
|
2017-05-14 17:48:26 +01:00 |
|
Al
|
a7e67c4967
|
[fix] adding maximum number of permutations for libpostal_expand_address to consider (n=100 for both the inner and outer loop, so max strings=10000), fixes #200
|
2017-05-13 14:11:08 -04:00 |
|
Al
|
5780a08b48
|
[fix] check that possible ordinal suffix also has non-zero digit length before normalizing
|
2017-05-12 15:48:20 -04:00 |
|
Al
|
f3adde746e
|
[numex] adding ability to handle the degree symbol in numex parsing since it's technically a separate token
|
2017-04-19 20:18:21 -04:00 |
|
Al
|
cddc368533
|
[numex] adding one form of normalization which strips ordinal suffixes so {96th, Ninety-sixth} => 96. This is an additional form of normalization, so there's still one form where the suffixes are kept. One case that's still not handled is something like "IXe Arrondissement"
|
2017-04-18 21:39:54 -04:00 |
|
Al
|
8742574257
|
[parser] storing address_parser_context on the parser struct itself so it doesn't have to be allocated every time
|
2017-04-04 20:40:55 -04:00 |
|
Al
|
6d4c7984df
|
[api] doing this now since we're bumping a major version. Using a libpostal prefix for all public header functions and definitions
|
2017-03-31 03:35:51 -04:00 |
|
Al
|
a3e51db32d
|
[api] include some of the new components in default address_components for the libpostal expansion API
|
2017-02-15 22:29:22 -05:00 |
|
Al
|
9a93e95938
|
[api] removing geodb from setup functions
|
2017-02-10 01:02:52 -05:00 |
|
Al
|
b320aed9ac
|
[merge] merging master
|
2017-01-13 19:58:49 -05:00 |
|
Al
|
a3506131fe
|
[build] adding libpostal_setup_datadir, libpostal_setup_parser_datadir, libpostal_setup_language_classifier_datadir functions for configuring the datadir at runtime
|
2017-01-09 16:11:26 -05:00 |
|
Al
|
58b063b632
|
[strings] making string_tree_iterator_done more meaningful (returns true if the iterator has no paths left to traverse)
|
2016-12-31 00:54:36 -05:00 |
|
Al
|
091167ed3c
|
[api] remove geodb from libpostal.c
|
2016-12-29 02:35:43 -05:00 |
|
Al
|
eea11beb6a
|
[expansion] using easier-to-access data structure for address dictionaries
|
2016-11-27 00:56:48 -08:00 |
|
Al
|
2e8888e331
|
[fix] warnings/size_t in libpostal.c
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
83381e9d8a
|
[expand] Adding exception for a few types of special punctuation (ampersand, plus, pound sign) which should be left in the original string and separated by whitespace. Closes #84. Closes #85
|
2016-07-17 15:02:47 -04:00 |
|
Al
|
ce78064988
|
[fix] NULL checks
|
2016-07-15 13:23:23 -04:00 |
|
Al
|
58a5dbe7e0
|
[logging] Logging the value of LIBPOSTAL_DATA_DIR when a setup error occurs
|
2016-07-01 14:51:04 -04:00 |
|
Al
|
9819ebf949
|
[fix] always include expansions in the ambiguous expansion dictionary, no matter which component
|
2016-04-29 13:26:13 -04:00 |
|
Al
|
14e8f50cf1
|
[fix] Expansions when passing in the address_components= option. Was only limiting results at the phrase level; should work at the individual expansion level
|
2016-03-29 16:46:29 -04:00 |
|
Al
|
37c09d1ed9
|
[api] Adding function to free expansions from expand_address
|
2016-02-16 10:56:45 -05:00 |
|
Al
|
98165e89ad
|
[api] Using bools instead of bit fields in the public API
|
2016-02-15 18:33:39 -05:00 |
|
Al
|
cf2a79bef1
|
[api] Default options accessible through getters, not static structs
|
2016-02-15 17:34:00 -05:00 |
|
Al
|
84d5ba18f0
|
[api] Fixing multi-language expansions with overlapping expansions, whitespace, utf8 normalization of canonical strings
|
2016-02-08 02:50:34 -05:00 |
|
Al
|
9ac0379a65
|
[phrases] Case where trie search finds a match, makes progress beyond the next token but has to fall back. Adding trie search test case
|
2016-02-08 01:07:56 -05:00 |
|
Al
|
085bfd6ada
|
[fix] static methods for libpostal.c
|
2016-01-30 02:20:59 -05:00 |
|
Al
|
42d169feee
|
[api] Libpostal expand API will now detect language automatically using a high accuracy language classifier trained on OSM streets/addresses/toponyms. Hooray batch geocoding!
|
2016-01-27 03:23:51 -05:00 |
|
Al
|
780966a59b
|
[api] More spacing fixes and using language information in normalize string
|
2015-12-31 03:52:14 -05:00 |
|