Al
|
f161f68d53
|
[build] Changes to Makefile.am to build on Debian/Ubuntu, fixing downloading of the data tarball for Mac and Linux
|
2015-08-07 17:27:34 -04:00 |
|
Al
|
9b69d1f67a
|
[fix] Removing C++ checks from all but the main API functions
|
2015-08-07 17:15:39 -04:00 |
|
Al
|
359a1efb03
|
[fix] Adding stdint.h include to most of the header files for portability
|
2015-08-07 02:43:44 -04:00 |
|
Al
|
0738a57caa
|
[fix] restoring ctype.h include
|
2015-08-07 01:52:08 -04:00 |
|
Al
|
06d2e916a1
|
[fix] includes, matters on GCC/Linux
|
2015-08-07 01:51:34 -04:00 |
|
Al
|
ae9825b9f9
|
[build] Fixing data dir download in Automake file
|
2015-08-07 01:51:06 -04:00 |
|
Al
|
d7ebcd046e
|
[fix] includes
|
2015-08-07 01:00:26 -04:00 |
|
Al
|
f246c2ee95
|
[api] Adding address component constants to libpostal.h, returning char ** instead of a cstring_array to simplify API/dependencies
|
2015-08-06 17:52:54 -04:00 |
|
Al
|
61d586fa1d
|
[config] config.h=>libpostal_config.h so as not to conflict with autoconf
|
2015-08-06 17:50:55 -04:00 |
|
Al
|
2bedb695a2
|
[build] adding Automake file in src, including rule to download data dir tarball
|
2015-08-06 17:48:37 -04:00 |
|
Al
|
4b9f11eca5
|
[build] Main Automake file and modified version of Sparkey's Automake file
|
2015-08-06 02:14:33 -04:00 |
|
Al
|
1d39916aaa
|
[fix] Fixing warnings in unicode script data
|
2015-08-02 21:30:54 -06:00 |
|
Al
|
770ce4256f
|
[expansion] Re-generating address expansion data file
|
2015-08-02 21:30:19 -06:00 |
|
Al
|
753c6efb1d
|
[api] Initial libpostal API, combining string normalization, transliteration, numex and address dictionaries
|
2015-08-02 21:16:18 -06:00 |
|
Al
|
b27030e39f
|
[fix] tokenized trie search was skipping tokens in some cases
|
2015-08-02 14:36:21 -06:00 |
|
Al
|
3178eda501
|
[utils] string_contains_hyphen method
|
2015-08-02 14:35:18 -06:00 |
|
Al
|
46141a6c36
|
[normalize] Adding an option when normalizing tokens to split tokens of the form [\w]+[\.\-]?[\d]+ for cases like I35, CR123, R-66, RN.7, etc. where the alpha component is an expansion
|
2015-08-02 14:34:36 -06:00 |
|
Al
|
f10dd49c58
|
[expansion] NULL_CANONICAL_INDEX constant
|
2015-08-01 23:59:16 -06:00 |
|
Al
|
fe4789a665
|
[fix] compiler warnings
|
2015-07-28 19:14:00 -04:00 |
|
Al
|
551904d202
|
[normalize] cstring_array instead of string_tree for token-based normalization
|
2015-07-28 19:09:50 -04:00 |
|
Al
|
90d4da9e72
|
[geodb] Adding an is_canonical bit field to geodb trie values
|
2015-07-28 19:08:24 -04:00 |
|
Al
|
9bc902f575
|
[numex] LATIN_LANGUAGE_CODE constant for Roman numeral normalization
|
2015-07-28 18:12:12 -04:00 |
|
Al
|
df1410da8c
|
[numex] Fixing numex parsing for lone stopwords and certain prefix matches that were mistakenly converted, e.g. settembre => 7mbre
|
2015-07-28 18:11:23 -04:00 |
|
Al
|
a16f0dabcb
|
[numex] Fixing hyphen-initial numeric phrases that end the string
|
2015-07-28 03:28:44 -04:00 |
|
Al
|
0f5b69c06b
|
[fix] transition to SEARCH_STATE_NO_MATCH in trie_search_tokens_from_index on a return to the start node
|
2015-07-27 16:35:27 -04:00 |
|
Al
|
243f327928
|
[fix] NULL check
|
2015-07-27 16:32:01 -04:00 |
|
Al
|
7aee159c0c
|
[utils] string_tree_num_tokens
|
2015-07-27 12:36:34 -04:00 |
|
Al
|
b812d90c59
|
[fix] specifying numex dir with cross-platform PATH_SEPARATOR
|
2015-07-27 12:36:06 -04:00 |
|
Al
|
7ff9a6054d
|
[geodb] trim strings in geodb builder
|
2015-07-27 02:37:20 -04:00 |
|
Al
|
053b987d58
|
[normalize] adding an option for string trimming in normalize
|
2015-07-27 01:59:14 -04:00 |
|
Al
|
b94526a27b
|
[utils] Making string_trim handle all kinds of UTF-8 whitespace/separators
|
2015-07-27 01:55:46 -04:00 |
|
Al
|
eab4c554d6
|
[numex] Regenerating numex data file
|
2015-07-27 01:53:13 -04:00 |
|
Al
|
d2539f5b57
|
[numex] Fixing case of hyphen/space-initial phrases in numex, as well as whole-token-only languages with ordinals
|
2015-07-27 01:44:33 -04:00 |
|
Al
|
8ff4ace63b
|
[phrases] Allowing trie_search to process tokenized input with or without whitespace, and to handle ideographic characters correctly
|
2015-07-26 23:41:57 -04:00 |
|
Al
|
38b10b9dd0
|
[fix] Clearing paths before reuse in geodb_builder
|
2015-07-26 23:36:34 -04:00 |
|
Al
|
93042761ac
|
[fix] warnings in string_utils.c
|
2015-07-26 23:36:03 -04:00 |
|
Al
|
50ee95ff7d
|
[geodb] Adding a msgpack'd list of ids for naked string keys in geodb builder
|
2015-07-25 18:42:13 -04:00 |
|
Al
|
a67ec44a08
|
[utils] cstring_array_terminate, moving msgpack_utils to separate file
|
2015-07-25 18:41:02 -04:00 |
|
Al
|
42f6be7434
|
[fix] county road
|
2015-07-25 14:19:38 -04:00 |
|
Al
|
2ff8c0fd1e
|
[transliteration] fixing length-based transliteration
|
2015-07-25 13:53:28 -04:00 |
|
Al
|
71ffdf9cbc
|
[expansion] tokenized version of search_address_dictionaries
|
2015-07-25 13:50:53 -04:00 |
|
Al
|
ee96dab93c
|
[fix] unnecessary headers
|
2015-07-25 13:49:42 -04:00 |
|
Al
|
e549e76806
|
[utils] string_tree_iterator_foreach_token
|
2015-07-25 13:49:02 -04:00 |
|
Al
|
2adaf475c2
|
[utils] cstring_array (contiguous) to array of malloc'd strings
|
2015-07-25 12:14:01 -04:00 |
|
Al
|
e9277d7339
|
[utils] vector extend method
|
2015-07-25 01:33:45 -04:00 |
|
Al
|
9fb1eae877
|
[expansion] Regenerating address data file
|
2015-07-24 16:09:22 -04:00 |
|
Al
|
351c7c8c2e
|
[expansion] Add concatenated suffixes to the suffix keyspace of the address dictionary trie and concatenated prefixes and elisions to the prefix keyspace
|
2015-07-24 16:02:47 -04:00 |
|
Al
|
90a91cadd0
|
[search] Modifying trie_search_prefixes to use the new key schema
|
2015-07-24 15:59:49 -04:00 |
|
Al
|
bb7688d8d1
|
[phrases] trie_add_prefix method and a schema for prefix keys, e.g. elisions in French and Italian, separable prefixes like Hinter in German, etc.
|
2015-07-24 15:56:19 -04:00 |
|
Al
|
359cd62e20
|
[numex] Adding a replace_numeric_expressions method (returns NULL if no replacements were made). Fixing lengths where two unrelated numbers are joined by a stopword: in "one and one" the "and" acts as a delimiter, whereas in "one hundred and twenty" it acts as a joiner
|
2015-07-24 15:31:05 -04:00 |
|