Al
|
e511eede74
|
[phrases] Prefix/suffix trie search using the new characters, fixing length of matched prefixes/suffixes and exiting early on falling off the trie
|
2015-08-10 16:02:38 -04:00 |
|
Al
|
51572d6575
|
[phrases] Changing prefix/suffix chars so both are control characters and neither is the NUL-byte. Modifying transliteration special characters accordingly
|
2015-08-10 16:01:22 -04:00 |
|
Al
|
11a9881988
|
[phrases] adding _from_index_get_prefix_char/_from_index_get_suffix_char methods
|
2015-08-09 03:41:20 -04:00 |
|
Al
|
2eb67ad850
|
[phrases] trie_search_prefixes/trie_search_suffixes now take a length param
|
2015-08-09 02:01:37 -04:00 |
|
Al
|
bbaa302e2e
|
[fix] NUMEX_STOPWORD_RULE define
|
2015-08-09 01:03:23 -04:00 |
|
Al
|
5383640c14
|
[fix] cast
|
2015-08-09 01:01:11 -04:00 |
|
Al
|
dd391eabe5
|
[numex] Separating rules from keys for Linux gcc compilation
|
2015-08-09 01:00:57 -04:00 |
|
Al
|
e346b831cb
|
[build] public-read permissions when uploading to S3
|
2015-08-09 00:17:04 -04:00 |
|
Al
|
ad584671c4
|
[build] Not compiling with -Werror for now
|
2015-08-09 00:02:41 -04:00 |
|
Al
|
423e2c86c7
|
[build] builder programs are now in noinst_PROGRAMS, Makefile target to upload data tarball to S3 (with proper credentials)
|
2015-08-08 23:29:34 -04:00 |
|
Al
|
ee982cd872
|
[dictionaries] Removing dictionaries/all/personal_suffixes, can add to languages as needed
|
2015-08-08 23:13:09 -04:00 |
|
Al
|
5acf7a4f3e
|
[phrases] resetting node position when continuation falls off the trie
|
2015-08-08 22:18:05 -04:00 |
|
Al
|
cd0f95f9e2
|
[fix] making transliteration path relative to data dir
|
2015-08-08 21:06:02 -04:00 |
|
Al
|
2ba0e814ad
|
[build] better autoconf checks for time and dirent headers
|
2015-08-08 21:02:03 -04:00 |
|
Al
|
d0679450e3
|
[config] Including Autoconf config.h in internal config
|
2015-08-08 20:50:23 -04:00 |
|
Al
|
5df9e123af
|
[numex] Fix to whole_tokens_only numeric expression parsing, where numex was pushing a number onto the stack on encountering a new rule context even though the token was not completely parsed
|
2015-08-08 20:49:54 -04:00 |
|
Al
|
53f54d6454
|
[fix] removing comment
|
2015-08-08 20:23:49 -04:00 |
|
Al
|
2106a6cfe4
|
[build] Adding command-line test and bench programs
|
2015-08-08 19:44:50 -04:00 |
|
Al
|
5aa2e99b92
|
[fix] data dir for tar extraction
|
2015-08-08 19:42:37 -04:00 |
|
Al
|
54aa6fe7df
|
[build] Fixing runtime check/save of last updated file for package data tarball
|
2015-08-08 17:16:03 -04:00 |
|
Al
|
f38a53601b
|
[rm] Better not to keep that file in the repo
|
2015-08-08 02:41:54 -04:00 |
|
Al
|
770f44198c
|
[build] Adding default file to track last updated date
|
2015-08-08 02:30:42 -04:00 |
|
Al
|
a197d04b1a
|
[fix] float comparison
|
2015-08-07 17:28:21 -04:00 |
|
Al
|
f161f68d53
|
[build] Changes to Makefile.am to build on Debian/Ubuntu, fixing downloading of the data tarball for Mac and Linux
|
2015-08-07 17:27:34 -04:00 |
|
Al
|
9b69d1f67a
|
[fix] Removing C++ checks from all but the main API functions
|
2015-08-07 17:15:39 -04:00 |
|
Al
|
359a1efb03
|
[fix] Adding stdint.h include to most of the header files for portability
|
2015-08-07 02:43:44 -04:00 |
|
Al
|
0738a57caa
|
[fix] restoring ctype.h include
|
2015-08-07 01:52:08 -04:00 |
|
Al
|
06d2e916a1
|
[fix] includes, matters on GCC/Linux
|
2015-08-07 01:51:34 -04:00 |
|
Al
|
ae9825b9f9
|
[build] Fixing data dir download in Automake file
|
2015-08-07 01:51:06 -04:00 |
|
Al
|
d7ebcd046e
|
[fix] includes
|
2015-08-07 01:00:26 -04:00 |
|
Al
|
f246c2ee95
|
[api] Adding address component constants to libpostal.h, returning char ** instead of a cstring_array to simplify API/dependencies
|
2015-08-06 17:52:54 -04:00 |
|
Al
|
61d586fa1d
|
[config] config.h=>libpostal_config.h so as not to conflict with autoconf
|
2015-08-06 17:50:55 -04:00 |
|
Al
|
2bedb695a2
|
[build] adding Automake file in src, including rule to download data dir tarball
|
2015-08-06 17:48:37 -04:00 |
|
Al
|
4b9f11eca5
|
[build] Main Automake file and modified version of Sparkey's Automake file
|
2015-08-06 02:14:33 -04:00 |
|
Al
|
1d39916aaa
|
[fix] Fixing warnings in unicode script data
|
2015-08-02 21:30:54 -06:00 |
|
Al
|
770ce4256f
|
[expansion] Re-generating address expansion data file
|
2015-08-02 21:30:19 -06:00 |
|
Al
|
753c6efb1d
|
[api] Initial libpostal API, combining string normalization, transliteration, numex and address dictionaries
|
2015-08-02 21:16:18 -06:00 |
|
Al
|
b27030e39f
|
[fix] tokenized trie search was skipping tokens in some cases
|
2015-08-02 14:36:21 -06:00 |
|
Al
|
3178eda501
|
[utils] string_contains_hyphen method
|
2015-08-02 14:35:18 -06:00 |
|
Al
|
46141a6c36
|
[normalize] Adding an option when normalizing tokens to split tokens of the form [\w]+[\.\-]?[\d]+ for cases like I35, CR123, R-66, RN.7, etc. where the alpha component is an expansion
|
2015-08-02 14:34:36 -06:00 |
|
Al
|
f10dd49c58
|
[expansion] NULL_CANONICAL_INDEX constant
|
2015-08-01 23:59:16 -06:00 |
|
Al
|
fe4789a665
|
[fix] compiler warnings
|
2015-07-28 19:14:00 -04:00 |
|
Al
|
551904d202
|
[normalize] cstring_array instead of string_tree for token-based normalization
|
2015-07-28 19:09:50 -04:00 |
|
Al
|
90d4da9e72
|
[geodb] Adding an is_canonical bit field to geodb trie values
|
2015-07-28 19:08:24 -04:00 |
|
Al
|
9bc902f575
|
[numex] LATIN_LANGUAGE_CODE constant for Roman numeral normalization
|
2015-07-28 18:12:12 -04:00 |
|
Al
|
df1410da8c
|
[numex] Fixing numex parsing for lone stopwords and certain prefix matches that were getting mistakenly converted, e.g. settembre => 7mbre
|
2015-07-28 18:11:23 -04:00 |
|
Al
|
a16f0dabcb
|
[numex] Fixing hyphen-initial numeric phrases that end the string
|
2015-07-28 03:28:44 -04:00 |
|
Al
|
0f5b69c06b
|
[fix] transition to SEARCH_STATE_NO_MATCH in trie_search_tokens_from_index on a return to the start node
|
2015-07-27 16:35:27 -04:00 |
|
Al
|
243f327928
|
[fix] NULL check
|
2015-07-27 16:32:01 -04:00 |
|
Al
|
7aee159c0c
|
[utils] string_tree_num_tokens
|
2015-07-27 12:36:34 -04:00 |
|