5003 Commits

Author SHA1 Message Date
Al
f507102457 [dictionaries] removing English words from Indonesian unit types 2017-05-23 18:01:47 -04:00
Al
4b24699e1f [fix] changing national to nasional in Indonesian 2017-05-23 18:00:20 -04:00
Al
4df48fb412 [dictionaries] moving Kampong to normalize to Kampung in Indonesian, better if there's one canonical form 2017-05-23 17:57:38 -04:00
Al
ec79c610eb [dictionaries] removing a few English words and dupes from Indonesian place names 2017-05-23 17:55:59 -04:00
Al
77365a56a5 [dictionaries] removing no fixed address from Indonesian dictionaries 2017-05-23 17:51:15 -04:00
Al
8a35cfcd80 [dictionaries] removing level/platform/podium from Indonesian level types 2017-05-23 17:50:50 -04:00
Al
364b00da01 [dictionaries] separating Mas and Abang 2017-05-23 17:46:45 -04:00
Al
83378049ee [dictionaries] remove Doktor from academic degrees in Indonesian dictionaries 2017-05-23 17:35:53 -04:00
Al
52593c6374 [dictionaries] remove nonprofit from Indonesian company types 2017-05-23 17:27:11 -04:00
Al
08524f4b07 [dictionaries] moving some of the existing chain stores for Indonesia to the all/chains.txt dictionary 2017-05-23 17:25:59 -04:00
Al
18b2fb0ec8 Merge branch 'master' of https://github.com/bbraunay/libpostal into bbraunay-master 2017-05-23 17:18:37 -04:00
Yanuar Budi Baskoro
695756d484 [dictionaries] add more option on toponyms 2017-05-21 16:56:14 +07:00
Yanuar Budi Baskoro
03be9eea49 [dictionaries] Remove additional english words from ID dictionary 2017-05-21 15:58:02 +07:00
Yanuar Budi Baskoro
09cb28cb14 [dictionaries] Remove english words from ID dictionary 2017-05-21 15:39:47 +07:00
Al Barrentine
b79934394a Merge pull request #204 from iestynpryce/master
Fix log_{debug,info} formats which expect size_t but receive int.
2017-05-20 21:28:28 -04:00
Yanuar Budi Baskoro
3b2fb597fe [dictionaries] Fix blank synonym in numbers 2017-05-20 01:04:12 +07:00
Yanuar Budi Baskoro
7f14dafd21 [dictionaries] Fix blank synonym in academic degrees 2017-05-20 01:00:28 +07:00
Yanuar Budi Baskoro
2514580611 [dictionaries] Indonesian dictionaries to support new config 2017-05-19 18:44:32 +07:00
Yanuar Budi Baskoro
60cde05c3d [dictionaries] Indonesian dictionaries to support new config 2017-05-19 18:39:48 +07:00
Iestyn Pryce
87a76bf967 Fix log_{debug,info} formats which expect size_t but receive int. 2017-05-17 22:40:53 +01:00
Al Barrentine
2a0fb69ae5 Merge pull request #201 from iestynpryce/master
Fix log_debug formats which expect unsigned int but receive size_t
2017-05-14 20:53:15 -04:00
Iestyn Pryce
f34fc56fec Fix log_debug formats which expect unsigned int but receive size_t 2017-05-14 17:48:26 +01:00
Al
a7e67c4967 [fix] adding maximum number of permutations for libpostal_expand_address to consider (n=100 for both the inner and outer loop, so max strings=10000), fixes #200 2017-05-13 14:11:08 -04:00
Al
5780a08b48 [fix] check that possible ordinal suffix also has non-zero digit length before normalizing 2017-05-12 15:48:20 -04:00
Al
cea3ced533 [fix] open files in binary format for #69 2017-05-03 17:34:38 -04:00
Al
6ea2273263 [fix] terminate the char_array if input token is zero-length in add_normalized_token 2017-04-28 11:25:07 -04:00
Al Barrentine
04eb2d4539 Merge pull request #189 from openvenues/fix_trie_search
Reset to root node in trie search on partial failed matches before rolling back pointer
2017-04-21 14:39:03 -04:00
Al
278679b7fb [fix] in tokenized trie_search, in the case of a partial failed match, reset to the root node before rolling the pointer back to phrase start + 1 2017-04-21 13:51:07 -04:00
Travis
074b6ff802 [auto][ci skip] Adding data files from Travis build #231 2017-04-20 02:39:39 +00:00
Al Barrentine
004d3d98c9 Merge pull request #187 from openvenues/degree_symbol_ordinal_suffix
Ordinal suffix tests
2017-04-19 22:29:10 -04:00
Al
7bce358ca6 [fix] whitespace in numex config to trigger build 2017-04-19 21:14:54 -04:00
Al
676fb9bcbc [fix] no parens in travis config grep for numex change detection 2017-04-19 21:14:19 -04:00
Al
86956db055 [fix] adding numex change to trigger build 2017-04-19 21:00:59 -04:00
Al
e81580287d [test] adding tests for ordinal suffix normalization 2017-04-19 20:59:36 -04:00
Al
85297f3333 [fix] numex change detection in Travis build 2017-04-19 20:58:08 -04:00
Travis
4762ff2638 [auto][ci skip] Adding data files from Travis build #228 2017-04-20 00:51:42 +00:00
Al Barrentine
e92c3c2867 Merge pull request #186 from openvenues/degree_symbol_ordinal_suffix
Degree symbol ordinal suffix
2017-04-19 20:39:22 -04:00
Al
f3adde746e [numex] adding ability to handle the degree symbol in numex parsing since it's technically a separate token 2017-04-19 20:18:21 -04:00
Al
19899b2f7d [dictionaries] adding degree symbol "°" variant for any surface forms that have "º" 2017-04-19 19:25:25 -04:00
Al
c968dd4ecc [numex] adding "°" as additional ordinal suffix for Spanish, Italian, and Portuguese 2017-04-19 19:22:28 -04:00
Al Barrentine
254f3622ea Merge pull request #185 from Ironholds/master
Remove unused variable
2017-04-19 09:08:59 -04:00
Oliver Keyes
18a5d06427 Merge pull request #1 from Ironholds/Ironholds-patch-1
Remove unused variable
2017-04-18 21:53:24 -07:00
Oliver Keyes
35821f975e Remove unused variable
What it says on the tin!
2017-04-18 21:25:00 -07:00
Al Barrentine
e0c82b5edb Merge pull request #184 from openvenues/remove_ordinal_suffix
Remove ordinal suffixes in libpostal_expand_address
2017-04-18 22:33:00 -04:00
Al
9cd3ec37f9 [build] rebuild numex table in Travis if either the configs change or numex_table_builder.c changes 2017-04-18 21:42:09 -04:00
Al
f3cf119e58 [build] Makefile changes to support moving numeric expression parsing to normalize.c 2017-04-18 21:41:24 -04:00
Al
cddc368533 [numex] adding one form of normalization which strips ordinal suffixes so {96th, Ninety-sixth} => 96. This is an additional form of normalization, so there's still one form where the suffixes are kept. One case that's still not handled is something like "IXe Arrondissement" 2017-04-18 21:39:54 -04:00
Al
92051863ba [numex] adding ordinal suffixes themselves to the numex trie so they can be removed from strings 2017-04-18 17:20:02 -04:00
Al Barrentine
63ac3cf921 Merge pull request #183 from openvenues/cdn
Hosting model files and training data on CloudFront CDN
2017-04-17 14:39:35 -04:00
Al
d2732922c2 [data] deployed model files and training data to CloudFront for easier downloading around the world and in places like China where the Great Fire Wall may prevent large downloads from abroad. TTL is set to 0 so it still caches the files themselves but checks with origin for the If-Modified-Since headers, allowing the files to be updated dynamically 2017-04-17 14:11:44 -04:00
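
Note on commits cddc368533 and 5780a08b48: the ordinal-suffix normalization they describe can be illustrated with a minimal Python sketch. This is not libpostal's implementation (which is C code driven by per-language numex data); the function names, the regular expression, and the English-only suffix list below are assumptions made for illustration. It covers only the digit case ("96th" => "96"), not spelled-out forms such as "Ninety-sixth", and like the commit it leaves Roman-numeral forms such as "IXe" unhandled.

```python
import re

# Hypothetical English-only suffix pattern, for illustration.
# At least one digit must precede the suffix, mirroring the
# "non-zero digit length" check described in commit 5780a08b48.
_ORDINAL_RE = re.compile(r"^([0-9]+)(st|nd|rd|th)$", re.IGNORECASE)


def strip_ordinal_suffix(token):
    """Return the token without its ordinal suffix, or unchanged if none."""
    match = _ORDINAL_RE.match(token)
    if match is None:
        return token
    return match.group(1)


def normalized_forms(token):
    """Return the original token plus the suffix-stripped form, if different.

    The commit message says stripping is an additional normalization, so the
    form that keeps the suffix is still returned.
    """
    forms = [token]
    stripped = strip_ordinal_suffix(token)
    if stripped != token:
        forms.append(stripped)
    return forms
```

For example, normalized_forms("96th") yields both "96th" and "96", while a bare "th" or a plain word such as "Main" comes back unchanged.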