Commit Graph

1708 Commits

Author SHA1 Message Date
Al
ced8f9ae27 [parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent 2016-12-12 11:37:27 -05:00
Al
b1816e9b70 [utils] Adding cstring_array_split_ignore_consecutive 2016-12-12 11:37:27 -05:00
Al
6baa7087fe [fix] calls and NULL checks 2016-12-12 11:37:27 -05:00
Al
5e07f5e8c5 [fix] tokenized_string_t should copy its source string 2016-12-12 11:37:27 -05:00
Al
521a094a47 [fix] Need to load transliteration module for Latin-ASCII normalization 2016-12-12 11:37:27 -05:00
Al
d575caba8a [data] using UTC for libpostal data files on the Mac version of the download script as well 2016-12-09 19:43:05 -05:00
Al
c3f3896b48 [fix] update test for date function in data download script 2016-12-09 19:29:00 -05:00
Al
14fa8a08c0 [fix][ci skip] attempting something less cluttered for the readme 2016-10-07 00:50:36 -04:00
Al
6ce05812fe [docs][ci skip] edit to intro/project description 2016-10-06 23:59:26 -04:00
Al
5f7bf6008a [fix][ci skip] cliffhanger, paragraph order 2016-10-06 23:49:42 -04:00
Al
5a571b1d7a [docs][ci skip] moving flags below intro paragraph in readme 2016-10-06 23:45:08 -04:00
Al
de99120c66 [fix][ci skip] alignment of flags on readme 2016-10-06 23:23:23 -04:00
Al
425bca6149 [docs][ci skip] two rows of flags on the readme 2016-10-06 23:12:10 -04:00
Al
906bd524c3 [fix][ci skip] removing comments 2016-10-06 23:00:49 -04:00
Al
a588230d13 Merge branch 'master' of https://github.com/openvenues/libpostal 2016-10-06 22:55:57 -04:00
Al
527b78ddf7 [docs][ci skip] adding more flags to the repo via span tags 2016-10-06 22:55:36 -04:00
Travis
04f8130c46 [auto][ci skip] Adding data files from Travis build #168 2016-10-07 00:46:48 +00:00
Al
8a8b4b6ee9 Merge branch 'Jeffrey04-ms-dictionary-expansion' 2016-10-06 20:31:03 -04:00
Al
03d0afb820 [fix] removing level types and given names from synonyms since they're already covered 2016-10-06 20:30:48 -04:00
Al
5f42e66f31 [fix] removing road/rd from the synonyms list for jalan as they're covered by the English dictionaries 2016-10-06 20:29:35 -04:00
Al
c4e147ed20 [fix] separating words that have different roots 2016-10-06 20:29:09 -04:00
Al
2c48acd680 [dictionaries] removing flat/rumah pangsa/pangsapuri from place_names, aliasing gim to gimnasium rather than the other way around, removing duplicate/mixed English + Malay line 2016-10-06 20:28:44 -04:00
Al
244dbbdd4a [fix] separating synonyms that are for different words 2016-10-06 20:27:15 -04:00
jeffrey04
b2305b574d removing english abbr 2016-10-04 11:30:28 +08:00
jeffrey04
57210bd657 each term should be in separate lines 2016-10-04 11:30:09 +08:00
jeffrey04
f5477a7369 each term should be in a separate line 2016-10-04 11:29:28 +08:00
jeffrey04
8ae8340bee remove shopping mall from list 2016-09-30 10:18:04 +08:00
jeffrey04
f43ba7fe63 removing english words from dictionary 2016-09-30 10:14:25 +08:00
jeffrey04
20b87ba5c8 removing ambiguous_expansion(s).txt 2016-09-30 10:01:13 +08:00
jeffrey04
2bae8075b0 initial commit of malay words 2016-09-28 18:41:15 +08:00
Al
01afbf80ef [data] Each curl process will retry the chunk up to 3 times 2016-08-25 23:18:39 -04:00
Travis
de1255af00 [auto][ci skip] Adding data files from Travis build #161 2016-08-23 22:48:20 +00:00
Al Barrentine
f03df6aab8 Merge pull request #108 from petacat/patch-5: Update toponyms.txt 2016-08-23 18:38:08 -04:00
Travis
f19c9852aa [auto][ci skip] Adding data files from Travis build #160 2016-08-23 22:24:19 +00:00
Travis
d797d6c863 [auto][ci skip] Adding data files from Travis build #159 2016-08-23 22:14:07 +00:00
Al Barrentine
d1991848a3 Merge pull request #106 from petacat/patch-3: Update place_names.txt 2016-08-23 18:09:47 -04:00
Al Barrentine
964b440380 Merge pull request #104 from petacat/patch-1: Update directionals.txt 2016-08-23 17:49:36 -04:00
Thomas Rosen
a787c25cdf Update toponyms.txt 2016-08-23 23:09:32 +02:00
Thomas Rosen
7e258f2d87 Update place_names.txt 2016-08-23 23:03:31 +02:00
Thomas Rosen
bd109dc9ca Update directionals.txt 2016-08-23 22:56:56 +02:00
Al
757a7ee15f [docs][ci skip] Moving parser examples up so they come before normalization 2016-08-10 01:16:07 -04:00
Al
7ff8e1a5cb [docs][ci skip] Moving OpenCollective folks to the top of the README 2016-08-10 01:14:45 -04:00
Al Barrentine
a277096c96 Merge pull request #72 from piamancini/patch-1: Added backers and sponsors from OpenCollective 2016-08-09 23:05:45 -04:00
Al Barrentine
3e3950b37a Merge pull request #98 from uberbaud/posix_sh: Use posix `sh` for systems without `bash` 2016-07-27 18:44:11 -04:00
Tom Davis
18c8e90eb3 Use xargs to start workers as soon as possible 2016-07-27 17:46:44 -04:00
Tom Davis
11abf6cb22 Use posix sh for systems without bash 2016-07-26 20:17:18 -04:00
Al Barrentine
65c4688f89 Merge pull request #97 from uberbaud/multipart_edgecase: Don't call `download_multipart` for 1 chunk 2016-07-24 00:03:51 -04:00
Travis
3f0eff228e [auto][ci skip] Adding data files from Travis build #145 2016-07-23 22:28:32 +00:00
Al
bedfd34363 [fix] small change to dictionary so generated file rebuilds 2016-07-23 18:18:36 -04:00
Al
e8beca0971 [fix] ReEscape backslash when escaping dictionary files 2016-07-23 18:16:44 -04:00