Commit Graph

2098 Commits

Author SHA1 Message Date
Al dd561ba5b2 [fix] import 2016-05-31 01:42:27 -04:00
Al 913d448db1 [openaddresses] OpenAddresses address formatter, using the config 2016-05-31 01:41:16 -04:00
Al aeddcb5606 [openaddresses] OpenAddresses config specifying a few files 2016-05-31 01:40:21 -04:00
Al fb80a345c5 [openaddresses] Fetch script for OpenAddresses 2016-05-31 01:39:04 -04:00
Al 4360f8b698 [addresses] Making address_language a classmethod 2016-05-31 01:20:05 -04:00
Al 4e66f81d1b [intersections] Only requiring a tag to share at least two ways 2016-05-30 23:10:04 -04:00
Al 6f34559f2d [intersections] Adding intersections to config 2016-05-30 23:08:00 -04:00
Al ce55fb2a34 [fix] name 2016-05-30 23:06:45 -04:00
Al 27d2a05a27 [fix] input file 2016-05-30 22:12:56 -04:00
Al eff7ad9163 [fix] args 2016-05-30 22:12:39 -04:00
Al b1c81f9405 [fix] add ways db dir 2016-05-30 22:07:01 -04:00
Al 01fd66f4eb [fix] name 2016-05-30 22:01:17 -04:00
Al 2dcdc741a2 [fix] logging for intersections data 2016-05-30 22:00:28 -04:00
Al ebd8101b57 [fix] import 2016-05-30 22:00:14 -04:00
Al fcf55720dc [fix] import 2016-05-30 21:58:12 -04:00
Al 3825c36523 [intersections] intersections training data 2016-05-30 21:50:45 -04:00
Al 8cb0c5ee8b [intersections] Adding places to intersection template, intersection phrase generator 2016-05-30 21:18:05 -04:00
Al 006d15dbac [fix] import 2016-05-30 14:53:55 -04:00
Al 5c92185e71 [tokenization] Reverting commit for tokenizing initial/final apostrophes as part of words as it may be more effective to handle during post-processing 2016-05-30 12:45:58 -04:00
Al b23f07b679 [parser] Using new geonames designations in parser features 2016-05-29 01:40:45 -04:00
Al bbddfe25bf [parser] Using NFC normalization for parser as well, @ sign not defined as separator since it may also be used in intersections 2016-05-29 01:37:38 -04:00
Al 1ac077914b [geodb] Adding separate bitset for geonames place types and using NFC normalization instead of NFD (requires retraining) 2016-05-29 01:36:18 -04:00
Al 1d1ada1bc1 [normalize] Adding NORMALIZE_STRING_COMPOSE for NFC unicode normalization 2016-05-28 19:25:12 -04:00
Al 1fd57fdda3 [tokenization] Adding ability to tokenize 's Gravenhage 2016-05-28 19:24:19 -04:00
Al 514aaf7377 [fix] warnings/size_t in libpostal.c 2016-05-28 19:19:31 -04:00
Al c0e8578b9c [gazetteers] Adding new gazetteer types/address components 2016-05-28 19:19:18 -04:00
Al acd97a0081 [dictionaries] Adding letra to Spanish numbered unit dictionaries 2016-05-28 19:15:02 -04:00
Al bac86be6a3 [dictionaries] Adding new dictionary types to generator script 2016-05-28 17:16:43 -04:00
Al cff23c77ab [boundaries] Adding Bucharest sectors as city_district 2016-05-27 20:22:56 -04:00
Al 5e0e22a666 [dictionaries] More dictionary refactoring 2016-05-27 19:40:20 -04:00
Al 5590c89a5e [addresses] Allowing null_phrase_probability for alpha, and alpha+digits instead of just for ordinals (mostly for Spain) 2016-05-27 13:40:38 -04:00
Al bdd6d99f56 [addresses] Adding increasing null_phrase_probability for plain numerics in Spain so things like 2o B make it into the training data 2016-05-27 13:37:48 -04:00
Al cc453cfbbd [places] setting probability of including island to 0.5 for Hawaii, 0.8 seems too high given all the Honolulu, HI addresses (not often seen as Honolulu, Oahu, HI) 2016-05-27 11:32:52 -04:00
Al f69d9e2e1c [dictionaries] Italian CAP abbreviations 2016-05-27 11:31:16 -04:00
Al fc96cf145f [dictionaries] Russian place names 2016-05-27 11:28:50 -04:00
Al ec0df1410b [dictionaries] Adding more fleshed out Greek dictionaries from a recent Nominatim NameFinder wiki update 2016-05-27 11:28:23 -04:00
Al dccbdc4ccc [dictionaries] Refactoring existing unit_types/level_types dictionaries to use the new more granular dictionary structure 2016-05-27 11:27:34 -04:00
Al 572759885f [parser] Sample chain store alternate names from the cross-language dictionary 2016-05-26 12:09:10 -04:00
Al 5daa64faef [parser] Fixing config keys so OSM streets/venues get abbreviated. Selecting namespaced address fields in cases like Brussels or Hong Kong where everything is bilingual. Adding the ability to pass a known language into address component expansion 2016-05-26 12:05:46 -04:00
Al 206a471732 [fix] loading transliteration module in address_parser_test.c as well 2016-05-25 19:54:01 -04:00
Al 34f5d833a2 [fix] ON needs to be quotes in YAML, uppercase Yukon abbreviation 2016-05-25 19:12:15 -04:00
Al f59150b047 [fix] cstring_array_split calls 2016-05-25 17:58:30 -04:00
Al 5065917f41 [fix] brace 2016-05-25 17:52:00 -04:00
Al 679d3efcdc [parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent 2016-05-25 17:50:29 -04:00
Al 370744ccfd [utils] Adding cstring_array_split_ignore_consecutive 2016-05-25 17:07:20 -04:00
Al 5c7d24c71b [fix] calls and NULL checks 2016-05-25 15:50:53 -04:00
Al 349df20720 [fix] tokenized_string_t should copy its source string 2016-05-25 15:48:03 -04:00
Al 00784a897d [fix] Need to load transliteration module for Latin-ASCII normalization 2016-05-25 15:25:34 -04:00
Al bf50d27b0e [places] Adding Town of to English prefixes 2016-05-25 11:23:31 -04:00
Al 5a88294dbc [parser] lower full-name probability for states 2016-05-25 00:47:36 -04:00