Commit Graph

2081 Commits

Author SHA1 Message Date
Al 51831e2111 [fix] add ways db dir 2016-07-21 17:04:57 -04:00
Al f7680e9b65 [fix] name 2016-07-21 17:04:57 -04:00
Al 0a912766e4 [fix] logging for intersections data 2016-07-21 17:04:57 -04:00
Al baf8fbb381 [fix] import 2016-07-21 17:04:57 -04:00
Al b4a70a9a56 [fix] import 2016-07-21 17:04:57 -04:00
Al 8aada7086f [intersections] intersections training data 2016-07-21 17:04:57 -04:00
Al 5075128ada [intersections] Adding places to intersection template, intersection phrase generator 2016-07-21 17:04:57 -04:00
Al 701e67614a [fix] import 2016-07-21 17:04:57 -04:00
Al 2454b98c6d [tokenization] Reverting commit for tokenizing initial/final apostrophes as part of words as it may be more effective to handle during post-processing 2016-07-21 17:04:57 -04:00
Al 0a8f46bdc3 [parser] Using new geonames designations in parser features 2016-07-21 17:04:57 -04:00
Al c383f8af88 [parser] Using NFC normalization for parser as well, @ sign not defined as separator since it may also be used in intersections 2016-07-21 17:04:57 -04:00
Al c2ee5a45b3 [geodb] Adding separate bitset for geonames place types and using NFC normalization instead of NFD (requires retraining) 2016-07-21 17:04:57 -04:00
Al 6c39c663ff [normalize] Adding NORMALIZE_STRING_COMPOSE for NFC unicode normalization 2016-07-21 17:04:57 -04:00
Al 757c6147cb [tokenization] Adding ability to tokenize 's Gravenhage 2016-07-21 17:04:57 -04:00
Al 2e8888e331 [fix] warnings/size_t in libpostal.c 2016-07-21 17:04:57 -04:00
Al e800f21f06 [gazetteers] Adding new gazetteer types/address components 2016-07-21 17:04:57 -04:00
Al 95b239a5f9 [dictionaries] Adding letra to Spanish numbered unit dictionaries 2016-07-21 17:04:57 -04:00
Al 9561f771ce [dictionaries] Adding new dictionary types to generator script 2016-07-21 17:04:57 -04:00
Al 7aa06c4535 [boundaries] Adding Bucharest sectors as city_district 2016-07-21 17:04:57 -04:00
Al 9aeb22bfbc [dictionaries] More dictionary refactoring 2016-07-21 17:04:57 -04:00
Al 6980565698 [addresses] Allowing null_phrase_probability for alpha, and alpha+digits instead of just for ordinals (mostly for Spain) 2016-07-21 17:04:57 -04:00
Al d4d8fa81d1 [addresses] Adding increasing null_phrase_probability for plain numerics in Spain so things like 2o B make it into the training data 2016-07-21 17:04:57 -04:00
Al 35e73d0e40 [places] setting probability of including island to 0.5 for Hawaii, 0.8 seems too high given all the Honolulu, HI addresses (not often seen as Honolulu, Oahu, HI) 2016-07-21 17:04:57 -04:00
Al 605b7c2b4f [dictionaries] Italian CAP abbreviations 2016-07-21 17:04:57 -04:00
Al 4e8e08086e [dictionaries] Russian place names 2016-07-21 17:04:57 -04:00
Al 8d33b62da2 [dictionaries] Adding more fleshed out Greek dictionaries from a recent Nominatim NameFinder wiki update 2016-07-21 17:04:57 -04:00
Al 0d39cd94c2 [dictionaries] Refactoring existing unit_types/level_types dictionaries to use the new more granular dictionary structure 2016-07-21 17:04:57 -04:00
Al 11d1acc3bc [parser] Sample chain store alternate names from the cross-language dictionary 2016-07-21 17:04:57 -04:00
Al 69e1c846ba [parser] Fixing config keys so OSM streets/venues get abbreviated. Selecting namespaced address fields in cases like Brussels or Hong Kong where everything is bilingual. Adding the ability to pass a known language into address component expansion 2016-07-21 17:04:57 -04:00
Al e5e0cf3b92 [fix] loading transliteration module in address_parser_test.c as well 2016-07-21 17:04:57 -04:00
Al 8e338c5ffb [fix] ON needs to be quotes in YAML, uppercase Yukon abbreviation 2016-07-21 17:04:57 -04:00
Al b8d43dc601 [fix] cstring_array_split calls 2016-07-21 17:04:57 -04:00
Al b19cd3f60a [fix] brace 2016-07-21 17:04:57 -04:00
Al 994b2f18e4 [parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent 2016-07-21 17:04:57 -04:00
Al b664ab1cea [utils] Adding cstring_array_split_ignore_consecutive 2016-07-21 17:04:57 -04:00
Al 8e90ee45d2 [fix] calls and NULL checks 2016-07-21 17:04:57 -04:00
Al e3cffaf0d1 [fix] tokenized_string_t should copy its source string 2016-07-21 17:04:57 -04:00
Al 16501aba17 [fix] Need to load transliteration module for Latin-ASCII normalization 2016-07-21 17:04:57 -04:00
Al b326e209fb [places] Adding Town of to English prefixes 2016-07-21 17:04:57 -04:00
Al 366c4995af [parser] lower full-name probability for states 2016-07-21 17:04:57 -04:00
Al d88be7ef5d [fix] use simple language code if language_script cannot be found 2016-07-21 17:04:57 -04:00
Al 90467e9098 [fix] global formatter config 2016-07-21 17:04:57 -04:00
Al 16a91528d6 [fix] config key name 2016-07-21 17:04:57 -04:00
Al d3b936067e [fix] neighborhood reverse geocoder using the new OSM definitions module which keeps track of whatever the data fetching script defines as being a valid {neighborhood, admin boundary, etc.} 2016-07-21 17:04:57 -04:00
Al b294b891dd [boundaries] lines sharing a point are added to the polygon head-to-tail, reversing the node order as needed, produces accurate OSM polygons for reverse geocoding lookups 2016-07-21 17:04:57 -04:00
Al 75aa713792 [fix] moving language code replacements out of address components 2016-07-21 17:04:57 -04:00
Al 6cb834b3a3 [boundaries] admin_level=8 is city_district in Japan 2016-07-21 17:04:57 -04:00
Al 308080f6ee [formatting] Moving language country overrides to formatter config so actual language is retained 2016-07-21 17:04:57 -04:00
Al e59e3a173c [fix] place=municipality 2016-07-21 17:04:57 -04:00
Al 3c16973cac [fix] OSM neighborhood ids 2016-07-21 17:04:57 -04:00