Commit Graph

2067 Commits

Author SHA1 Message Date
Al 2e8888e331 [fix] warnings/size_t in libpostal.c 2016-07-21 17:04:57 -04:00
Al e800f21f06 [gazetteers] Adding new gazetteer types/address components 2016-07-21 17:04:57 -04:00
Al 95b239a5f9 [dictionaries] Adding letra to Spanish numbered unit dictionaries 2016-07-21 17:04:57 -04:00
Al 9561f771ce [dictionaries] Adding new dictionary types to generator script 2016-07-21 17:04:57 -04:00
Al 7aa06c4535 [boundaries] Adding Bucharest sectors as city_district 2016-07-21 17:04:57 -04:00
Al 9aeb22bfbc [dictionaries] More dictionary refactoring 2016-07-21 17:04:57 -04:00
Al 6980565698 [addresses] Allowing null_phrase_probability for alpha, and alpha+digits instead of just for ordinals (mostly for Spain) 2016-07-21 17:04:57 -04:00
Al d4d8fa81d1 [addresses] Adding increasing null_phrase_probability for plain numerics in Spain so things like 2o B make it into the training data 2016-07-21 17:04:57 -04:00
Al 35e73d0e40 [places] setting probability of including island to 0.5 for Hawaii, 0.8 seems too high given all the Honolulu, HI addresses (not often seen as Honolulu, Oahu, HI) 2016-07-21 17:04:57 -04:00
Al 605b7c2b4f [dictionaries] Italian CAP abbreviations 2016-07-21 17:04:57 -04:00
Al 4e8e08086e [dictionaries] Russian place names 2016-07-21 17:04:57 -04:00
Al 8d33b62da2 [dictionaries] Adding more fleshed out Greek dictionaries from a recent Nominatim NameFinder wiki update 2016-07-21 17:04:57 -04:00
Al 0d39cd94c2 [dictionaries] Refactoring existing unit_types/level_types dictionaries to use the new more granular dictionary structure 2016-07-21 17:04:57 -04:00
Al 11d1acc3bc [parser] Sample chain store alternate names from the cross-language dictionary 2016-07-21 17:04:57 -04:00
Al 69e1c846ba [parser] Fixing config keys so OSM streets/venues get abbreviated. Selecting namespaced address fields in cases like Brussels or Hong Kong where everything is bilingual. Adding the ability to pass a known language into address component expansion 2016-07-21 17:04:57 -04:00
Al e5e0cf3b92 [fix] loading transliteration module in address_parser_test.c as well 2016-07-21 17:04:57 -04:00
Al 8e338c5ffb [fix] ON needs to be quotes in YAML, uppercase Yukon abbreviation 2016-07-21 17:04:57 -04:00
Al b8d43dc601 [fix] cstring_array_split calls 2016-07-21 17:04:57 -04:00
Al b19cd3f60a [fix] brace 2016-07-21 17:04:57 -04:00
Al 994b2f18e4 [parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent 2016-07-21 17:04:57 -04:00
Al b664ab1cea [utils] Adding cstring_array_split_ignore_consecutive 2016-07-21 17:04:57 -04:00
Al 8e90ee45d2 [fix] calls and NULL checks 2016-07-21 17:04:57 -04:00
Al e3cffaf0d1 [fix] tokenized_string_t should copy its source string 2016-07-21 17:04:57 -04:00
Al 16501aba17 [fix] Need to load transliteration module for Latin-ASCII normalization 2016-07-21 17:04:57 -04:00
Al b326e209fb [places] Adding Town of to English prefixes 2016-07-21 17:04:57 -04:00
Al 366c4995af [parser] lower full-name probability for states 2016-07-21 17:04:57 -04:00
Al d88be7ef5d [fix] use simple language code if language_script cannot be found 2016-07-21 17:04:57 -04:00
Al 90467e9098 [fix] global formatter config 2016-07-21 17:04:57 -04:00
Al 16a91528d6 [fix] config key name 2016-07-21 17:04:57 -04:00
Al d3b936067e [fix] neighborhood reverse geocoder using the new OSM definitions module which keeps track of whatever the data fetching script defines as being a valid {neighborhood, admin boundary, etc.} 2016-07-21 17:04:57 -04:00
Al b294b891dd [boundaries] lines sharing a point are added to the polygon head-to-tail, reversing the node order as needed, produces accurate OSM polygons for reverse geocoding lookups 2016-07-21 17:04:57 -04:00
Al 75aa713792 [fix] moving language code replacements out of address components 2016-07-21 17:04:57 -04:00
Al 6cb834b3a3 [boundaries] admin_level=8 is city_district in Japan 2016-07-21 17:04:57 -04:00
Al 308080f6ee [formatting] Moving language country overrides to formatter config so actual language is retained 2016-07-21 17:04:57 -04:00
Al e59e3a173c [fix] place=municipality 2016-07-21 17:04:57 -04:00
Al 3c16973cac [fix] OSM neighborhood ids 2016-07-21 17:04:57 -04:00
Al d86443a697 [fix] Adding basic Han numeral replacement to neighborhood deduping 2016-07-21 17:04:57 -04:00
Al 046f445a56 [fix] component bitsets 2016-07-21 17:04:57 -04:00
Al 0dbfd79b72 [fix] language format changes only apply to local languages 2016-07-21 17:04:57 -04:00
Al 12f86875e2 [formatting] Increase probability of postcode before city 2016-07-21 17:04:57 -04:00
Al 890268aa87 [languages] Use English formats for Romanized CJK 2016-07-21 17:04:57 -04:00
Al ad4b197ead [fix] floor samples 2016-07-21 17:04:57 -04:00
Al e53e61358d [fix] Don't remove chome from Japanese, as the neighborhoods are usually just plain numbers 2016-07-21 17:04:57 -04:00
Al 110be7a245 [fix] args 2016-07-21 17:04:57 -04:00
Al 9772e85c87 [fix] US/Canada probabilities for industrial/commercial 2016-07-21 17:04:57 -04:00
Al d4e913c55f [boundaries] Adding CP and civil parish to English place suffixes 2016-07-21 17:04:57 -04:00
Al a5331f7107 [osm] Venue name depends on one of {house_number, road, suburb, city_district, city, postcode} 2016-07-21 17:04:57 -04:00
Al 2d1e7ca990 [fix] Spanish office probabilities 2016-07-21 17:04:57 -04:00
Al a1421d4a68 [fix] floors 2016-07-21 17:04:57 -04:00
Al 5ea570835e [fix] args again 2016-07-21 17:04:57 -04:00
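
Note on b294b891dd: the commit message describes joining boundary lines that share an endpoint "head-to-tail, reversing the node order as needed". The following is a minimal sketch of that idea only, not the code from the commit. The function name stitch_ways and the representation of each line as a list of node ids are assumptions made for illustration.

```python
def stitch_ways(ways):
    """Join ways (lists of node ids) into rings by matching shared endpoints.

    Hypothetical helper for illustration; not taken from the repository.
    """
    remaining = [list(way) for way in ways if way]
    rings = []
    while remaining:
        ring = remaining.pop(0)
        # Extend the tail until the ring closes or nothing else connects.
        while ring[0] != ring[-1]:
            for i, way in enumerate(remaining):
                if way[0] == ring[-1]:
                    # Same direction: append, skipping the shared node.
                    ring.extend(way[1:])
                elif way[-1] == ring[-1]:
                    # Opposite direction: reverse the node order first.
                    ring.extend(reversed(way[:-1]))
                else:
                    continue
                del remaining[i]
                break
            else:
                # No way shares the current tail node; leave the ring open.
                break
        rings.append(ring)
    return rings


# Three lines forming one boundary; the second is stored in reverse order.
rings = stitch_ways([[1, 2, 3], [5, 4, 3], [5, 6, 1]])
```

In this sketch a ring counts as closed when its first and last node ids match. Lines that never connect back to the start are returned as open sequences rather than discarded, so a caller can decide how to treat incomplete boundaries.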