180 Commits

Author SHA1 Message Date
Al
77efcb3f89 [fix] only accept language suffixes that are valid scripts or transliterations of CJK languages. Set language to language suffix so Romaji forms get used, etc. 2016-12-24 17:17:09 -05:00
Al
54b0af7f68 [addresses] add chome form for Japanese neighborhoods 2016-12-24 16:06:29 -05:00
Al
441ec00289 [openaddresses] using the new fuzzy equivalence comparison to check if suburb and city names are equal 2016-12-23 02:08:53 -05:00
Al
0814381d7f [fix] dehyphenate multiword names before city/suburb comparison 2016-12-23 01:53:09 -05:00
Al
70b98c877d [fix] except when None 2016-12-22 23:25:20 -05:00
Al
00f3f3f94d [fix] now that neighborhood is classified at index construction time, no longer need to assume suburb for components that might otherwise be a city, etc. 2016-12-22 23:21:08 -05:00
Al
481bc248a1 [fix] make city more likely, eliminate admin components from the set if they don't have names 2016-12-22 22:56:52 -05:00
Al
5df9dd9810 [fix] popping city name component 2016-12-22 21:40:09 -05:00
Al
46f421e455 [fix] names 2016-12-22 20:56:58 -05:00
Al
9db3c8ee4a [fix] name 2016-12-22 20:15:21 -05:00
Al
7d5512a82f [neighborhoods] adding option for suburb/city_district to replace city when the user-specified city is actually a neighborhood/district 2016-12-22 15:15:26 -05:00
Al
9afc910ce3 [fix] var 2016-12-22 03:46:36 -05:00
Al
665a91ccea [fix] stray paren 2016-12-22 03:44:59 -05:00
Al
3e687f0ef0 [fix] parens 2016-12-22 03:38:56 -05:00
Al
242a5281cc [osm] throwing away street names that are None/NULL, and those that only contain punctuation 2016-12-22 03:36:30 -05:00
Al
c6683e3237 [addresses] check that user-specified boundary names have at least one word token (OSM can have addr:city="?" and other weirdness). Also only right-stripping hyphens from house number in case of negative numbers 2016-12-22 02:03:01 -05:00
Al
6388a79bf0 [addresses] strip "-", etc. in addr:housenumber 2016-12-21 01:53:23 -05:00
Al
c33db4f04d [addresses] normalize existing sub-building components 2016-12-21 01:28:43 -05:00
Al
cc4098fb05 [openaddresses] abbreviate states as well in OpenAddresses when full version is specified 2016-12-20 17:24:12 -05:00
Al
9e44fcb2bb [addresses] abbreviating neighborhoods/city_districts 2016-12-20 03:01:34 -05:00
Al
53723bbf3d [fix] passing argument through to normalized_place_name 2016-12-20 02:21:38 -05:00
Al
6d02fbb9b8 [addresses] switch for phrases that come from components so they only get stripped if they contain another phrase a la Washington, D.C. Consolidating always_use_full_names and random_key options 2016-12-20 01:42:40 -05:00
Al
f35fd97735 [boundaries] add abbreviated state names to valid component names 2016-12-19 00:51:05 -05:00
Al
d02a18a5a8 [fix] all_names, use values instead of name keys 2016-12-18 17:29:15 -05:00
Al
e9c7bc43e3 [fix] check fixed list of keys in all_names as well 2016-12-18 17:26:43 -05:00
Al
2727572822 [addresses] using the name key distribution in AddressComponents.all_names. Returning names and valid components from the new function instead of the full gazetteer (can be built later) 2016-12-18 17:22:13 -05:00
Al
d308473686 [addresses] separating boundary phrase gazetteer construction into its own method 2016-12-18 15:47:20 -05:00
Al
846b88cde5 [addresses] let the place config take care of adding/removing neighborhoods rather than doing it as part of the add_neighborhoods method 2016-12-14 03:15:07 -05:00
Al
5946ead37f [addresses] using the defined component from the neighborhoods index for city_district (they're fairly rare, just NYC boroughs basically) 2016-12-14 03:10:07 -05:00
Al
5846943b70 [addresses] removing place_type override requirement from the neighborhoods index (NYC boroughs, etc.) 2016-12-14 02:16:57 -05:00
Al
40cd86c3be [addresses] only add city replacement if a city is not found first 2016-12-13 16:10:52 -05:00
Al
d158751d92 [addresses] same rules for state_district apply to state, no alt_names etc. unless a city is present 2016-12-12 05:31:32 -05:00
Al
da4fe37fb4 [addresses] option to add city points, no random keys for state_district if city or replacement is not present 2016-12-11 16:24:16 -05:00
Al
dfc88a47b2 [fix] typo 2016-12-11 02:46:03 -05:00
Al
e8abf44c16 [neighborhoods] check if there's no defined place-type before classifying a polygon as city_district 2016-12-11 02:44:02 -05:00
Al
fffc81a17a [fix] default value 2016-12-10 18:14:25 -05:00
Al
91982528c6 [fix] normalize place names after adding admin boundaries as well 2016-12-10 18:07:41 -05:00
Al
34d3ae7e9e [addresses] fixing normalized_place_name so it deals with things like Washington DC where Washington DC may actually be one of the OSM names 2016-12-10 17:52:38 -05:00
Al
4550f00f03 [fix] var name 2016-12-10 15:18:09 -05:00
Al
72771741c3 [fix] order 2016-12-10 15:16:35 -05:00
Al
8595d8da05 [addresses] don't add components to the trie that have the same normalized name as the given component 2016-12-10 15:12:40 -05:00
Al
ffc584f679 [states] adding all forms of the state abbreviation to the trie when doing place name normalization to handle the D.C./DC case 2016-12-10 13:45:22 -05:00
Al
5098599ed6 [addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now 2016-12-10 02:08:40 -05:00
Al
18c5fd0855 [fix] check for non-None city 2016-12-10 01:23:06 -05:00
Al
dc022f8652 [osm] adding normalized_place_name to Quattroshapes city 2016-12-10 01:20:40 -05:00
Al
c7b1818695 [fix] imports 2016-12-09 19:53:17 -05:00
Al
675552d254 [addresses] using normalized tokens when stripping off compound place names for things like D.C. 2016-12-09 17:52:57 -05:00
Al
f9945103ba [addresses] if suburb/city_district is already listed, and we're finding the closest city by point rather than by boundary, use the closest actual city, not something smaller like a village/hamlet 2016-12-08 02:39:27 -05:00
Al
7436d9693a [names] adding new name_affixes call to replace both prefixes/suffixes in one call, using in GeoPlanet training and the generic AddressComponents normalizations 2016-12-07 05:49:16 -05:00
Al
e13787a6f6 [fix] var name again 2016-12-05 18:49:23 -05:00