Commit Graph

571 Commits

Author SHA1 Message Date
Al
b2b7f6f155 [osm] add wikipedia:* to rail station exception 2017-01-02 13:13:42 -05:00
Al
6163dbae39 [osm/places] adding option to only format place tags for city and smaller admins, using for polygons as larger polys should be included elsewhere anyway 2016-12-27 03:37:15 -05:00
Al
8abbb273b2 [osm] adding the excellent ftfy (https://github.com/LuminosoInsight/python-ftfy) to fix Mojibake, etc. in address components 2016-12-26 21:18:14 -05:00
Al
e31ab33fe0 [fix] kwarg 2016-12-25 03:19:41 -05:00
Al
a185441ffa [osm] adding amenity=post_office to the generic place types (shouldn't be added as venue unless there's a known phrase in the name) 2016-12-25 03:14:07 -05:00
Al
11dc8c9f24 [fix] non-dict keys in OSM boundary configs 2016-12-25 00:49:57 -05:00
Al
e4e86261d1 [addresses/JP] just remove addr:neighborhood, addr:quarter, etc. in Japan as they're not applied consistently outside of cities 2016-12-24 20:13:48 -05:00
Al
9928d249a6 [addresses/JP] combining addr:quarter and addr:neighbourhood in Japan (based on info in https://wiki.openstreetmap.org/wiki/JA:%E4%BD%8F%E6%89%80) 2016-12-24 19:54:54 -05:00
Al
32e7637037 [fix] handle case where addr:conscriptionnumber exists but not addr:housenumber 2016-12-21 01:54:07 -05:00
Al
3b14613f1d [fix] restore original house number for subsequent formatting after addr:conscriptionnumber/addr:streetnumber 2016-12-21 00:51:44 -05:00
Al
484c7ef912 [osm] adding addresses with addr:conscriptionnumber and addr:streetnumber when available 2016-12-21 00:36:40 -05:00
Al
f2720db2f8 [osm] adding simple street name normalization for certain streets in OSM that also contain the house number (only when separated by commas and in a country/language where house number comes after street). There are other cases for normalization but need to better define them. 2016-12-19 02:13:44 -05:00
Al
bf3e9749ca [osm] during place formatting, add point-based cities for any places/polygons that are smaller than cities e.g. suburb or city_district, use admin_center as the point for reverse geocoding if available (instead of representative_point() which can be expensive or centroid which can be inaccurate) 2016-12-12 05:29:39 -05:00
Al
bb12d0940e [fix] options/docs in osm address training 2016-12-10 13:45:37 -05:00
Al
5098599ed6 [addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now 2016-12-10 02:08:40 -05:00
Al
8f30987bdf [fix] checking if building is a rail station 2016-12-09 02:57:47 -05:00
Al
da36b71829 [addresses] adding new places index in OSM and OpenAddresses training data 2016-12-05 18:36:17 -05:00
Al
e32c232c67 [localities] /planet-neighborhoods/planet-localities/ 2016-12-04 23:05:11 -05:00
Al
adab232674 [osm] don't include rail stations with no venue phrases (if there's a railway station at Foo, only include it if it's named "Foo Station", not just plain "Foo") 2016-12-01 02:03:38 -05:00
Al
cdbc102821 [boundaries] in addition to population, check if a city has an unambiguous Wikipedia 2016-11-25 13:36:49 -08:00
Al
5c3ccc3bc6 [places] better handling of population exceptions in places config 2016-11-25 00:38:49 -08:00
Al
ee8c070fd5 [osm] override admin_level with other components in config if present 2016-11-22 11:22:26 -05:00
Al
a596d03309 [fix] return values 2016-11-19 12:45:39 -05:00
Al
e15036fcce [fix] if there are street types that are not venue words and not vice versa, then call the venue invalid as a standalone term 2016-11-19 04:11:33 -05:00
Al
8e905fd17d [fix] if no venue names are passed in to formatted_addresses_with_venue_names, remove any existing venue name from the components as well 2016-11-19 03:46:16 -05:00
Al
e6fe576ec7 [fix] var 2016-11-19 03:15:23 -05:00
Al
1f50481cad [fix] args 2016-11-19 03:14:06 -05:00
Al
4d14f80f0c [osm] using the new gazetteer methods to do more thorough checks on single house names (if there are no other components than the standalone venue name, make sure it contains venue words like {library, bar}, etc. and not street type words like {road, street}, etc. so we don't get training examples that are simply "Abbey/house Road/house" with no house number or street name). If the venue name equals the street name or house number, drop it. Same if the venue name equals one of the admin components and no house number or street is present. If the venue name is numeric, require both a house number and a street name. 2016-11-19 03:12:24 -05:00
Al
8ef8d88186 [fix] don't short-circuit OSM address formatting unless there are no components and no venue names 2016-11-18 23:31:24 -05:00
Al
25ceeed6ef [fix] check before pop 2016-11-18 18:36:35 -05:00
Al
7a89c6e9ce [osm] removing dependencies for house/venue name (purely numeric names taken care of in osm formatter) 2016-11-18 18:32:44 -05:00
Al
00ebdfed7f [osm] adding alt_place_names to the shared formatting class AddressComponents and making them classmethods 2016-10-20 20:41:22 -04:00
Al
d9bc465c82 [osm] parsing out semicolon-delimited postal codes from OSM in countries like Poland that use hyphen-delimited postcodes, without treating them as number ranges 2016-10-19 17:46:42 -04:00
Al
ec77a247fa [fix] just ignore records without the "name" tag 2016-10-19 13:36:15 -04:00
Al
61078eded9 [fix] checking for dictionary key 2016-10-19 13:34:13 -04:00
Al
c2b73307de [fix] parens 2016-10-19 13:29:56 -04:00
Al
f639151698 [osm] checking for non-admin_center nodes which are part of a lower admin level polygon with the same name 2016-10-19 13:27:38 -04:00
Al
e380567ac4 [osm] adding alt_place_names method which does hyphenation, de-hyphenation and abbreviated toponyms with/without hyphens 2016-10-19 02:19:09 -04:00
Al
98ac232eea [osm] hyphenating and de-hyphenating place names in places training data 2016-10-19 00:33:10 -04:00
Al
7e007a49ab [osm] removing place=district mapping globally (means city_district in Hungary) and mapping it specifically to state_district/city_district in the places where it's needed 2016-10-18 19:02:36 -04:00
Al
d34faf42b8 [osm] fix names with pipes in them 2016-10-17 02:32:25 -04:00
Al
ff27ee14bb [osm] only add label props if the name property is identical (counterexample, Nottinghamshire's label is listed as West Bridgford, which is really its admin_center) 2016-10-16 22:18:52 -04:00
Al
6ff1024c02 [fix] null candidate languages 2016-10-07 19:49:32 -04:00
Al
169a3c3d70 [osm] drop postcode as well for address-only format 2016-10-07 01:10:16 -04:00
Al
0401a04adb [osm] add address-only formats (sans place tags) for every address as well to better handle incomplete queries where location is expected to be inferred by the geocoder, etc. 2016-10-07 00:59:52 -04:00
Al
a67efcffe4 [addresses] add new option to use city population to determine whether components should be dropped out 2016-10-05 18:16:25 -04:00
Al
66af532850 [osm] adding country-specific cleanups to OSM place training data 2016-10-05 17:13:13 -04:00
Al
2798420fdc [osm] add boundary=postal_district to admin borders for Ireland 2016-10-05 15:26:16 -04:00
Al
7b3a59878c [fix] bracket 2016-10-05 14:27:24 -04:00
Al
5744fc5a3c [fix] import 2016-10-05 03:23:34 -04:00
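
The venue-name checks described in commit 4d14f80f0c can be sketched roughly as follows. This is only an illustration of the rules as the commit message states them, not the project's actual code: the function name keep_venue_name, the component keys (house_number, road, city, state, country) and the tiny word sets are all hypothetical stand-ins for the gazetteer methods the commit refers to.

```python
# Hypothetical word lists; the commit message mentions gazetteer methods,
# which are assumed to cover far more phrases and languages than this.
VENUE_WORDS = {"library", "bar"}
STREET_WORDS = {"road", "street"}
# Assumed keys for admin components; the real component names may differ.
ADMIN_KEYS = ("city", "state", "country")


def _norm(value):
    """Lowercase and strip a component value, treating None as empty."""
    return (value or "").strip().lower()


def keep_venue_name(venue_name, components):
    """Return True if the venue name should be kept for a training example.

    `components` holds the other address components (not the venue name).
    """
    name = _norm(venue_name)
    if not name:
        return False

    house_number = _norm(components.get("house_number"))
    road = _norm(components.get("road"))

    # Drop the venue name if it just repeats the street or house number.
    if name in (house_number, road):
        return False

    # Numeric venue names need both a house number and a street name.
    if name.isdigit():
        return bool(house_number and road)

    # Drop it if it equals an admin component and there is no
    # house number or street to anchor it.
    admin_names = {_norm(components.get(k)) for k in ADMIN_KEYS}
    admin_names.discard("")
    if name in admin_names and not (house_number or road):
        return False

    # Standalone venue name: require venue words and no street type words.
    if not any(_norm(v) for v in components.values()):
        tokens = set(name.split())
        return bool(tokens & VENUE_WORDS) and not (tokens & STREET_WORDS)

    return True
```

Under these assumptions, a bare "Abbey Road" with no other components is rejected, while a bare "City Library" is kept.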