601 Commits

Author SHA1 Message Date
Al
01d6d47b08 [osm] removing addr:place mapping to road as it's usually a village in post-Soviet states, etc. Can handle it down the road 2017-01-27 13:54:08 -05:00
Al
11345bf2bf [osm] using new constants in OSM formatting as well 2017-01-27 13:53:00 -05:00
Al
bc748b6d62 [addresses] supplying country arg when stripping name affixes both for OSM place-based data sets (ways, localities) and OpenAddresses (shouldn't affect any of the countries currently in OA though) 2017-01-23 23:30:33 -05:00
Al
a931c5ddc9 [osm] checking for valid street names in OSM street-only training data so e.g. the street name is not just a simple number like "831" 2017-01-19 02:34:29 -05:00
Al
05568194aa [fix] var initialization II 2017-01-18 01:54:18 -05:00
Al
b19ab0ae48 [fix] var initialization 2017-01-18 01:48:02 -05:00
Al
d498fa893c [fix] name 2017-01-16 22:15:25 -05:00
Al
8566cb4054 [addresses] refactoring place component cleanup into a method that can be reused with the place and ways training data 2017-01-16 20:43:55 -05:00
Al
35dbce59d2 [osm] base case for default_language, applying the ways/relations requirement again as the nodes are mostly motorway_junction and can often be just a city name, etc. 2017-01-16 19:10:27 -05:00
Al
96a98fc63c [fix] var name II 2017-01-16 18:57:29 -05:00
Al
582d042e95 [fix] var name 2017-01-16 18:56:20 -05:00
Al
b28728b017 [fix] tuple 2017-01-16 18:53:40 -05:00
Al
42b0a4cf68 [fix] var name 2017-01-16 18:46:08 -05:00
Al
4902e88b81 [fix] formatted OSM ways training data should use nodes as well as ways/relations 2017-01-16 18:39:53 -05:00
Al
449154d624 [fix] arg 2017-01-16 15:34:38 -05:00
Al
be763539d3 [fix] remove var 2017-01-16 15:31:26 -05:00
Al
8c92013c43 [fix] args to way_names 2017-01-16 15:29:16 -05:00
Al
934f6247c6 [osm] options to build the streets-only training data 2017-01-16 15:26:04 -05:00
Al
a0150f37d0 [osm] better lat/lon conversion for admin_center point 2017-01-14 17:48:37 -05:00
Al
c7e644ca51 [fix] validating number ranges in extract_valid_postcodes as well 2017-01-12 14:09:33 -05:00
Al
59ed268558 [osm] require name tag for formatted places 2017-01-12 13:00:07 -05:00
Al
b90d88db3e [fix] import 2017-01-12 12:08:40 -05:00
Al
ba0f097d78 [boundaries] adding check for valid name key in formatted places, and removing short_name from the Sao Paulo relation as well 2017-01-12 12:05:42 -05:00
Al
122d7b2b79 [fix] only using the revised address components for CLDR country name 2017-01-12 02:33:16 -05:00
Al
88a80f4e30 [fix] using normalized tags throughout in OSM formatted place data 2017-01-12 02:25:17 -05:00
Al
bec569adaa [osm] adding new validity check to venue names so if the Jaccard(name tokens, street & house number tokens) == 1 and the address does not have a known venue type e.g. a restaurant, the "venue name" is actually just the street address and can be discarded 2017-01-11 16:23:42 -05:00
Al
7f851810d2 [addresses] formatting addresses in Brasilia, so e.g. "Bloco B" is never part of the street name or building name, it's the house number. place=neighbourhood maps to nothing in Brasilia as these are basically subdivisions whose streets are identically named 2017-01-11 16:18:04 -05:00
Al
0d030a98c5 [osm] adding airport polygon index 2017-01-11 04:25:54 -05:00
Al
979fd16215 [osm] adding airports and terminals data sets with points and polygons, more file cleanup in OSM fetch script 2017-01-10 16:20:32 -05:00
Al
828b67d4f7 [osm] adding some new training data for simple road names and their surrounding admin boundaries 2017-01-07 15:34:43 -05:00
Al
b2b7f6f155 [osm] add wikipedia:* to rail station exception 2017-01-02 13:13:42 -05:00
Al
6163dbae39 [osm/places] adding option to only format place tags for city and smaller admins, using for polygons as larger polys should be included elsewhere anyway 2016-12-27 03:37:15 -05:00
Al
8abbb273b2 [osm] adding the excellent ftfy (https://github.com/LuminosoInsight/python-ftfy) to fix Mojibake, etc. in address components 2016-12-26 21:18:14 -05:00
Al
e31ab33fe0 [fix] kwarg 2016-12-25 03:19:41 -05:00
Al
a185441ffa [osm] adding amenity=post_office to the generic place types (shouldn't be added as venue unless there's a known phrase in the name) 2016-12-25 03:14:07 -05:00
Al
11dc8c9f24 [fix] non-dict keys in OSM boundary configs 2016-12-25 00:49:57 -05:00
Al
e4e86261d1 [addresses/JP] just remove addr:neighborhood, addr:quarter, etc. in Japan as they're not applied consistently outside of cities 2016-12-24 20:13:48 -05:00
Al
9928d249a6 [addresses/JP] combining addr:quarter and addr:neighbourhood in Japan (based on info in https://wiki.openstreetmap.org/wiki/JA:%E4%BD%8F%E6%89%80) 2016-12-24 19:54:54 -05:00
Al
32e7637037 [fix] handle case where addr:conscriptionnumber exists but not addr:housenumber 2016-12-21 01:54:07 -05:00
Al
3b14613f1d [fix] restore original house number for subsequent formatting after addr:conscriptionnumber/addr:streetnumber 2016-12-21 00:51:44 -05:00
Al
484c7ef912 [osm] adding addresses with addr:conscriptionnumber and addr:streetnumber when available 2016-12-21 00:36:40 -05:00
Al
f2720db2f8 [osm] adding simple street name normalization for certain streets in OSM that also contain the house number (only when separated by commas and in a country/language where house number comes after street). There are other cases for normalization but need to better define them. 2016-12-19 02:13:44 -05:00
Al
bf3e9749ca [osm] during place formatting, add point-based cities for any places/polygons that are smaller than cities e.g. suburb or city_district, use admin_center as the point for reverse geocoding if available (instead of representative_point() which can be expensive or centroid which can be inaccurate) 2016-12-12 05:29:39 -05:00
Al
bb12d0940e [fix] options/docs in osm address training 2016-12-10 13:45:37 -05:00
Al
5098599ed6 [addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now 2016-12-10 02:08:40 -05:00
Al
8f30987bdf [fix] checking if building is a rail station 2016-12-09 02:57:47 -05:00
Al
da36b71829 [addresses] adding new places index in OSM and OpenAddresses training data 2016-12-05 18:36:17 -05:00
Al
e32c232c67 [localities] /planet-neighborhoods/planet-localities/ 2016-12-04 23:05:11 -05:00
Al
adab232674 [osm] don't include rail stations with no venue phrases (if there's a railway station at Foo, only include it if it's named "Foo Station", not just plain "Foo") 2016-12-01 02:03:38 -05:00
Al
cdbc102821 [boundaries] in addition to population, check if a city has an unambiguous Wikipedia 2016-11-25 13:36:49 -08:00