618 Commits

Author SHA1 Message Date
Al
fc91471434 [osm/boundaries] check polygons with an ISO3166-2 as well in the country polygon index in case the country polygon is funky 2017-04-09 02:15:46 -04:00
Al
5d73aa1295 [fix] don't write formatted addresses in the ways-only data set unless the formatter returns non-None value 2017-03-05 03:50:00 -05:00
Al
be6f48f109 [fix] that didn't work, set log level to CRITICAL 2017-02-15 14:06:57 -05:00
Al
26bf617a06 [fix] prevent Shapely from logging to console 2017-02-15 14:00:51 -05:00
Al
6d580f4c87 [osm] neighborhood polygon reader 2017-02-14 01:50:04 -05:00
Al
2003e08623 [osm] creating an OSM neighborhood boundaries data set for place=neighbourhood polygons only (place=suburb, etc. can be ambiguous) 2017-02-13 20:45:54 -05:00
Al
e569956944 [osm] remove postcode field if more than one is found 2017-02-11 03:52:46 -05:00
Al
2dff6c8839 [fix] call 2017-02-11 02:16:55 -05:00
Al
7bfc52b540 [osm] add postcode phrases when there's no validation/component-stripping 2017-02-11 01:54:55 -05:00
Al
f6bea5ebe5 [fix] always validate in comma-separated postcodes 2017-02-11 01:42:19 -05:00
Al
3d9b512cda [fix] pop 2017-02-11 01:28:56 -05:00
Al
e5a98d16d8 [fix] args 2017-02-11 01:15:07 -05:00
Al
01c4c8ec82 [fix] scope 2017-02-11 01:07:20 -05:00
Al
ffc12ec5ab [osm] add new method in OSM formatting to extract one or more expanded postal codes from an addr:postcode tag, using the new country-specific rules 2017-02-11 00:53:52 -05:00
Al
4e1d7d9373 [osm] use new postal codes module in OSM formatting 2017-02-10 23:56:23 -05:00
Al
293587bae9 [addresses] adding new config for postal codes around the world. Allows appending the ISO alpha-2 country code to the beginning of the postcode as in e.g. SI-1000 (only used if the postcode begins with a digit). This system was used for postal codes in continental Europe as a recommendation from the CEPT. Now 7 member states still use it, so in those countries add the country-code with higher probability. The config also contains the license plate codes for countries where e.g. L-1234 might be used instead of LU-1234. Allows configuring in which countries postcodes should be validated using Google's per-country validation regexes (and the ability to override with a custom regex), and in which countries other admin component names should be stripped. 2017-02-10 23:53:50 -05:00
Al
7a360f4211 [osm] addr:postcode can be all over the place in OSM. Start with postcodes containing commas or semicolons. If addr:postcode (on address of building) contains either, iterate over the values and pick the first one that matches a postcode validation regex for that country 2017-02-08 16:13:29 -05:00
Al
01d6d47b08 [osm] removing addr:place mapping to road as it's usually a village in post-Soviet states, etc. Can handle it down the road 2017-01-27 13:54:08 -05:00
Al
11345bf2bf [osm] using new constants in OSM formatting as well 2017-01-27 13:53:00 -05:00
Al
bc748b6d62 [addresses] supplying country arg when stripping name affixes both for OSM place-based data sets (ways, localities) and OpenAddresses (shouldn't affect any of the countries currently in OA though) 2017-01-23 23:30:33 -05:00
Al
a931c5ddc9 [osm] checking for valid street names in OSM street-only training data so e.g. the street name is not just a simple number like "831" 2017-01-19 02:34:29 -05:00
Al
05568194aa [fix] var initialization II 2017-01-18 01:54:18 -05:00
Al
b19ab0ae48 [fix] var initialization 2017-01-18 01:48:02 -05:00
Al
d498fa893c [fix] name 2017-01-16 22:15:25 -05:00
Al
8566cb4054 [addresses] refactoring place component cleanup into a method that can be reused with the place and ways training data 2017-01-16 20:43:55 -05:00
Al
35dbce59d2 [osm] base case for default_language, applying the ways/relations requirement again as the nodes are mostly motorway_junction and can often be just a city name, etc. 2017-01-16 19:10:27 -05:00
Al
96a98fc63c [fix] var name II 2017-01-16 18:57:29 -05:00
Al
582d042e95 [fix] var name 2017-01-16 18:56:20 -05:00
Al
b28728b017 [fix] tuple 2017-01-16 18:53:40 -05:00
Al
42b0a4cf68 [fix] var name 2017-01-16 18:46:08 -05:00
Al
4902e88b81 [fix] formatted OSM ways training data should use nodes as well as ways/relations 2017-01-16 18:39:53 -05:00
Al
449154d624 [fix] arg 2017-01-16 15:34:38 -05:00
Al
be763539d3 [fix] remove var 2017-01-16 15:31:26 -05:00
Al
8c92013c43 [fix] args to way_names 2017-01-16 15:29:16 -05:00
Al
934f6247c6 [osm] options to build the streets-only training data 2017-01-16 15:26:04 -05:00
Al
a0150f37d0 [osm] better lat/lon conversion for admin_center point 2017-01-14 17:48:37 -05:00
Al
c7e644ca51 [fix] validating number ranges in extract_valid_postcodes as well 2017-01-12 14:09:33 -05:00
Al
59ed268558 [osm] require name tag for formatted places 2017-01-12 13:00:07 -05:00
Al
b90d88db3e [fix] import 2017-01-12 12:08:40 -05:00
Al
ba0f097d78 [boundaries] adding check for valid name key in formatted places, and removing short_name from the Sao Paulo relation as well 2017-01-12 12:05:42 -05:00
Al
122d7b2b79 [fix] only using the revised address components for CLDR country name 2017-01-12 02:33:16 -05:00
Al
88a80f4e30 [fix] using normalized tags throughout in OSM formatted place data 2017-01-12 02:25:17 -05:00
Al
bec569adaa [osm] adding new validity check to venue names so if the Jaccard(name tokens, street & house number tokens) == 1 and the address does not have a known venue type e.g. a restaurant, the "venue name" is actually just the street address and can be discarded 2017-01-11 16:23:42 -05:00
Al
7f851810d2 [addresses] formatting addresses in Brasilia, so e.g. "Bloco B" is never part of the street name or building name, it's the house number. place=neighbourhood maps to nothing in Brasilia as these are basically subdivisions whose streets are identically named 2017-01-11 16:18:04 -05:00
Al
0d030a98c5 [osm] adding airport polygon index 2017-01-11 04:25:54 -05:00
Al
979fd16215 [osm] adding airports and terminals data sets with points and polygons, more file cleanup in OSM fetch script 2017-01-10 16:20:32 -05:00
Al
828b67d4f7 [osm] adding some new training data for simple road names and their surrounding admin boundaries 2017-01-07 15:34:43 -05:00
Al
b2b7f6f155 [osm] add wikipedia:* to rail station exception 2017-01-02 13:13:42 -05:00
Al
6163dbae39 [osm/places] adding option to only format place tags for city and smaller admins, using for polygons as larger polys should be included elsewhere anyway 2016-12-27 03:37:15 -05:00
Al
8abbb273b2 [osm] adding the excellent ftfy (https://github.com/LuminosoInsight/python-ftfy) to fix Mojibake, etc. in address components 2016-12-26 21:18:14 -05:00
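
Commit 7a360f4211 describes handling addr:postcode values that contain commas or semicolons by iterating over the parts and keeping the first one that matches the country's validation regex. A minimal sketch of that idea follows. The function name, the POSTCODE_PATTERNS table, and the simplified regexes are assumptions for illustration only; they are not the repository's actual config or Google's per-country patterns.

```python
import re

# Hypothetical, simplified patterns for illustration only.
POSTCODE_PATTERNS = {
    'us': re.compile(r'^\d{5}(?:-\d{4})?$'),
    'de': re.compile(r'^\d{5}$'),
}

# Split on the separators named in the commit message.
SEPARATORS = re.compile(r'[,;]')


def first_valid_postcode(value, country):
    """Return the first comma/semicolon-separated part of value that
    matches the country's pattern, or None if nothing matches."""
    pattern = POSTCODE_PATTERNS.get(country)
    if pattern is None:
        return None
    for candidate in SEPARATORS.split(value):
        candidate = candidate.strip()
        if candidate and pattern.match(candidate):
            return candidate
    return None
```

Returning None when no part validates leaves the decision to the caller, which could then drop the postcode field as in commit e569956944.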
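
Commit bec569adaa describes discarding a venue name when the Jaccard similarity between its tokens and the street plus house number tokens equals 1 and the address has no known venue type. The sketch below shows one way such a check could look. The tokenizer, the function names, and the has_known_venue_type flag are assumptions, not the repository's code.

```python
import re


def tokens(text):
    """Lowercased set of word tokens (a simplified, assumed tokenizer)."""
    return set(re.findall(r'\w+', text.lower()))


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return float(len(a & b)) / len(union)


def venue_name_is_street_address(name, house_number, street,
                                 has_known_venue_type):
    """True when the "venue name" is just the street address and the
    record has no known venue type (e.g. a restaurant)."""
    if has_known_venue_type:
        return False
    address_tokens = tokens(house_number) | tokens(street)
    return jaccard(tokens(name), address_tokens) == 1.0
```

Because the comparison is on token sets, word order and repeated tokens do not matter: "Main St 123" and "123 Main St" are treated as identical.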
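
Commit 293587bae9 describes prepending the ISO alpha-2 country code to a postcode, as in SI-1000, only when the postcode begins with a digit, and doing so with higher probability in countries that still use the convention. A rough sketch under those stated rules is below. The function name, the probability parameter, and the injectable rng are assumptions; the actual config format is not shown in the log.

```python
import random


def add_country_prefix(postcode, country_code, probability, rng=random):
    """Prepend e.g. "SI-" to a postcode with the given probability.

    Only applies when the postcode begins with a digit, as stated in the
    commit message. probability and rng are hypothetical parameters.
    """
    if not postcode or not postcode[0].isdigit():
        return postcode
    if rng.random() < probability:
        return '{}-{}'.format(country_code.upper(), postcode)
    return postcode
```

A caller could pass a license plate code such as "L" instead of "LU" as country_code to produce the L-1234 variant mentioned in the same commit.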
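
Commit a931c5ddc9 mentions rejecting street names in the street-only training data when the name is just a simple number like "831". The log does not show the full set of validity rules, so the sketch below covers only that one stated case; the function name is hypothetical.

```python
import re

# Matches names made up entirely of digits, e.g. "831".
NUMERIC_ONLY = re.compile(r'^\d+$')


def is_valid_street_name(name):
    """False for empty names and names that are only a number."""
    name = name.strip()
    return bool(name) and not NUMERIC_ONLY.match(name)
```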