Commit Graph

548 Commits

Author SHA1 Message Date
Al
e15036fcce [fix] if there are street types that are not venue words and not vice versa, then call the venue invalid as a standalone term 2016-11-19 04:11:33 -05:00
Al
8e905fd17d [fix] if no venue names are passed in to formatted_addresses_with_venue_names, remove any existing venue name from the components as well 2016-11-19 03:46:16 -05:00
Al
e6fe576ec7 [fix] var 2016-11-19 03:15:23 -05:00
Al
1f50481cad [fix] args 2016-11-19 03:14:06 -05:00
Al
4d14f80f0c [osm] using the new gazetteer methods to do more thorough checks on single house names (if there are no other components than the standalone venue name, make sure it contains venue words like {library, bar}, etc. and not street type words like {road, street}, etc. so we don't get training examples that are simply "Abbey/house Road/house" with no house number or street name). If the venue name equals the street name or house number, drop it. Same if the venue name equals one of the admin components and no house number or street is present. If the venue name is numeric, require both a house number and a street name. 2016-11-19 03:12:24 -05:00
Al
8ef8d88186 [fix] don't short-circuit OSM address formatting unless there are no components and no venue names 2016-11-18 23:31:24 -05:00
Al
25ceeed6ef [fix] check before pop 2016-11-18 18:36:35 -05:00
Al
7a89c6e9ce [osm] removing dependencies for house/venue name (purely numeric names taken care of in osm formatter) 2016-11-18 18:32:44 -05:00
Al
00ebdfed7f [osm] adding alt_place_names to the shared formatting class AddressComponents and making them classmethods 2016-10-20 20:41:22 -04:00
Al
d9bc465c82 [osm] parsing out semicolon-delimited postal codes from OSM in countries like Poland that use hyphen delimited postcodes without treating them as number ranges 2016-10-19 17:46:42 -04:00
Al
ec77a247fa [fix] just ignore records without the "name" tag 2016-10-19 13:36:15 -04:00
Al
61078eded9 [fix] checking for dictionary key 2016-10-19 13:34:13 -04:00
Al
c2b73307de [fix] parens 2016-10-19 13:29:56 -04:00
Al
f639151698 [osm] checking for non-admin_center nodes which are part of a lower admin level polygon with the same name 2016-10-19 13:27:38 -04:00
Al
e380567ac4 [osm] adding alt_place_names method which does hyphenation, de-hyphenation and abbreviated toponyms with/without hyphens 2016-10-19 02:19:09 -04:00
Al
98ac232eea [osm] hyphenating and de-hyphenating place names in places training data 2016-10-19 00:33:10 -04:00
Al
7e007a49ab [osm] removing place=district mapping globally (means city_district in Hungary) and mapping it specifically to state_district/city_district in the places where it's needed 2016-10-18 19:02:36 -04:00
Al
d34faf42b8 [osm] fix names with pipes in them 2016-10-17 02:32:25 -04:00
Al
ff27ee14bb [osm] only add label props if the name property is identical (counterexample, Nottinghamshire's label is listed as West Bridgford, which is really its admin_center) 2016-10-16 22:18:52 -04:00
Al
6ff1024c02 [fix] null candidate languages 2016-10-07 19:49:32 -04:00
Al
169a3c3d70 [osm] drop postcode as well for address-only format 2016-10-07 01:10:16 -04:00
Al
0401a04adb [osm] add address-only formats (sans place tags) for every address as well to better handle incomplete queries where location is expected to be inferred by the geocoder, etc. 2016-10-07 00:59:52 -04:00
Al
a67efcffe4 [addresses] add new option to use city population to determine whether components should be dropped out 2016-10-05 18:16:25 -04:00
Al
66af532850 [osm] adding country-specific cleanups to OSM place training data 2016-10-05 17:13:13 -04:00
Al
2798420fdc [osm] add boundary=postal_district to admin borders for Ireland 2016-10-05 15:26:16 -04:00
Al
7b3a59878c [fix] bracket 2016-10-05 14:27:24 -04:00
Al
5744fc5a3c [fix] import 2016-10-05 03:23:34 -04:00
Al
70a5ded45c [fix] encode element id 2016-10-05 03:14:19 -04:00
Al
432f9dd42e [fix] format of candidate_languages in the new OSM rtree 2016-10-05 03:12:07 -04:00
Al
faf418decb [languages] using country_and_languages method in OSM, neighborhoods and OpenAddresses 2016-10-05 02:49:55 -04:00
Al
b1cd7fdc4a [osm] adding type/id to properties dict earlier in the pipeline 2016-10-05 00:53:32 -04:00
Al
6081df0cd1 [osm] adding admin1 ids to the OSM country rtree 2016-10-04 23:12:15 -04:00
Al
aea67c0769 [osm] adding admin1 exceptions to the country polygons 2016-10-04 18:33:52 -04:00
Al
5d7405b2fd [osm] country and postal code polygon readers 2016-10-01 01:11:35 -04:00
Al
c77e36deab [osm] Prevent user-defined lat/lon keys from overriding the lat/lon on the node 2016-10-01 00:38:13 -04:00
Al
cd9fe4eb7b [boundaries] Adding option to still check for global overrides but only if nothing else was found using admin_level, etc. Updating South Korea and adding this option to Luxembourg. 2016-09-24 15:36:03 -04:00
Al
d66ea835b1 [fix] allowing latitude 90 for validation purposes (North Pole) 2016-09-23 01:28:13 -04:00
Al
ca5bcba85e [osm] set -e so script errors out if anything fails and add --quiet to wget for I/O redirection purposes 2016-09-22 00:52:00 -04:00
Al
764a74fae4 [osm] overwrite downloaded files 2016-09-19 03:21:37 -04:00
Al
c3afcdfce5 [osm] expanding criteria for the buildings data set (buildings with addr:housenumber, addr:housename, addr:street, or addr:postcode are useful) 2016-09-19 03:15:07 -04:00
Al
cb3fe5273a [components] using gnis:class=Populated Place to map to city for the US when admin_level is not specified and the place key is not specified/not mapped 2016-09-07 11:49:44 -04:00
Al
ec24c4b6ac [boundaries] removing the place=borough mapping because it's used on the east coast and PA to mean city 2016-09-07 11:24:37 -04:00
Al
db8f5b717c [boundaries] adding use_admin_center to boundary configs right alongside other overrides 2016-09-02 02:00:18 -04:00
Al
e4e35d0593 [osm] adding no_global_overrides option for boundary configs 2016-08-30 12:44:24 -04:00
Al
da619e3cf4 [osm] Adding border_type=city to override tags 2016-08-25 15:21:33 -04:00
Al
d281e71d2c [fix] removing metro station index as a dependency for AddressComponents 2016-08-22 15:52:27 -04:00
Al
85ae5d4a05 [fix] name 2016-08-19 23:38:33 -04:00
Al
7951044d74 [intersections] Abbreviating street names that are not base names with random probabilities 2016-08-19 23:27:29 -04:00
Al
42808c62e3 [fix] dictionary access 2016-08-19 16:02:36 -04:00
Al
41f715d6ee [intersections] Better handling of default languages in intersection queries 2016-08-19 15:59:58 -04:00
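
The venue-name checks described in the message for commit 4d14f80f0c can be read as a small decision procedure. The following is a minimal sketch of those rules only, not the project's actual code: the function name `keep_venue_name`, the word sets, and the component keys (`house_number`, `road`, `city`, and so on) are all hypothetical stand-ins for the gazetteer lookups the commit refers to.

```python
# Hypothetical word lists; the commit refers to gazetteer-based checks instead.
VENUE_WORDS = {"library", "bar", "museum", "hotel"}
STREET_TYPE_WORDS = {"road", "street", "avenue", "lane"}
# Hypothetical keys for admin components.
ADMIN_KEYS = ("suburb", "city", "state", "country")


def keep_venue_name(name, components):
    """Return True if the venue name should be kept as a training example.

    `components` is assumed to be a dict of string values keyed by
    component name (an assumption made for this sketch).
    """
    norm = name.strip().lower()
    if not norm:
        return False

    house_number = components.get("house_number", "").strip().lower()
    street = components.get("road", "").strip().lower()

    # Drop a venue name that equals the street name or house number.
    if norm in (street, house_number):
        return False

    # Drop a venue name that equals an admin component when there is
    # no house number or street to anchor it.
    admin_values = {components.get(k, "").strip().lower() for k in ADMIN_KEYS}
    if not house_number and not street and norm in admin_values:
        return False

    # A purely numeric venue name needs both a house number and a street.
    if norm.isdigit():
        return bool(house_number and street)

    # A standalone venue name (no other components) must contain a venue
    # word and must not contain a street type word.
    has_other_components = any(v.strip() for v in components.values())
    if not has_other_components:
        tokens = set(norm.split())
        return bool(tokens & VENUE_WORDS) and not (tokens & STREET_TYPE_WORDS)

    return True
```

Under these assumptions, `keep_venue_name("Abbey Road", {})` rejects the bare street-like name, while `keep_venue_name("City Library", {})` keeps it.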