Commit Graph

166 Commits

Author SHA1 Message Date
Al
8e905fd17d [fix] if no venue names are passed in to formatted_addresses_with_venue_names, remove any existing venue name from the components as well 2016-11-19 03:46:16 -05:00
Al
e6fe576ec7 [fix] var 2016-11-19 03:15:23 -05:00
Al
1f50481cad [fix] args 2016-11-19 03:14:06 -05:00
Al
4d14f80f0c [osm] using the new gazetteer methods to do more thorough checks on single house names (if there are no other components than the standalone venue name, make sure it contains venue words like {library, bar}, etc. and not street type words like {road, street}, etc. so we don't get training examples that are simply "Abbey/house Road/house" with no house number or street name). If the venue name equals the street name or house number, drop it. Same if the venue name equals one of the admin components and no house number or street is present. If the venue name is numeric, require both a house number and a street name. 2016-11-19 03:12:24 -05:00
Al
8ef8d88186 [fix] don't short-circuit OSM address formatting unless there are no components and no venue names 2016-11-18 23:31:24 -05:00
Al
25ceeed6ef [fix] check before pop 2016-11-18 18:36:35 -05:00
Al
7a89c6e9ce [osm] removing dependencies for house/venue name (purely numeric names taken care of in osm formatter) 2016-11-18 18:32:44 -05:00
Al
00ebdfed7f [osm] adding alt_place_names to the shared formatting class AddressComponents and making them classmethods 2016-10-20 20:41:22 -04:00
Al
d9bc465c82 [osm] parsing out semicolon-delimited postal codes from OSM in countries like Poland that use hyphen-delimited postcodes, without treating them as number ranges 2016-10-19 17:46:42 -04:00
Al
ec77a247fa [fix] just ignore records without the "name" tag 2016-10-19 13:36:15 -04:00
Al
61078eded9 [fix] checking for dictionary key 2016-10-19 13:34:13 -04:00
Al
c2b73307de [fix] parens 2016-10-19 13:29:56 -04:00
Al
f639151698 [osm] checking for non-admin_center nodes which are part of a lower admin level polygon with the same name 2016-10-19 13:27:38 -04:00
Al
e380567ac4 [osm] adding alt_place_names method which does hyphenation, de-hyphenation and abbreviated toponyms with/without hyphens 2016-10-19 02:19:09 -04:00
Al
98ac232eea [osm] hyphenating and de-hyphenating place names in places training data 2016-10-19 00:33:10 -04:00
Al
d34faf42b8 [osm] fix names with pipes in them 2016-10-17 02:32:25 -04:00
Al
6ff1024c02 [fix] null candidate languages 2016-10-07 19:49:32 -04:00
Al
169a3c3d70 [osm] drop postcode as well for address-only format 2016-10-07 01:10:16 -04:00
Al
0401a04adb [osm] add address-only formats (sans place tags) for every address as well to better handle incomplete queries where location is expected to be inferred by the geocoder, etc. 2016-10-07 00:59:52 -04:00
Al
a67efcffe4 [addresses] add new option to use city population to determine whether components should be dropped out 2016-10-05 18:16:25 -04:00
Al
66af532850 [osm] adding country-specific cleanups to OSM place training data 2016-10-05 17:13:13 -04:00
Al
faf418decb [languages] using country_and_languages method in OSM, neighborhoods and OpenAddresses 2016-10-05 02:49:55 -04:00
Al
85ae5d4a05 [fix] name 2016-08-19 23:38:33 -04:00
Al
7951044d74 [intersections] Abbreviating street names that are not base names with random probabilities 2016-08-19 23:27:29 -04:00
Al
42808c62e3 [fix] dictionary access 2016-08-19 16:02:36 -04:00
Al
41f715d6ee [intersections] Better handling of default languages in intersection queries 2016-08-19 15:59:58 -04:00
Al
a7118b40a7 [intersections] Allowing tags like name_1, etc. to make it into road name permutations for intersections 2016-08-19 13:12:02 -04:00
Al
0b2d3d965f [fix] using lat/lon from the node properties in intersections data 2016-08-19 12:23:08 -04:00
Al
688f103e80 [fix] languages 2016-08-18 02:24:34 -04:00
Al
e3ac3200b3 [fix] disambiguating languages using one of the default street names in intersections data 2016-08-18 02:05:13 -04:00
Al
328398813a [fix] itertools.combinations 2016-08-18 01:26:48 -04:00
Al
737cbf4457 [fix] reference before assignment 2016-08-18 01:24:30 -04:00
Al
b41ba7374b [intersections] intersections training data, using a Cartesian product of all names in the same language, including something like tiger:name_base 2016-08-18 01:19:14 -04:00
Al
701bcb1d79 [intersections] Using name cleanup on intersections, including tiger:name_base which sometimes has semicolon delimiters as well 2016-08-17 18:47:07 -04:00
Al
145af9331e [osm] build OSM training data for intersections using the JSON output from intersections.py rather than having to compute it each time 2016-08-17 18:11:55 -04:00
Al
7ff0cb2704 [fix] name and a few things for intersections data 2016-08-15 21:26:54 -04:00
Al
7ab6af4335 [fix] bounds 2016-08-15 12:01:22 -04:00
Al
060d3a1f86 [fix] var name 2016-08-15 11:18:00 -04:00
Al
29fc198aba [osm] giving parse_osm_number_range a parameter for max range and setting it to 1000 for postal codes e.g. for major cities that may have several hundred postal codes 2016-08-15 10:34:24 -04:00
Al
637baad629 [osm] Adding at least min_references entries for every selected postcode 2016-08-15 10:30:28 -04:00
Al
aa6b9cd858 [fix] var name for place tags coming from the admin rtree 2016-08-15 10:25:19 -04:00
Al
bc8acb196c [osm] Pulling valid postal codes out into a method 2016-08-13 01:49:26 -04:00
Al
b993e9a163 [fix] add Japanese-language variant if metro station is added 2016-08-06 21:17:14 -04:00
Al
39bd562d04 [addresses] only set language if we needed it for Japanese house_numbers 2016-08-06 21:06:01 -04:00
Al
e68fee7c68 [fix] null check 2016-08-06 20:39:28 -04:00
Al
374c46ada5 [fix] metro station properties 2016-08-06 19:56:13 -04:00
Al
195278cfea [osm] Reverse geocoding to metro station only for addresses in Japan 2016-08-06 19:50:18 -04:00
Al
964728a02d [fix] block phrases for Japanese and namespaced language handling in case Romaji is chosen before normalization 2016-08-06 14:50:39 -04:00
Al
445e8082c8 [addresses] Adding per-country overrides for address component dependencies 2016-08-06 02:36:47 -04:00
Al
813f29f299 [osm] Removing the call to normalize_place_names in place data formatting as we should be able to trust the places more than the addresses 2016-08-02 16:29:34 -04:00
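
The venue-name rules spelled out in the message of commit 4d14f80f0c above can be illustrated with a small sketch. This is not the project's code: the function name keep_venue_name, the tiny word sets, the component keys ("road", "house_number", "city", and so on) and the use of str.isdigit for "numeric" are all assumptions made for illustration. The commit itself says it relies on gazetteer methods rather than hard-coded sets.

```python
# Hypothetical stand-ins for the gazetteer lookups mentioned in the commit.
VENUE_WORDS = {"library", "bar"}
STREET_WORDS = {"road", "street"}
# Hypothetical keys for admin components.
ADMIN_KEYS = ("suburb", "city", "state", "country")


def keep_venue_name(name, components):
    """Return True if a venue name should be kept, per the rules in the commit message.

    `components` is assumed to be a dict of string values keyed by component name.
    """
    norm = name.strip().lower()
    if not norm:
        return False
    tokens = set(norm.split())

    house_number = components.get("house_number", "").strip().lower()
    road = components.get("road", "").strip().lower()

    # Rule 1: a standalone name must look like a venue, not like a street.
    if not any(value.strip() for value in components.values()):
        return bool(tokens & VENUE_WORDS) and not (tokens & STREET_WORDS)

    # Rule 2: drop the name if it merely repeats the street or house number.
    if norm in (road, house_number):
        return False

    # Rule 3: drop it if it repeats an admin component and there is
    # no house number or street to anchor it.
    admin_values = {components.get(key, "").strip().lower() for key in ADMIN_KEYS}
    admin_values.discard("")
    if norm in admin_values and not (house_number or road):
        return False

    # Rule 4: a purely numeric name needs both a house number and a street.
    if norm.isdigit():
        return bool(house_number and road)

    return True
```

Under these assumptions, a bare "Abbey Road" with no other components is rejected because it contains a street word and no venue word, while "City Library" on its own is kept.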