3978 Commits

Author SHA1 Message Date
Al
e48f207d10 [openaddresses] updating with new OpenAddresses sources 2016-10-27 11:19:30 -04:00
Al
5cabd9b4f7 [fix] country languages in OpenAddresses 2016-10-24 17:35:39 -04:00
Al
ac0eb1776e [openaddresses] adding Brazoria County, TX 2016-10-24 09:27:11 -04:00
Al
35d3d8cc73 [openaddresses] countries are known a priori, so if the boundaries don't quite line up with OSM, use the country from the path 2016-10-23 19:50:54 -04:00
Al
f429bea15b [fix] subtract abs value 2016-10-23 01:11:09 -04:00
Al
1658c425c5 [fix] clear country cache only at each new country, not each file 2016-10-23 00:57:52 -04:00
Al
7199ff17e0 [fix] truncate postcodes that are longer than specified length 2016-10-23 00:52:24 -04:00
Al
3934111cdf [openaddresses] 5-digit postcodes in a few counties 2016-10-23 00:51:43 -04:00
Al
889e914dfc [openaddresses] clear all polygon caches 2016-10-23 00:11:54 -04:00
Al
0fd431a9d2 [fix] abs 2016-10-22 23:55:30 -04:00
Al
ec54d3de35 [fix] don't convert number to int/float in numeric_phrase (chops leading zeros) 2016-10-22 23:49:58 -04:00
Al
63edd53fb3 [openaddresses] adding clear_cache method to clear the LRU cache for point-in-polygon indices and using it in OpenAddresses import since it heavily reuses polygons and only for the current file 2016-10-22 20:28:59 -04:00
Al
d51a1d6196 [addresses] doing hyphenation for existing components in component expansion (i.e. OSM training data) 2016-10-21 22:02:19 -04:00
Al
0216a991c6 [formatting] use US template insertions for Canada as well 2016-10-21 14:43:40 -04:00
Al
2a355b2cf8 [openaddresses] adding address only 10% of the time in OpenAddresses 2016-10-20 23:57:30 -04:00
Al
dfbc4bf144 [openaddresses] no add_osm_boundaries for two of the recent Washington editions, only reverse geocode to OSM when no city is given 2016-10-20 22:46:29 -04:00
Al
d965ea9371 [openaddresses] adding hyphenation/dehyphenation to the OpenAddresses formatter 2016-10-20 20:55:17 -04:00
Al
00ebdfed7f [osm] adding alt_place_names to the shared formatting class AddressComponents and making them classmethods 2016-10-20 20:41:22 -04:00
Al
d9bc465c82 [osm] parsing out semicolon-delimited postal codes from OSM in countries like Poland that use hyphen delimited postcodes without treating them as number ranges 2016-10-19 17:46:42 -04:00
Al
91e6ca0942 [osm] adding a number of Australian city council boundaries 2016-10-19 16:34:04 -04:00
Al
cec8168279 [osm] adding council and city council to ignorable place name suffixes 2016-10-19 16:33:04 -04:00
Al
ec77a247fa [fix] just ignore records without the "name" tag 2016-10-19 13:36:15 -04:00
Al
61078eded9 [fix] checking for dictionary key 2016-10-19 13:34:13 -04:00
Al
c2b73307de [fix] parens 2016-10-19 13:29:56 -04:00
Al
f639151698 [osm] checking for non-admin_center nodes which are part of a lower admin level polygon with the same name 2016-10-19 13:27:38 -04:00
Al
e380567ac4 [osm] adding alt_place_names method which does hyphenation, de-hyphenation and abbreviated toponyms with/without hyphens 2016-10-19 02:19:09 -04:00
Al
51afc2619b [fix] only replace whitespace between words, not for instance whitespace around an existing hyphen, and reducing to one space for spaced hyphens 2016-10-19 01:24:54 -04:00
Al
78f341f4f1 [osm] higher probability of hyphenation 2016-10-19 01:11:41 -04:00
Al
e8899eafd6 [osm] adding hyphenation/de-hyphenation to OSM admin components 2016-10-19 01:00:29 -04:00
Al
98ac232eea [osm] hyphenating and de-hyphenating place names in places training data 2016-10-19 00:33:10 -04:00
Al
562caba31c [openaddresses] adding new counties in Washington 2016-10-19 00:30:50 -04:00
Al
72e7d3ff5b [addresses/hyphens] adding some methods to hyphenate/dehyphenate place names at random 2016-10-18 19:10:31 -04:00
Al
7e007a49ab [osm] removing place=district mapping globally (means city_district in Hungary) and mapping it specifically to state_district/city_district in the places where it's needed 2016-10-18 19:02:36 -04:00
Al
9384d8cc7e [osm] adding exception for Vienna 2016-10-18 02:52:09 -04:00
Al
d4f4b716a0 [openaddresses] adding new counties in Oregon 2016-10-18 02:28:31 -04:00
Al
fc9ed13bc5 [boundaries] adding Community Development Council and CDC as removable suffixes for Singapore 2016-10-17 16:04:17 -04:00
Al
d34faf42b8 [osm] fix names with pipes in them 2016-10-17 02:32:25 -04:00
Al
a796b41d90 [geonames] admin codes on geonames/postal_codes tables 2016-10-17 00:21:33 -04:00
Al
ff27ee14bb [osm] only add label props if the name property is identical (counterexample, Nottinghamshire's label is listed as West Bridgford, which is really its admin_center) 2016-10-16 22:18:52 -04:00
Al
de9e234929 [osm] adding alternate civil parish description to the UK 2016-10-16 22:04:17 -04:00
Al
876f575040 [geonames] adding 5 borough exceptions 2016-10-16 21:31:20 -04:00
Al
093e7ed120 [fix] city districts in Košice, Slovakia 2016-10-15 01:47:37 -04:00
Al
049b3c9ce1 [boundaries] city/wards for Dar es Salaam + admin_center 2016-10-15 01:47:14 -04:00
Al
c4848b113d [geonames] unindenting overrides in GeoNames configs 2016-10-15 01:46:46 -04:00
Al
c39cfec218 [boundaries] Dar es Salaam=city, wards=city_district in Tanzania 2016-10-15 01:40:00 -04:00
Al
876fdd11fa [fix] country/language codes in formatting config 2016-10-12 15:51:31 -04:00
Al
9fb936019a [geoplanet] script to create GeoPlanet postal codes training data 2016-10-12 15:05:45 -04:00
Al
1e6a00c573 [fix] place in UK that was parented by a postal_code 2016-10-12 15:00:33 -04:00
Al
1d25f08b52 [expand] adding a function to check if two place names/addresses are equivalent after token normalization (replacing hyphens, deleting final periods, lowercasing, simple transliteration, etc.) and taking into account abbreviations from any specified libpostal dictionaries. In conjunction with place name affixes, useful in data sets like GeoPlanet or GeoNames to determine if a name variant is related to the original or not 2016-10-12 14:55:59 -04:00
Al
f8664b0deb [formatting] making regex-based tests during insert_component optional. If exact_order=True, insert the given component directly before/after the reference component, otherwise for components that already exist in the template only need to care about relative position. Adding a method to determine if template language is important for a particular country/language pair. 2016-10-12 14:42:34 -04:00
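
Commit 1d25f08b52 ([expand]) describes checking whether two place names are equivalent after token normalization: replacing hyphens, deleting final periods, lowercasing, simple transliteration, and accounting for dictionary abbreviations. The following is a minimal Python sketch of that idea, not the code from the commit. The names `normalize_tokens`, `names_equivalent`, and the tiny `ABBREVIATIONS` table are hypothetical and exist only for illustration; accent stripping via Unicode decomposition stands in for whatever transliteration the actual function uses.

```python
import unicodedata

# Hypothetical abbreviation table for illustration only; the real
# dictionaries referenced in the commit message are far larger.
ABBREVIATIONS = {"st": "saint", "mt": "mount", "ft": "fort"}


def normalize_tokens(name, abbreviations=ABBREVIATIONS):
    """Return a list of normalized tokens for a place name."""
    # Simple transliteration (assumption): decompose characters and
    # drop combining marks, so "š" becomes "s".
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Lowercase, then treat hyphens as token separators.
    tokens = stripped.lower().replace("-", " ").split()

    normalized = []
    for token in tokens:
        # Delete final periods, e.g. "st." -> "st".
        token = token.rstrip(".")
        if not token:
            continue
        # Expand known abbreviations to their canonical form.
        normalized.append(abbreviations.get(token, token))
    return normalized


def names_equivalent(a, b, abbreviations=ABBREVIATIONS):
    """True if both names normalize to the same token sequence."""
    return normalize_tokens(a, abbreviations) == normalize_tokens(b, abbreviations)
```

Under these assumptions, `names_equivalent("St. Louis", "Saint Louis")` and `names_equivalent("Stratford-upon-Avon", "stratford upon avon")` both hold, while unrelated names such as "Springfield" and "Shelbyville" do not match. A check of this kind is what the commit message suggests for deciding whether a GeoPlanet or GeoNames name variant is related to the original name.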