Commit Graph

3940 Commits

Author SHA1 Message Date
Al
ff27ee14bb [osm] only add label props if the name property is identical (counterexample, Nottinghamshire's label is listed as West Bridgford, which is really its admin_center) 2016-10-16 22:18:52 -04:00
Al
de9e234929 [osm] adding alternate civil parish description to the UK 2016-10-16 22:04:17 -04:00
Al
876f575040 [geonames] adding 5 borough exceptions 2016-10-16 21:31:20 -04:00
Al
093e7ed120 [fix] city districts in Košice, Slovakia 2016-10-15 01:47:37 -04:00
Al
049b3c9ce1 [boundaries] city/wards for Dar es Salaam + admin_center 2016-10-15 01:47:14 -04:00
Al
c4848b113d [geonames] unindenting overrides in GeoNames configs 2016-10-15 01:46:46 -04:00
Al
c39cfec218 [boundaries] Dar es Salaam=city, wards=city_district in Tanzania 2016-10-15 01:40:00 -04:00
Al
876fdd11fa [fix] country/language codes in formatting config 2016-10-12 15:51:31 -04:00
Al
9fb936019a [geoplanet] script to create GeoPlanet postal codes training data 2016-10-12 15:05:45 -04:00
Al
1e6a00c573 [fix] place in UK that was parented by a postal_code 2016-10-12 15:00:33 -04:00
Al
1d25f08b52 [expand] adding a function to check if two place names/addresses are equivalent after token normalization (replacing hyphens, deleting final periods, lowercasing, simple transliteration, etc.) and taking into account abbreviations from any specified libpostal dictionaries. In conjunction with place name affixes, useful in data sets like GeoPlanet or GeoNames to determine if a name variant is related to the original or not 2016-10-12 14:55:59 -04:00
Al
f8664b0deb [formatting] making regex-based tests during insert_component optional. If exact_order=True, insert the given component directly before/after the reference component; otherwise, for components that already exist in the template, only the relative position matters. Adding a method to determine if template language is important for a particular country/language pair. 2016-10-12 14:42:34 -04:00
Al
3db6b7fbf1 [dictionaries] adding new abbreviations for Sankt in German and Scandinavian languages 2016-10-11 18:05:11 -04:00
Al
2663b81670 [address_formatting] caching parsed templates from pystache yields about a 2.5x speedup per call, should shave off several hours of CPU time for large training sets 2016-10-11 15:36:49 -04:00
Al
2314acef1b [geoplanet] bypassing Québec as a county (just city and state) 2016-10-11 02:33:27 -04:00
Al
02fc172b5c [geoplanet] abbreviations for UK and NYC, fixing country codes for IM, GG and JE 2016-10-11 02:11:26 -04:00
Al
6ff1024c02 [fix] null candidate languages 2016-10-07 19:49:32 -04:00
Al
30074524d8 [fix] return empty list for languages in country_and_languages 2016-10-07 18:57:22 -04:00
Al
29698781cb [boundaries] making Kingston parish a city and only using the name Kingston, just so the parser doesn't have to disambiguate between references to the parish vs. the city, both referred to as Kingston 2016-10-07 18:52:46 -04:00
Al
ff7fec6ed1 [osm/polygons] need to include id/type in polygon properties now that they're getting added earlier in the pipeline 2016-10-07 01:21:02 -04:00
Al
169a3c3d70 [osm] drop postcode as well for address-only format 2016-10-07 01:10:16 -04:00
Al
4ff3f50e01 [fix] Dublin postcode formatting 2016-10-07 01:06:37 -04:00
Al
2e8b6e6a29 [fix] args 2016-10-07 01:03:22 -04:00
Al
0401a04adb [osm] add address-only formats (sans place tags) for every address as well, to better handle incomplete queries where location is expected to be inferred by the geocoder, etc. 2016-10-07 00:59:52 -04:00
Al
ed26d8e398 [geoplanet] a few more GeoPlanet fixes for LocalAdmins in LU and CH 2016-10-07 00:34:57 -04:00
Al
ecd71ee10d [fix] var name 2016-10-06 15:36:51 -04:00
Al
c44e6280b4 [geoplanet] Setting postal codes connected to non-admin features to parent/grandparent features. Setting postal codes connected to unitary authorities in the UK to their respective towns 2016-10-06 14:07:01 -04:00
Al
aff12106c4 [geoplanet] adding Island place_type 2016-10-06 14:04:28 -04:00
Al
3d021c0a2c [boundaries] place=district for Ireland postal districts 2016-10-06 12:18:39 -04:00
Al
b1f386cb11 [fix] typo 2016-10-06 01:37:42 -04:00
Al
7d5ef87348 [fix] geoplanet zip file 2016-10-06 01:37:30 -04:00
Al
a67efcffe4 [addresses] add new option to use city population to determine whether components should be dropped out 2016-10-05 18:16:25 -04:00
Al
66af532850 [osm] adding country-specific cleanups to OSM place training data 2016-10-05 17:13:13 -04:00
Al
6b0186782d [openaddresses] doing country-specific cleanups in OpenAddresses 2016-10-05 17:07:29 -04:00
Al
182c0b3d26 [addresses] adding country-specific cleanups for Kingston (city=Kingston 12 split into city=Kingston, postcode=12) and Dublin (e.g. Dublin 3 specified various ways will be treated as a city_district, whereas Eircodes are treated as postal codes) 2016-10-05 17:05:24 -04:00
Al
2798420fdc [osm] add boundary=postal_district to admin borders for Ireland 2016-10-05 15:26:16 -04:00
Al
4cea9ff54e [boundaries] map postal_district (Dublin 3, etc.) to city_district. Eircodes will be postal codes 2016-10-05 15:25:13 -04:00
Al
918e1f62ba [names] remove "County" as an ignorable prefix 2016-10-05 15:03:18 -04:00
Al
7b3a59878c [fix] bracket 2016-10-05 14:27:24 -04:00
Al
fb6909970e [openaddresses] adding Colusa and Inyo counties in California 2016-10-05 13:43:43 -04:00
Al
5744fc5a3c [fix] import 2016-10-05 03:23:34 -04:00
Al
70a5ded45c [fix] encode element id 2016-10-05 03:14:19 -04:00
Al
432f9dd42e [fix] format of candidate_languages in the new OSM rtree 2016-10-05 03:12:07 -04:00
Al
bb32253689 [fix] args 2016-10-05 02:54:52 -04:00
Al
faf418decb [languages] using country_and_languages method in OSM, neighborhoods and OpenAddresses 2016-10-05 02:49:55 -04:00
Al
98a8d898a1 [fix] import 2016-10-05 01:39:34 -04:00
Al
b1cd7fdc4a [osm] adding type/id to properties dict earlier in the pipeline 2016-10-05 00:53:32 -04:00
Al
6081df0cd1 [osm] adding admin1 ids to the OSM country rtree 2016-10-04 23:12:15 -04:00
Al
2bc109f519 [osm] adding an OSM-based country_and_languages method 2016-10-04 18:41:00 -04:00
Al
aea67c0769 [osm] adding admin1 exceptions to the country polygons 2016-10-04 18:33:52 -04:00