1886 Commits

Author SHA1 Message Date
Al
bec569adaa [osm] adding new validity check to venue names so if the Jaccard(name tokens, street & house number tokens) == 1 and the address does not have a known venue type e.g. a restaurant, the "venue name" is actually just the street address and can be discarded 2017-01-11 16:23:42 -05:00
Al
7f851810d2 [addresses] formatting addresses in Brasilia, so e.g. "Bloco B" is never part of the street name or building name, it's the house number. place=neighbourhood maps to nothing in Brasilia as these are basically subdivisions whose streets are identically named 2017-01-11 16:18:04 -05:00
Al
0d030a98c5 [osm] adding airport polygon index 2017-01-11 04:25:54 -05:00
Al
d528095984 [addresses] adding random unit numbers with more digits 2017-01-11 04:24:35 -05:00
Al
979fd16215 [osm] adding airports and terminals data sets with points and polygons, more file cleanup in OSM fetch script 2017-01-10 16:20:32 -05:00
Al
86c7b7f3fe [addresses] no longer normalizing slashes in boundary names for places that have multilingual names, etc. 2017-01-08 12:41:51 -05:00
Al
a6d94f998b [addresses] stripping parentheticals in admin boundary names as sometimes cities in e.g. Switzerland are like Oberwil (ZG) in OSM 2017-01-08 03:43:22 -05:00
Al
828b67d4f7 [osm] adding some new training data for simple road names and their surrounding admin boundaries 2017-01-07 15:34:43 -05:00
Al
d51f9dbb0e [addresses] stripping unit phrases from streets in OpenAddresses as well, return value wasn't getting used before 2017-01-06 10:19:08 -05:00
Al
cfdef1788c [addresses] stripping unit from street using the libpostal dictionaries in all the address data sets. Happens surprisingly often in OpenStreetMap as well as OpenAddresses 2017-01-06 10:06:23 -05:00
Al
321f2034d2 [fix] unidata file 2017-01-05 04:24:33 -05:00
Al
25723fcea2 [transliteration] making the custom rules in transliteration less repetitious and accessible from elsewhere, removing string names for common transliterators and using constants 2017-01-05 04:06:51 -05:00
Al
de2dffa315 [addresses] adding Calle to purely numeric Spanish street names in OSM as well 2017-01-02 23:41:01 -05:00
Al
600b40d2f6 [transliteration] adding german-ascii transliteration to Estonian to handle umlauts (ä => ae, etc.) 2017-01-02 13:51:56 -05:00
Al
b2b7f6f155 [osm] add wikipedia:* to rail station exception 2017-01-02 13:13:42 -05:00
Al
400ea589ef [normalize] add NORMALIZE_STRING_SIMPLE_LATIN_ASCII option to pynormalize 2017-01-02 02:08:54 -05:00
Al
2d077699e6 [places] adding is_in property to the set of tags for the places index. This may allow us to make more granular exceptions for node-based places that are actually suburbs but classified as {hamlet, village, locality, town}, etc. if the is_in contains a city that's also a boundary or nearby point 2016-12-29 14:04:13 -05:00
Al
21a2a7419a [addresses] only add village as city component if no city can be found in the area 2016-12-29 13:41:05 -05:00
Al
f58ebbdf7f [fix] var name 2016-12-28 14:37:00 -05:00
Al
7ee44a584b [fix] genitive case for Russian/Ukrainian toponyms, not locative (#125) 2016-12-28 14:34:28 -05:00
Al
e6e4b28e43 [addresses] making the город/г. prefix apply to the Russian language rather than the country 2016-12-28 13:26:19 -05:00
Al
f995fdf9d2 [fix] default None 2016-12-28 05:09:15 -05:00
Al
3dc6a69bf5 [openaddresses] adding locative names in OpenAddresses as well, which contains some Ukraine data sets 2016-12-28 04:59:55 -05:00
Al
91013fe296 [fix] moving checks inside the add_locatives function, fixing float cast 2016-12-28 04:59:27 -05:00
Al
6f009fb8a6 [addresses] adding pymorphy2 for converting Russian and Ukrainian place names (sticking with state and state_district for the moment) to the locative case as mentioned in #125 2016-12-28 04:48:32 -05:00
Al
4344c5fdf3 [formatting] adding non-zero invert probabilities to all the former Soviet states. Other template insertions can still apply afterward for #125 2016-12-27 23:25:49 -05:00
Al
25e966411d [formatting] adding the ability to invert the address template (line by line, preserving order within each line) with certain probabilities 2016-12-27 23:25:49 -05:00
Al
165056ccd8 [names] adding configurable prefix/suffix additions for boundary names 2016-12-27 20:32:23 -05:00
Al
80a9c1b308 [addresses] move country-specific cleanups to before reverse geocoding as those deal with the user-specified components 2016-12-27 04:19:57 -05:00
Al
6163dbae39 [osm/places] adding option to only format place tags for city and smaller admins, using for polygons as larger polys should be included elsewhere anyway 2016-12-27 03:37:15 -05:00
Al
6eee689685 [fix] only applying separator tag to commas 2016-12-27 03:16:04 -05:00
Al
76d8fc1d37 [fix] combined components 2016-12-26 21:35:27 -05:00
Al
c3bf63bc18 [fix] remove reference to ftfy in the formatter 2016-12-26 21:25:28 -05:00
Al
8abbb273b2 [osm] adding the excellent ftfy (https://github.com/LuminosoInsight/python-ftfy) to fix Mojibake, etc. in address components 2016-12-26 21:18:14 -05:00
Al
7ec368542b [formatting] giving single hyphens the separator tag 2016-12-26 21:00:25 -05:00
Al
d208397ecb [addresses] checking if component is generated in combining fields 2016-12-26 16:58:10 -05:00
Al
afe29abf6c [fix] name 2016-12-25 11:38:18 -05:00
Al
e31ab33fe0 [fix] kwarg 2016-12-25 03:19:41 -05:00
Al
6a852f02bd [fix] var 2016-12-25 03:17:05 -05:00
Al
a185441ffa [osm] adding amenity=post_office to the generic place types (shouldn't be added as venue unless there's a known phrase in the name) 2016-12-25 03:14:07 -05:00
Al
4edaca7d37 [fix] var name 2016-12-25 02:50:29 -05:00
Al
4cf40f8deb [addresses] sort combined Japanese suburbs by admin level 2016-12-25 02:29:06 -05:00
Al
5b5a3fe235 [fix] adding Taiwan, Hong Kong, and Macao to the CJK patterns since language affects the order 2016-12-25 01:20:59 -05:00
Al
6da092e144 [fix] update language to English when using English names in CJK countries 2016-12-25 01:18:54 -05:00
Al
51802035de [fix] var name 2016-12-25 01:08:44 -05:00
Al
11dc8c9f24 [fix] non-dict keys in OSM boundary configs 2016-12-25 00:49:57 -05:00
Al
57f17a5d38 [addresses] remove generated components in combined house numbers if the other components were not numeric. Add house number phrase after the units, etc. are generated so it may be applied to a combined house number as well 2016-12-25 00:44:48 -05:00
Al
dad57dc57e [fix] moving CJK check into the if block so language gets changed more often even if the street sign-based language is unk 2016-12-24 21:20:38 -05:00
Al
826cbc7f24 [addresses/JP] more checks for matching major/minor neighborhood polygons with nodes in Japan 2016-12-24 20:21:25 -05:00
Al
e4e86261d1 [addresses/JP] just remove addr:neighborhood, addr:quarter, etc. in Japan as they're not applied consistently outside of cities 2016-12-24 20:13:48 -05:00
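
Note on bec569adaa: the venue-name validity check described in that commit message can be sketched roughly as follows. This is only an illustration of the idea as worded in the message, not the repository's actual code. The function names, the whitespace tokenizer, and the has_known_venue_type flag are all assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return float(len(a & b)) / len(a | b)


def tokenize(s):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # The real project presumably uses its own tokenizer.
    return s.lower().split()


def is_address_like_venue_name(name, street, house_number, has_known_venue_type):
    """Return True if the "venue name" is just the street address.

    Mirrors the commit message: Jaccard(name tokens, street & house
    number tokens) == 1 and no known venue type (e.g. a restaurant).
    """
    if has_known_venue_type:
        return False
    name_tokens = tokenize(name)
    address_tokens = tokenize(street) + tokenize(house_number)
    return jaccard(name_tokens, address_tokens) == 1.0
```

Under these assumptions, a name such as "123 Main St" with street "Main St" and house number "123" would be flagged for discarding, while "Joe's Pizza" at the same address would be kept.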