1129 Commits

Author SHA1 Message Date
Al
9866614f63 [openaddresses] Using new config implementation, using neighborhoods/boroughs in NYC 2016-08-23 02:14:29 -04:00
Al
ed0b49884e [openaddresses] Changes to OA config utilizing some of the new cleanup options. Adding language to brussels-fr and brussels-nl, adding New York and New Jersey statewide with the understanding that OSM components will be added in NJ and postcodes will be stripped of letters in NY 2016-08-23 00:38:43 -04:00
Al
8b57a7acf2 [osm] abbreviate toponyms (qualifiers) with some probability so we get those versions in the model's phrase dictionaries 2016-08-22 20:55:35 -04:00
Al
3fef3e56d5 [boundaries] converting Mexico City boroughs to city_district 2016-08-22 03:51:01 -04:00
Al
79c9694e2d [names] Allowing for similarity-only normalization in name affixes 2016-08-22 03:47:08 -04:00
Al
72b5f6b55a [dictionaries] German dictionary updates 2016-08-22 00:11:10 -04:00
Al
b41ba7374b [intersections] intersections training data, using a Cartesian product of all names in the same language, including something like tiger:name_base 2016-08-18 01:19:14 -04:00
Al
5cff7b85bd [geonames] Adding basic GeoNames admin mappings for all countries we have postal codes lists for so some form of training data can be created for postcodes not listed in OSM 2016-08-15 01:09:17 -04:00
Al
7f4e636fc5 [fix] accidentally had Vietnam country code switched with Virgin Islands 2016-08-14 18:43:24 -04:00
Al
8a5da5f860 [boundaries/osm] Reverting admin_level=10 back to city_district for India so it'll match the current training data, can revisit later 2016-08-13 22:51:42 -04:00
Al
55895369b8 [boundaries] Using state again for UK countries (England, Scotland, Wales, Northern Ireland). country_region was created mostly for non-administrative regions of a country (usually admin_level=3 in OSM). The UK is a bit more complicated in that there are multiple non-sovereign countries, but it's probably not worth creating a different tag and different set of parameters just to have a distinct name for 1st level admin in the UK 2016-08-11 23:47:31 -04:00
Al
d51a6693ac [fix] reverting commit that was lumped in with geonames script 2016-08-11 21:49:29 -04:00
Al
74d042e3c7 [boundaries] For India, making admin_level 10 map to suburb rather than city_district 2016-08-11 21:47:10 -04:00
Al
29081a0699 [fix] adding English template insertions for the UK regardless of language 2016-08-11 21:32:54 -04:00
Al
22123b80ba [fix] refactoring geonames script a bit 2016-08-11 21:31:39 -04:00
Al
48755ec218 [boundaries] Adding regex replacements for boundary names such as Lyon 2e Arrondissement where putting Lyon is the OSM convention but we might sometimes want just 2e Arrondissement to appear in the training data next to Lyon 2016-08-11 13:09:24 -04:00
Al
10a41309b8 [addresses] Increasing Romaji probability to 0.4 2016-08-06 21:27:32 -04:00
Al
cdd5a96346 [addresses] metro station can also be used for plain venues without a house number so we get more in the training set 2016-08-06 20:52:29 -04:00
Al
195278cfea [osm] Reverse geocoding to metro station only for addresses in Japan 2016-08-06 19:50:18 -04:00
Al
da2985a4ae [places] Metro station dropout probabilities 2016-08-06 19:34:56 -04:00
Al
6ce882cb55 [addresses] Metro station component dependencies (road or house_number) 2016-08-06 19:34:39 -04:00
Al
668aa20996 [addresses] Metro station phrases for Japanese Romaji 2016-08-06 19:34:07 -04:00
Al
9cbbca5e47 [addresses] Metro station phrase for Japanese 2016-08-06 19:33:42 -04:00
Al
8b5d44e173 [fix] Japanese house numbers aren't without dependencies, just have different ones (road or suburb or city_district) 2016-08-06 03:38:44 -04:00
Al
2c024ce9f4 [addresses] special case for Japan, house_number does not depend on street name 2016-08-06 02:38:58 -04:00
Al
14c35b35c6 [fix] probabilities in Romanian address config 2016-08-04 17:53:10 -04:00
Al
f33882b7bc [fix] Swedish config for top floor phrase 2016-08-03 11:54:15 -04:00
Al
fa003ca430 [fix] indentation in boundaries configs 2016-08-01 00:52:10 -04:00
Al
5edc60299c [fix] Bulgarian category probabilities 2016-07-31 22:50:48 -04:00
Al
3ead069b1b [fix] Romanian staircase probability 2016-07-31 22:28:31 -04:00
Al
afbb79b81d [osm/parser] Making a much lower probability of generating sub-building components for named venues (usually on the ground floor, etc.) 2016-07-31 20:40:44 -04:00
Al
2e92c6fcc8 [fix] Probabilities for Ukrainian house numbers 2016-07-31 20:01:42 -04:00
Al
0827caf578 [fix] sample=true 2016-07-31 19:51:03 -04:00
Al
ce17b50064 [fix] canonical probability 2016-07-31 19:16:46 -04:00
Al
92b8566930 [places] Increase probability of state and decrease probability of county for smaller cities/towns 2016-07-31 03:26:34 -04:00
Al
bb91a5b0f0 [places] For the US, add state_district (county) with higher probability for towns with higher populations. Helps with cases that would be difficult to get right otherwise like Brooklyn, Cattaraugus County, NY (http://www.openstreetmap.org/node/158644800) 2016-07-30 18:57:28 -04:00
Al
f8c8d05997 [fix] same thing for the exception countries 2016-07-29 12:47:08 -04:00
Al
045eab8e58 [osm] Making ISO codes lower probability for reverse geocoded country as well 2016-07-29 12:30:32 -04:00
Al
09b16d954f [osm] Use much lower probability of ISO country codes 2016-07-29 11:41:39 -04:00
Al
21bcbd8381 [fix] restoring CLDR probability 2016-07-28 15:21:44 -04:00
Al
bebb33fe64 [osm] Include CLDR country even if the place didn't match simplified OSM polygons 2016-07-28 14:11:31 -04:00
Al
543048bc26 [osm] use CLDR country names with random probability 2016-07-28 02:37:12 -04:00
Al
095c808cea [places] increasing country probabilities, state probabilities in Mexico and Brasil 2016-07-28 02:26:51 -04:00
Al
21033537a2 [fix] US insertion config 2016-07-27 19:13:59 -04:00
Al
a4a74aec7f [osm] Updating formatting config for all the languages/countries currently implemented 2016-07-27 17:45:18 -04:00
Al
750037330e [boundaries] Updated boundaries for Slovakia to capture city districts, etc. 2016-07-27 14:07:36 -04:00
Al
d9b70d3404 [fix] mapping the nodes for NYC boroughs to city_district 2016-07-27 12:22:50 -04:00
Al
53cbb52cb2 [languages] Adding Tibetan language to regional languages for the Tibet region 2016-07-26 19:07:37 -04:00
Al
eae7a6a78c [osm/boundaries] extend admin overrides in the UK to Greater London which includes London and the City of London 2016-07-25 16:56:39 -04:00
Al
38e67f5013 [boundaries] More fun with mapping UK admin boundaries. Non-metropolitan counties and non-metropolitan districts map to state_district. admin_level=6 maps to state_district except for London where it's the city minus City of London. admin_level=8 (e.g. Manchester) maps to city except in London where it maps to city_district. admin_level=10 is suburb unless designation=civil_parish, in which case it's treated as a city boundary (individual towns/villages may be city or suburb depending on their place tag). Just complicated enough to be valid UK law :-). 2016-07-25 16:02:00 -04:00
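
The UK boundary rules described in commit 38e67f5013 are easier to follow as a small decision function. The sketch below is only an illustration of the rules as the commit message states them, not the project's actual code or config format. The function name `uk_component`, the `in_london` flag, the `designation` values for non-metropolitan areas, and the order in which the rules are checked are all assumptions made for this example.

```python
# Hypothetical sketch of the UK admin boundary mapping described in the
# commit message above. Names and rule precedence are assumptions.

def uk_component(tags, in_london=False):
    """Map OSM boundary tags to an address component label.

    tags: dict of OSM tags (string keys and values).
    in_london: whether the boundary lies within Greater London (assumed
        to be determined elsewhere, e.g. by a containment check).
    Returns a component name, or None if no rule applies.
    """
    designation = tags.get("designation")
    admin_level = tags.get("admin_level")

    # Non-metropolitan counties and districts map to state_district.
    # (Assumed designation values; checked first as an assumption.)
    if designation in ("non_metropolitan_county", "non_metropolitan_district"):
        return "state_district"

    # admin_level=6 is state_district, except in London where it is the city.
    if admin_level == "6":
        return "city" if in_london else "state_district"

    # admin_level=8 is city, except in London where it is city_district.
    if admin_level == "8":
        return "city_district" if in_london else "city"

    # admin_level=10 is suburb unless it is a civil parish, which is
    # treated as a city boundary.
    if admin_level == "10":
        return "city" if designation == "civil_parish" else "suburb"

    return None
```

For example, under these assumptions `uk_component({"admin_level": "8"})` gives `"city"`, while the same tags with `in_london=True` give `"city_district"`.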