28 Commits

Author SHA1 Message Date
Al
81c59e116a [countries] use ISO 3166 country name 5% of the time for general addresses, 10% of the time for OpenAddresses. Gives the parser examples of names like "Korea, Republic of" in #168 2017-03-25 19:41:59 -04:00
Al
06003dfbb0 [fix] lower probability of name:prefix 2017-02-14 18:57:31 -05:00
Al
08976c772e [neighborhoods] base parser config changes for new prefix/first_match options 2017-02-14 18:19:15 -05:00
Al
7ee44a584b [fix] genitive case for Russian/Ukrainian toponyms, not locative (#125) 2016-12-28 14:34:28 -05:00
Al
6f009fb8a6 [addresses] adding pymorphy2 for converting Russian and Ukrainian place names (sticking with state and state_district for the moment) to the locative case as mentioned in #125 2016-12-28 04:48:32 -05:00
Al
1cba89a99b [addresses] higher state abbreviation probability for places that use abbreviations 2016-12-20 16:53:59 -05:00
Al
846b88cde5 [addresses] let the place config take care of adding/removing neighborhoods rather than doing it as part of the add_neighborhoods method 2016-12-14 03:15:07 -05:00
Al
7a89c6e9ce [osm] removing dependencies for house/venue name (purely numeric names taken care of in osm formatter) 2016-11-18 18:32:44 -05:00
Al
78f341f4f1 [osm] higher probability of hyphenation 2016-10-19 01:11:41 -04:00
Al
72e7d3ff5b [addresses/hyphens] adding some methods to hyphenate/dehyphenate place names at random 2016-10-18 19:10:31 -04:00
Al
d43d14a34d [parser] adding state_district as one of the possible contexts for venue name (name + county could be fine as an address in some places) 2016-09-16 01:01:00 -04:00
Al
db8f5b717c [boundaries] adding use_admin_center to boundary configs right alongside other overrides 2016-09-02 02:00:18 -04:00
Al
21648c39e0 [boundaries] Bogotá should take its properties from the admin_center 2016-08-26 23:42:47 -04:00
Al
8b57a7acf2 [osm] abbreviate toponyms (qualifiers) with some probability so we get those versions in the model's phrase dictionaries 2016-08-22 20:55:35 -04:00
Al
cdd5a96346 [addresses] metro station can also be used for plain venues without a house number so we get more in the training set 2016-08-06 20:52:29 -04:00
Al
6ce882cb55 [addresses] Metro station component dependencies (road or house_number) 2016-08-06 19:34:39 -04:00
Al
8b5d44e173 [fix] Japanese house numbers aren't without dependencies, just have different ones (road or suburb or city_district) 2016-08-06 03:38:44 -04:00
Al
2c024ce9f4 [addresses] special case for Japan, house_number does not depend on street name 2016-08-06 02:38:58 -04:00
Al
09b16d954f [osm] Use much lower probability of ISO country codes 2016-07-29 11:41:39 -04:00
Al
1058b17a61 [osm] Moving admin_center overrides to OSM parser config 2016-07-25 02:02:48 -04:00
Al
90a2f2b2e0 [parser] road has no dependencies 2016-07-21 17:04:57 -04:00
Al
366c4995af [parser] lower full-name probability for states 2016-07-21 17:04:57 -04:00
Al
308080f6ee [formatting] Moving language country overrides to formatter config so actual language is retained 2016-07-21 17:04:57 -04:00
Al
890268aa87 [languages] Use English formats for Romanized CJK 2016-07-21 17:04:57 -04:00
Al
a5331f7107 [osm] Venue name depends on one of {house_number, road, suburb, city_district, city, postcode} 2016-07-21 17:04:57 -04:00
Al
6fc6f9f591 [addresses] Adding address-level component dropout to AddressComponents (returns an ordering so the client formatter can potentially emit multiple addresses with different components dropped out). Adding PO box and category probabilities to config 2016-07-21 17:04:57 -04:00
Al
f468ab84d2 [parser] Removing island exceptions from parser default config 2016-07-21 17:04:57 -04:00
Al
62b35b318f [parser] Parser default config 2016-07-21 17:04:57 -04:00