1162 Commits

Author  SHA1  Message  Date
Al  e771e9b2d5  [openaddresses] Adding Wisconsin (30th state in the union)  2016-08-26 18:37:48 -04:00
Al  0016cdbd7f  [openaddresses] Adding Iowa (29th state in the union)  2016-08-26 18:13:50 -04:00
Al  0ec9593e6c  [openaddresses] Adding Texas (28th state in the union, however reluctantly)  2016-08-26 17:42:31 -04:00
Al  79b4e0be90  [dictionaries] service road abbreviations  2016-08-26 16:46:02 -04:00
Al  d696c792ae  [openaddresses] Adding Florida (27th state in the union)  2016-08-26 16:35:04 -04:00
Al  2654683af4  [openaddresses] Adding quick-and-dirty regex-based exclusion list for fields containing various patterns in OpenAddresses, to be used sparingly  2016-08-26 15:35:51 -04:00
Al  7bcddeff44  [openaddresses] Adding Michigan (26th state in the union)  2016-08-26 13:42:13 -04:00
Al  755a65aa14  [openaddresses] Adding Arkansas (25th state in the union)  2016-08-26 13:36:25 -04:00
Al  d97bb9cd4c  [openaddresses] Adding Missouri (24th state in the union)  2016-08-26 13:36:09 -04:00
Al  d4e76eac0b  [openaddresses] Adding Alabama (22nd state in the Union)  2016-08-26 13:12:39 -04:00
Al  aa26277136  [openaddresses] Adding Illinois (21st state in the union)  2016-08-26 13:08:06 -04:00
Al  a11abf2787  [openaddresses] Adding Mississippi (20th state in the union)  2016-08-26 10:58:16 -04:00
Al  ebeb7f816a  [openaddresses] Adding Indiana (19th state in the union)  2016-08-26 10:48:05 -04:00
Al  472580320d  [dictionaries] English synonyms update  2016-08-26 10:19:51 -04:00
Al  3a8dee523d  [openaddresses] Adding Louisiana (18th state in the union)  2016-08-25 22:50:18 -04:00
Al  9aea4451ff  [openaddresses] Adding Ohio (17th state in the union)  2016-08-25 22:01:57 -04:00
Al  0b19f27d8d  [openaddresses] Adding Tennessee (16th state in the union)  2016-08-25 18:55:54 -04:00
Al  59a840ab37  [openaddresses] Adding Kentucky (15th state in the union)  2016-08-25 18:38:53 -04:00
Al  dc6e483067  [openaddresses] Adding DC (not a state, but in after the original 13 colonies)  2016-08-25 18:11:54 -04:00
Al  e251fc42fa  [openaddresses] Adding North Carolina (12th state in the union)  2016-08-25 18:08:19 -04:00
Al  2009b4c992  [openaddresses] Adding Virginia (10th state in the union)  2016-08-25 16:37:39 -04:00
Al  93b377c8a7  [openaddresses] Fixes for California, have to remove Orange County because it's all being stuffed into the street field  2016-08-25 14:39:45 -04:00
Al  b75419d6e8  [boundaries] Luxembourg quarters = city_district  2016-08-24 23:37:44 -04:00
Al  14bc224f25  [openaddresses] Adding OSM neighborhoods across the US wherever we have them. That index is relatively small and cheap to do lookups for every point whereas the general R-tree should be used only when necessary  2016-08-24 14:58:19 -04:00
Al  4552aa380c  [openaddresses] Adding South Carolina  2016-08-24 14:47:07 -04:00
Al  84bb12657b  [dictionaries] Adding a variety of abbreviations/misspellings for street, road, drive, and place  2016-08-24 14:19:40 -04:00
Al  709cecd300  [dictionaries] Adding some MLK synonyms after looking at the Georgia data  2016-08-24 14:18:44 -04:00
Al  f66fb4a172  [openaddresses] Adding Maryland  2016-08-24 13:54:40 -04:00
Al  f9ec02c8e0  [openaddresses] Adding Georgia. There's a lot of weirdness in there so whitelisting files. Files that weren't added were deliberate  2016-08-24 13:52:35 -04:00
Al  ad625a46a4  [openaddresses] Adding Delaware and Pennsylvania. Going with the "older states in the union will have funkier addresses" strategy.  2016-08-23 22:22:35 -04:00
Al  f36ca6a788  [dictionaries] Adding Asturian language and dictionaries for the Asturias region of Spain. Realized some of the default street names/addresses in Oviedo, etc. are actually Asturian rather than Spanish  2016-08-23 21:52:22 -04:00
Al  ff06462981  [dictionaries] Adding oberste etage, unterste etage and parkdeck to German dictionaries. Generating as part of the sub-building info for the address parser  2016-08-23 21:49:44 -04:00
Al  e746cbab75  [openaddresses] Adding New England states (postcodes beginning with 0).  2016-08-23 02:51:20 -04:00
Al  9866614f63  [openaddresses] Using new config implementation, using neighborhoods/boroughs in NYC  2016-08-23 02:14:29 -04:00
Al  ed0b49884e  [openaddresses] Changes to OA config utilizing some of the new cleanup options. Adding language to brussels-fr and brussels-nl, adding New York and New Jersey statewide with the understanding that OSM components will be added in NJ and postcodes will be stripped of letters in NY  2016-08-23 00:38:43 -04:00
Al  8b57a7acf2  [osm] abbreviate toponyms (qualifiers) with some probability so we get those versions in the model's phrase dictionaries  2016-08-22 20:55:35 -04:00
Al  3fef3e56d5  [boundaries] converting Mexico City boroughs to city_district  2016-08-22 03:51:01 -04:00
Al  79c9694e2d  [names] Allowing for similarity-only normalization in name affixes  2016-08-22 03:47:08 -04:00
Al  72b5f6b55a  [dictionaries] German dictionary updates  2016-08-22 00:11:10 -04:00
Al  b41ba7374b  [intersections] intersections training data, using a Cartesian product of all names in the same language, including something like tiger:name_base  2016-08-18 01:19:14 -04:00
Al  5cff7b85bd  [geonames] Adding basic GeoNames admin mappings for all countries we have postal codes lists for so some form of training data can be created for postcodes not listed in OSM  2016-08-15 01:09:17 -04:00
Al  7f4e636fc5  [fix] accidentally had Vietnam country code switched with Virgin Islands  2016-08-14 18:43:24 -04:00
Al  8a5da5f860  [boundaries/osm] Reverting admin_level=10 back to city_district for India so it'll match the current training data, can revisit later  2016-08-13 22:51:42 -04:00
Al  55895369b8  [boundaries] Using state again for UK countries (England, Scotland, Wales, Northern Ireland). country_region was created mostly for non-administrative regions of a country (usually admin_level=3 in OSM). The UK is a bit more complicated in that there are multiple non-sovereign countries, but it's probably not worth creating a different tag and different set of parameters just to have a distinct name for 1st level admin in the UK  2016-08-11 23:47:31 -04:00
Al  d51a6693ac  [fix] reverting commit that was lumped in with geonames script  2016-08-11 21:49:29 -04:00
Al  74d042e3c7  [boundaries] For India, making admin_level 10 map to suburb rather than city_district  2016-08-11 21:47:10 -04:00
Al  29081a0699  [fix] adding English template insertions for the UK regardless of language  2016-08-11 21:32:54 -04:00
Al  22123b80ba  [fix] refactoring geonames script a bit  2016-08-11 21:31:39 -04:00
Al  48755ec218  [boundaries] Adding regex replacements for boundary names such as Lyon 2e Arrondissement where putting Lyon is the OSM convention but we might sometimes want just 2e Arrondissement to appear in the training data next to Lyon  2016-08-11 13:09:24 -04:00
Al  10a41309b8  [addresses] Increasing Romaji probability to 0.4  2016-08-06 21:27:32 -04:00