Commit Graph

129 Commits

Author SHA1 Message Date
Al 6d20d7348f [osm] Using OSM namespaced tags from polygons in the case of non-local languages 2015-11-23 14:42:30 -05:00
Al e46e1a93a0 [fix] ISO code and simple/international name checks should be on the polygons 2015-11-23 14:30:38 -05:00
Al eb7488ab55 [fix] Making country replacement probability independent of the probability used for local vs non-local languages 2015-11-23 13:46:14 -05:00
Al f4f7cceba2 [fix] var, non-local languages 2015-11-23 12:51:26 -05:00
Al 2b1c346fde [osm] Using name:simple and int_name to capture more variations for US addresses, adding ISO codes occasionally instead of names 2015-11-23 12:35:44 -05:00
Al 2695b5dd26 [osm] Shortening state names obtained from reverse geocoding for relevant countries 2015-11-22 22:09:31 -05:00
Al 8b035814c7 [osm] Change probabilities for country names 2015-11-22 18:52:17 -05:00
Al 71afcafe11 [fix] key names 2015-11-22 17:46:56 -05:00
Al ee482e7a07 [fix] import 2015-11-22 16:04:50 -05:00
Al ee75ffccd5 [fix] import 2015-11-22 15:51:13 -05:00
Al c6f531ca95 [fix] arguments 2015-11-22 15:35:25 -05:00
Al c851cf2547 [fix] OSM R-tree 2015-11-22 15:24:35 -05:00
Al d3703ce6b4 [fix] var name 2015-11-22 14:27:25 -05:00
Al 5b6fbd66e0 [fix] arg 2015-11-22 14:24:05 -05:00
Al 422ea668d8 [fix] import 2015-11-22 14:23:09 -05:00
Al 4cc275e313 [fix] doc and default arg 2015-11-22 14:21:20 -05:00
Al c8f47b38a2 [osm/formatting] Adding OSM polygon lookups and neighborhood polygon lookups to the training data in order to provide more variations for the model to work with 2015-11-21 17:05:35 -05:00
Al b948a8ebd8 [osm] Adding global keys which map to OSM address components 2015-11-20 12:48:54 -05:00
Al 946bce1cb9 [osm] Adding a few more boundary types to planet admin borders 2015-11-17 11:40:42 -05:00
Al b3ef8ded12 [formatting] Adding OSM address components lookup by country 2015-11-17 11:39:39 -05:00
Al c7df3fcb3a [osm] Adding a list of various OSM name tags obtained from Nominatim 2015-10-29 11:44:56 -04:00
Al da53d7ebac [osm] Adding an OSM neighborhoods/suburbs data set for matching with Quattroshapes boundaries, updating definitions for admin boundaries 2015-10-22 11:37:11 -04:00
Al 6478e65a06 [osm] Moving Wikipedia title normalization to osm.extract 2015-10-22 11:35:38 -04:00
Al 6f6d04966b [fix] role in OSM polygon extraction 2015-10-21 16:35:25 -04:00
Al 1e8e592e0b [fix] import 2015-10-19 23:30:12 -04:00
Al 5187e6073a [fix] admin boundary imports 2015-10-19 17:14:48 -04:00
Al 8609ccbb1d [polygons/osm] lon, lat 2015-10-19 15:40:43 -04:00
Al ef94f1b712 [doc] Adding some comments to fetch_osm_address_data.sh 2015-10-19 15:39:31 -04:00
Al 83295b1b34 [polygons/osm] Adding in-memory OSM reverse geocoder for all admin boundaries 2015-10-19 15:38:23 -04:00
Al 4a3994c65e [polygons/osm] Construct polygons from OSM relations using a number of space-saving optimizations in order to process planet in a reasonable amount of memory. Builds a graph of connected ways such that forming polygons is equivalent to finding strongly connected components. 2015-10-18 20:53:49 -04:00
Al ade0e2dc1f [osm] Adding final .osm file variable for borders output 2015-10-16 00:46:40 -04:00
Al b5f8b696bf [osm] Moving parse_osm to a separate module, adding option to list dependencies 2015-10-15 20:22:12 -04:00
Al ca629e295d [osm] Adding admin boundaries filter in OSM data 2015-10-15 12:06:11 -04:00
Al cfa57c96a3 [fix] untagged formatted addresses 2015-10-04 02:02:59 -04:00
Al 5d2a24872a [osm] Adding dependencies so single street names are not valid without at least one of {house, number, suburb, city, postcode} 2015-10-03 15:22:26 -04:00
Al 77be2fe433 [osm] Adjusting priors for country code expansion 2015-10-03 15:13:16 -04:00
Al 0b98a26426 [fix] keeping name tag in address components 2015-10-03 15:10:14 -04:00
Al 0f9ad259dc [osm] Doing initial formatting after replacing country/state 2015-10-03 14:40:38 -04:00
Al 71233c9c02 [fix] import, initialization 2015-10-03 14:37:08 -04:00
Al 1948aa87ea [fix] typo 2015-10-03 14:33:45 -04:00
Al 22efce7337 [osm/parsing] Randomly replacing country codes with local and foreign language expansions as well as randomly expanding state abbreviations to make parser more robust to different input 2015-10-03 14:31:51 -04:00
Al db71b65412 [fix] checking validity of component combination 2015-10-02 20:28:45 -04:00
Al a2fd6e25f8 [fix] import 2015-10-02 20:25:48 -04:00
Al 49abb70b59 [fix] dictionary 2015-10-02 20:24:21 -04:00
Al 521f33d892 [fix] bitset for address components, only looking at valid component keys 2015-10-02 20:21:59 -04:00
Al 528285f735 [fix] only OSM tagged addresses need extra logic 2015-10-02 20:18:30 -04:00
Al 83aecb9f2c [osm/parsing] Making tagged training data for address parser more robust to the types of partial input we see in geocoding by randomly eliminating components subject to some constraints (e.g. house number cannot be used without a street name) 2015-10-02 19:54:28 -04:00
Al ca25b48687 [fix] Not writing empty fields in formatted addresses 2015-09-22 08:13:55 -04:00
Al 134cf616d6 [osm] Using street for language disambiguation in training data 2015-09-21 04:09:15 -04:00
Al 84cf21df88 [osm] Separating address formatter into its own module, adding some documentation of the various training sets with examples 2015-09-20 20:05:46 -04:00