Commit Graph

403 Commits

Author SHA1 Message Date
Al
40d18aa7f6 [polygons/osm] Switching back to buffer(0). Still destroys many polygons, may need to look into another solution 2015-11-25 17:10:50 -05:00
Al
a50c971732 [polygons/osm] Omitting last node in every way of a connected component since that node is equal to the start node of its neighbor 2015-11-25 17:09:19 -05:00
Al
d6d5eab989 [geonames] Adding ability to look up GeoNames alternate names (may obtain IDs from Quattroshapes). Not great for local-language primary names (OSM remains the best) but decent for extracting foreign toponyms 2015-11-25 17:07:14 -05:00
Al
3217fa39cd [fix] add country randomly in the formatted language training data in cases where country is not present 2015-11-25 14:54:41 -05:00
Al
1a6618957b [fix] Python float precision doesn't appear to be the problem 2015-11-25 11:29:08 -05:00
Al
5781813cbd [fix] For countries like Denmark, removing country with a smaller probability 2015-11-25 00:39:52 -05:00
Al
e4b8349d98 [fix] sparsity of country tags should be enough for language address training data 2015-11-25 00:32:01 -05:00
Al
824c779107 [fix] Cutting down training repeatedly on country names 2015-11-24 23:22:57 -05:00
Al
88529d28e2 [fix] country formatting in language address training data 2015-11-24 23:20:31 -05:00
Al
cd74fcda3c [fix] not requiring minimal keys in format language data 2015-11-24 23:13:28 -05:00
Al
e560e53308 [fix] formatter 2015-11-24 22:27:57 -05:00
Al
8c422a6e61 [osm] Adding new localized country names in language training data for formatted addresses 2015-11-24 21:49:10 -05:00
Al
e40ca0bb89 [fix] Removing house numbers from formatted address language training data, using a simple whitespace splitter 2015-11-24 21:15:22 -05:00
Al
a92cbb8003 [osm] Trying fixed-point precision in converting OSM coordinates to avoid issues with polygon self-intersection when the lines are very close together (e.g. parts of Berlin, UK country polygon) 2015-11-24 15:13:16 -05:00
Al
ef9c5c2ca1 [fix] args 2015-11-24 11:02:35 -05:00
Al
e75c1ce860 [fix] limited addresses 2015-11-24 11:01:22 -05:00
Al
94039f98ad [fix] argument validation in OSM training data script 2015-11-24 10:59:16 -05:00
Al
de9f3120c8 [polygons] Trying a slightly higher value for buffer() as suggested by this issue https://github.com/Toblerity/Shapely/issues/277 2015-11-23 15:43:23 -05:00
Al
6d20d7348f [osm] Using OSM namespaced tags from polygons in the case of non-local languages 2015-11-23 14:42:30 -05:00
Al
e46e1a93a0 [fix] ISO code and simple/international name checks should be on the polygons 2015-11-23 14:30:38 -05:00
Al
eb7488ab55 [fix] Making country replacement probability independent of the probability used for local vs non-local languages 2015-11-23 13:46:14 -05:00
Al
f4f7cceba2 [fix] var, non-local languages 2015-11-23 12:51:26 -05:00
Al
6aa640b5f0 [fix] Moving is_in:country to lower priority 2015-11-23 12:36:05 -05:00
Al
2b1c346fde [osm] Using name:simple and int_name to capture more variations for US addresses, adding ISO codes occasionally instead of names 2015-11-23 12:35:44 -05:00
Al
f1b6620369 [osm/formatting] replacing keys with the highest priority so addr:* tags take precedence over is_in:* tags 2015-11-22 22:25:44 -05:00
Al
2695b5dd26 [osm] Shortening state names obtained from reverse geocoding for relevant countries 2015-11-22 22:09:31 -05:00
Al
8b035814c7 [osm] Change probabilities for country names 2015-11-22 18:52:17 -05:00
Al
04183c672e [fix] non-integer admin levels 2015-11-22 18:33:27 -05:00
Al
7ee8045a0f [fix] comparison 2015-11-22 18:27:05 -05:00
Al
efa0e38e45 [fix] another issue with tokenize API 2015-11-22 18:08:45 -05:00
Al
ce065bb9ec [fix] using new pypostal tokenize API 2015-11-22 18:01:07 -05:00
Al
71afcafe11 [fix] key names 2015-11-22 17:46:56 -05:00
Al
f77ddc71e7 [fix] reverting to old Rtree index filename 2015-11-22 17:25:51 -05:00
Al
ee482e7a07 [fix] import 2015-11-22 16:04:50 -05:00
Al
ee75ffccd5 [fix] import 2015-11-22 15:51:13 -05:00
Al
c6f531ca95 [fix] arguments 2015-11-22 15:35:25 -05:00
Al
c851cf2547 [fix] OSM R-tree 2015-11-22 15:24:35 -05:00
Al
d3703ce6b4 [fix] var name 2015-11-22 14:27:25 -05:00
Al
5b6fbd66e0 [fix] arg 2015-11-22 14:24:05 -05:00
Al
422ea668d8 [fix] import 2015-11-22 14:23:09 -05:00
Al
4f0d6fbf79 [fix] default arg again 2015-11-22 14:22:09 -05:00
Al
4cc275e313 [fix] doc and default arg 2015-11-22 14:21:20 -05:00
Al
c8f47b38a2 [osm/formatting] Adding OSM polygon lookups and neighborhood polygon lookups to the training data in order to provide more variations for the model to work with 2015-11-21 17:05:35 -05:00
Al
9fc60600dd [fix] OSM reverse geocoder polygon ordering 2015-11-20 14:49:37 -05:00
Al
130518fe58 [polygons] OSM reverse geocoder sort levels 2015-11-20 13:52:30 -05:00
Al
b948a8ebd8 [osm] Adding global keys which map to OSM address components 2015-11-20 12:48:54 -05:00
Al
85667997cd [formatting] Adding city_district and state_district tags to address formatting templates where it makes sense. These will not be in all addresses, tags can be added and removed from the training data with certain probabilities 2015-11-20 12:24:51 -05:00
Al
946bce1cb9 [osm] Adding a few more boundary types to planet admin borders 2015-11-17 11:40:42 -05:00
Al
b3ef8ded12 [formatting] Adding OSM address components lookup by country 2015-11-17 11:39:39 -05:00
Al
0b74039a6a [formatting] Adding city_district as a separate format tag 2015-11-17 11:38:38 -05:00