521 Commits

Author SHA1 Message Date
Al
69ba631dc9 [docs] updating params in OSM training data docs 2015-11-28 01:09:14 -05:00
Al
3cd1fee89d [fix] KeyError 2015-11-27 14:40:11 -05:00
Al
a77bc03977 [fix] language 2015-11-27 14:24:32 -05:00
Al
38d4e2d67a [fix] cities 2015-11-27 14:05:53 -05:00
Al
3cf98770e3 [fix] var name 2015-11-27 13:54:38 -05:00
Al
2e0f35b13a [fix] key checks for Quattroshapes cities, removing city in non-local language case 2015-11-27 13:45:51 -05:00
Al
105ba313c5 [fix] var name 2015-11-27 12:00:11 -05:00
Al
3eea355352 [fix] argument order 2015-11-27 11:47:39 -05:00
Al
51f6a82727 [fix] import again 2015-11-27 11:38:40 -05:00
Al
644eeb74c6 [fix] import 2015-11-27 11:17:53 -05:00
Al
2830986073 [osm/formatting] Adding in cities from Quattroshapes/GeoNames in the case of non-local languages or in general with a small random probability 2015-11-27 11:09:12 -05:00
Al
b0667d0032 [fix] only care about levels in Quattroshapes index, not Zetashapes 2015-11-26 23:45:50 -05:00
Al
0eb0042826 [fix] Same in neighborhoods reverse geocoder lookups 2015-11-26 14:17:17 -05:00
Al
4170f6e9e3 [fix] same options for geohash-based index 2015-11-26 14:14:53 -05:00
Al
4cff1f8a9d [fix] Quattroshapes neighborhoods index uses geohashes for slightly better coverage 2015-11-26 12:45:54 -05:00
Al
98d8054a2b [polygons/quattroshapes] Converting Quattroshapes lookups to an R-tree index 2015-11-25 19:37:57 -05:00
Al
8a8e45f2a6 [fix] filenames 2015-11-25 18:08:04 -05:00
Al
bd88628a98 [polygons/quattroshapes] Removing local admin and neighborhoods from the Quattroshapes reverse geocoder since they're covered in neighborhoods 2015-11-25 18:06:14 -05:00
Al
40d18aa7f6 [polygons/osm] Switching back to buffer(0). Still destroys many polygons, may need to look into another solution 2015-11-25 17:10:50 -05:00
Al
a50c971732 [polygons/osm] Omitting last node in every way of a connected component since that node is equal to the start node of its neighbor 2015-11-25 17:09:19 -05:00
Al
d6d5eab989 [geonames] Adding ability to look up GeoNames alternate names (may obtain IDs from Quattroshapes). Not great for local-language primary names (OSM remains the best) but decent for extracting foreign toponyms 2015-11-25 17:07:14 -05:00
Al
3217fa39cd [fix] add country randomly in the formatted language training data in cases where country is not present 2015-11-25 14:54:41 -05:00
Al
1a6618957b [fix] Python float precision doesn't appear to be the problem 2015-11-25 11:29:08 -05:00
Al
5781813cbd [fix] For countries like Denmark, removing country with a smaller probability 2015-11-25 00:39:52 -05:00
Al
e4b8349d98 [fix] sparsity of country tags should be enough for language address training data 2015-11-25 00:32:01 -05:00
Al
824c779107 [fix] Cutting down training repeatedly on country names 2015-11-24 23:22:57 -05:00
Al
88529d28e2 [fix] country formatting in language address training data 2015-11-24 23:20:31 -05:00
Al
cd74fcda3c [fix] not requiring minimal keys in format language data 2015-11-24 23:13:28 -05:00
Al
e560e53308 [fix] formatter 2015-11-24 22:27:57 -05:00
Al
8c422a6e61 [osm] Adding new localized country names in language training data for formatted addresses 2015-11-24 21:49:10 -05:00
Al
e40ca0bb89 [fix] Removing house numbers from formatted address language training data, using a simple whitespace splitter 2015-11-24 21:15:22 -05:00
Al
a92cbb8003 [osm] Trying fixed-point precision in converting OSM coordinates to avoid issues with polygon self-intersection when the lines are very close together (e.g. parts of Berlin, UK country polygon) 2015-11-24 15:13:16 -05:00
Al
ef9c5c2ca1 [fix] args 2015-11-24 11:02:35 -05:00
Al
e75c1ce860 [fix] limited addresses 2015-11-24 11:01:22 -05:00
Al
94039f98ad [fix] argument validation in OSM training data script 2015-11-24 10:59:16 -05:00
Al
de9f3120c8 [polygons] Trying a slightly higher value for buffer() as suggested by this issue https://github.com/Toblerity/Shapely/issues/277 2015-11-23 15:43:23 -05:00
Al
6d20d7348f [osm] Using OSM namespaced tags from polygons in the case of non-local languages 2015-11-23 14:42:30 -05:00
Al
e46e1a93a0 [fix] ISO code and simple/international name checks should be on the polygons 2015-11-23 14:30:38 -05:00
Al
eb7488ab55 [fix] Making country replacement probability independent of the probability used for local vs non-local languages 2015-11-23 13:46:14 -05:00
Al
f4f7cceba2 [fix] var, non-local languages 2015-11-23 12:51:26 -05:00
Al
6aa640b5f0 [fix] Moving is_in:country to lower priority 2015-11-23 12:36:05 -05:00
Al
2b1c346fde [osm] Using name:simple and int_name to capture more variations for US addresses, adding ISO codes occasionally instead of names 2015-11-23 12:35:44 -05:00
Al
f1b6620369 [osm/formatting] replacing keys with the highest priority so addr:* tags take precedence over is_in:* tags 2015-11-22 22:25:44 -05:00
Al
2695b5dd26 [osm] Shortening state names obtained from reverse geocoding for relevant countries 2015-11-22 22:09:31 -05:00
Al
8b035814c7 [osm] Change probabilities for country names 2015-11-22 18:52:17 -05:00
Al
04183c672e [fix] non-integer admin levels 2015-11-22 18:33:27 -05:00
Al
7ee8045a0f [fix] comparison 2015-11-22 18:27:05 -05:00
Al
efa0e38e45 [fix] another issue with tokenize API 2015-11-22 18:08:45 -05:00
Al
ce065bb9ec [fix] using new pypostal tokenize API 2015-11-22 18:01:07 -05:00
Al
71afcafe11 [fix] key names 2015-11-22 17:46:56 -05:00