Commit Graph

509 Commits

Author | SHA1 | Message | Date
Al | ca6d802a43 | [languages] Moving language id methods into a separate package | 2015-08-21 08:00:56 -04:00
Al | 9d2f7e4bd1 | [fix] var name | 2015-08-18 16:20:12 -04:00
Al | 0528d1b578 | [osm] OSM untagged formatted addresses try to use language namespaced tags | 2015-08-18 16:18:27 -04:00
Al | c09cb4dd82 | [osm] OSM untagged formatted addresses now use the new language labeling scheme | 2015-08-18 15:13:10 -04:00
Al | 3daba2ddcd | [fix] removing debug print | 2015-08-18 13:22:48 -04:00
Al | ffe76f0403 | [languages/osm] Checking for existence of separable prefix/suffix in the given dictionaries | 2015-08-18 12:10:06 -04:00
Al | 0e00625dbd | [languages/osm] Adding a primitive phrase dictionary to the OSM training data construction script and a few heuristics to help disambiguate in the case of small local language groups that may not be specified with name:lang tags e.g. Occitan, Catalan, Basque, Galician, etc. Also throwing away ambiguous multilanguage names | 2015-08-18 11:12:27 -04:00
Al | b72d9af7dc | [fix] items | 2015-08-18 04:17:34 -04:00
Al | f3bb3c8356 | [fix] getter | 2015-08-18 04:13:19 -04:00
Al | ebd5e96bd7 | [fix] name | 2015-08-18 04:05:04 -04:00
Al | b5be1e8df5 | [fix] var name | 2015-08-18 03:56:23 -04:00
Al | e84f932042 | [fix] language polys | 2015-08-18 03:51:30 -04:00
Al | bada7fd13b | [polygons] Changes to languages polygons to support new regional language handling | 2015-08-18 03:27:11 -04:00
Al | d97c725bbc | [languages] Allowing specification of multiple regional languages | 2015-08-18 03:18:52 -04:00
Al | 89071ea21a | [osm] Omitting country in limited address data set (often abbreviated, doesn't convey language as well) | 2015-08-15 03:25:45 -04:00
Al | c505260912 | [fix] var name | 2015-08-15 02:47:31 -04:00
Al | 548ce79b99 | [fix] street addresses by language | 2015-08-15 02:44:04 -04:00
Al | 74a751ce0a | [osm] Adding a new OSM training data option for writing out full formatted addresses without place names | 2015-08-15 02:39:49 -04:00
Al | 05b8f555d5 | [fix] language polygon index | 2015-08-14 21:22:15 -04:00
Al | 0e92abd53e | [osm] Adding building tag to venues training set construction | 2015-08-14 21:07:07 -04:00
Al | 191c0e3ce5 | [languages] Changing Bonaire's default road sign language to Papiamento to help distinguish from Dutch | 2015-08-14 21:06:16 -04:00
Al | cad1f95bbb | [osm] Making minimal_only the default in formatted addresses, expanding list of acceptable combinations of address fields | 2015-08-14 10:21:17 -04:00
Al | 1e936ac9dc | [fix] road+house_number as minimal keys for formatting addresses | 2015-08-14 04:09:51 -04:00
Al | 83bbd67c9c | [fix] param | 2015-08-14 00:57:17 -04:00
Al | e993ddcb51 | [fix] splitter | 2015-08-14 00:54:06 -04:00
Al | dc2766ae5d | [fix] __init__ | 2015-08-14 00:49:06 -04:00
Al | 62c67aa970 | [osm] Using pipe splitter for address components | 2015-08-14 00:45:49 -04:00
Al | 2bd763be03 | [osm] Prefer amenity tag, skip if the building tag is simply building=yes | 2015-08-13 21:16:34 -04:00
Al | c844d0484a | [fix] carriage returns | 2015-08-13 21:07:12 -04:00
Al | ef14aa2b7e | [osm] Replacing escape chars at write time as there's no quoting, adding building key to venue training data | 2015-08-13 19:30:44 -04:00
Al | 9125f07af0 | [polygons] Separating out simplify polygon into a method in RTree index | 2015-08-13 18:43:35 -04:00
Al | 46f2c68a69 | [osm] Using tsv_no_quote writers in all OSM training data files | 2015-08-13 18:40:41 -04:00
Al | 88d63c85d2 | [utils] no-quote CSV dialect | 2015-08-13 18:26:51 -04:00
Al | 03febc7e20 | [scripts] Better script code aliasing | 2015-08-13 18:25:55 -04:00
Al | b54ff95ecc | [mv] csv_utils | 2015-08-13 18:19:54 -04:00
Al | cf70615850 | [transliteration] Doing HTML escapes first in Latin-ASCII transliteration as they may need to be resolved further in subsequent steps | 2015-08-11 23:10:55 -04:00
Al | 51addec5f2 | [fix] check for local CLDR in unicode properties | 2015-08-11 20:23:48 -04:00
Al | 882e4c2ab8 | [fix] ensure CLDR dir | 2015-08-11 20:04:42 -04:00
Al | 48566bf097 | [fix] cldr languages dir | 2015-08-11 20:04:25 -04:00
Al | dd391eabe5 | [numex] Separating rules from keys for Linux gcc compilation | 2015-08-09 01:00:57 -04:00
Al | a5ce1f12dd | [fix] stdint header in address expansion rule generation script | 2015-08-08 23:28:11 -04:00
Al | 1d39916aaa | [fix] Fixing warnings in unicode script data | 2015-08-02 21:30:54 -06:00
Al | cdb9afddd3 | [fix] address training data carriage returns | 2015-07-25 00:35:27 -04:00
Al | 87566bb6a5 | [numex] Adding validation checks for numex JSON | 2015-07-24 15:22:07 -04:00
Al | b27af13f8a | [expansion] Adding an array of dictionaries to each (phrase, canonical) pair | 2015-07-22 20:24:14 -04:00
Al | 64a63fdf51 | [mv] Moving all repo data files to a resources dir, data is only for runtime files | 2015-07-21 18:11:36 -04:00
Al | 7f67ed7dc0 | [fix] less ambiguous variable name in the generated expansions data file | 2015-07-20 02:58:26 -04:00
Al | 5cba747a93 | [fix] variable name | 2015-07-17 03:06:09 -04:00
Al | 5e7bb54a5c | [polygons] only add language polygons if there's one default language | 2015-07-17 02:19:55 -04:00
Al | d5ac816066 | [fix] import | 2015-07-16 13:33:50 -04:00
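
Several of the commits above (88d63c85d2, 46f2c68a69, ef14aa2b7e, c844d0484a) concern writing training data as tab-separated files with no quoting, and replacing problem characters at write time. The repository's actual code is not shown in this log, so the following is only a minimal sketch of what such a dialect might look like with Python's standard csv module. The dialect name `tsv_no_quote` comes from the commit message; the helper names `sanitize` and `write_rows` are hypothetical, and replacing tabs and line breaks with spaces is an assumption about how the escape characters were handled.

```python
import csv
import io

# Register a tab-separated dialect that never quotes fields.
# With QUOTE_NONE and no escapechar, the csv writer raises csv.Error
# if a field contains the delimiter or a line terminator, so fields
# have to be cleaned before they are written.
csv.register_dialect(
    'tsv_no_quote',
    delimiter='\t',
    quoting=csv.QUOTE_NONE,
    quotechar=None,
    lineterminator='\n',
)


def sanitize(field):
    """Replace tabs, carriage returns and newlines with spaces (assumed policy)."""
    return field.replace('\t', ' ').replace('\r', ' ').replace('\n', ' ')


def write_rows(rows):
    """Write rows of strings with the no-quote dialect and return the text."""
    buf = io.StringIO()
    writer = csv.writer(buf, 'tsv_no_quote')
    for row in rows:
        writer.writerow([sanitize(f) for f in row])
    return buf.getvalue()


if __name__ == '__main__':
    print(write_rows([['fr', '12 Rue "de" la Paix\r\nParis']]), end='')
```

Because nothing is quoted or escaped, a reader can split each line on tabs directly, at the cost of losing the original tabs and line breaks inside fields.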