Al
|
e55ff54be1
|
[fix] Adding Korean-Latin-BGN to excluded transliterators
|
2015-12-21 16:24:50 -05:00 |
|
Al
|
682c316775
|
[transliteration] Removing Korean-Latin-BGN: not a great transliterator and, AFAICT, ICU doesn't use it either
|
2015-12-21 12:45:45 -05:00 |
|
Al
|
ccf509edb1
|
[fix] update to control characters for generating the transliteration rules
|
2015-12-20 15:40:38 -05:00 |
|
Al
|
b2a944830a
|
[transliteration] Making sure the Python script to generate transliteration data works on the new CLDR format
|
2015-12-19 00:34:30 -05:00 |
|
Al
|
1d288954d7
|
[osm] Fixing an issue in the training data with house numbers in OSM (seen mostly in Uruguay) where a comma-separated list of house numbers is entered.
|
2015-12-10 18:46:28 -05:00 |
|
Al
|
779298360c
|
[osm] In cases with more than one official language and where the address language can be determined, use it for looking up language-specific OSM polygons
|
2015-12-09 01:00:59 -05:00 |
|
Al
|
aeb72d7d26
|
[osm] Randomly select up to n components for state_district OSM boundaries. For all other fields select one name at random
|
2015-12-09 00:20:20 -05:00 |
|
Al
|
69a469d9d3
|
[osm] Choosing a language at random in countries with multilingual addresses for the parser training data so we get some monolingual examples
|
2015-12-08 20:38:32 -05:00 |
|
Al
|
35db855819
|
[fix] canonical index in address expansion data, should be -1 for all canonical phrases
|
2015-12-08 15:09:51 -05:00 |
|
Al
|
f8a3081d0f
|
[fix] city name in OSM formatting
|
2015-12-07 02:33:12 -05:00 |
|
Al
|
b25a738000
|
[osm] Doing more deduping in the OSM training data to avoid confusing the parser when city, state, district all have the same name
|
2015-12-06 16:14:02 -05:00 |
|
Al
|
dd8f8b4d7b
|
[fix] prefix/suffix regexes
|
2015-12-05 18:41:22 -05:00 |
|
Al
|
5fcb6d2c30
|
[fix] typo
|
2015-12-05 16:23:58 -05:00 |
|
Al
|
3a7ba0288f
|
[fix] .get
|
2015-12-05 16:13:15 -05:00 |
|
Al
|
c92a6de477
|
[fix] name
|
2015-12-05 15:49:50 -05:00 |
|
Al
|
2a4210f93f
|
[osm] Stripping standard city prefixes/suffixes e.g. Township of
|
2015-12-05 15:42:22 -05:00 |
|
Al
|
f41158b8b3
|
[osm] Avoid using the alternate name (e.g. Brooklyn instead of Kings County) when it is the same as city
|
2015-12-05 14:21:07 -05:00 |
|
Al
|
7c26317903
|
[fix] osm components
|
2015-12-03 19:30:15 -05:00 |
|
Al
|
42a8890652
|
[osm] Only removing local language city if there are prior components from OSM
|
2015-12-03 19:11:03 -05:00 |
|
Al
|
ab0a4e622d
|
[formatting] Switching back over to OpenCageData
|
2015-12-03 18:03:21 -05:00 |
|
Al
|
5af95ee613
|
[osm] Adding GeoNames abbreviated city names in a small percentage of cases to get variations like NYC, BK, SF, etc. in the training data
|
2015-12-03 18:00:05 -05:00 |
|
Al
|
218361f43f
|
[osm] Removing multilinestring boundaries from OSM polygon index (often partial boundaries e.g. France-Germany)
|
2015-12-03 00:51:09 -05:00 |
|
Al
|
8484d4fffd
|
[fix] venue names should be removed probabilistically in the training data, giving neighborhoods a slightly better chance of being included
|
2015-11-30 23:28:12 -05:00 |
|
Al
|
6ef40c1769
|
[fix] dupe checking
|
2015-11-30 18:43:11 -05:00 |
|
Al
|
af170de019
|
[fix] Smaller probabilities on adding neighborhoods and admin polygons, eliminating duplicates on the row level
|
2015-11-30 18:35:31 -05:00 |
|
Al
|
621fd79002
|
[fix] var
|
2015-11-30 18:20:26 -05:00 |
|
Al
|
b430fb7657
|
[osm/formatting] Adding pick random name logic to neighborhoods as well, getting rid of drop probabilities as they're covered elsewhere, adding several forms of venue names to the training data
|
2015-11-30 18:10:18 -05:00 |
|
Al
|
d4b6450f19
|
[formatting] Not applying template replacements from address formatting by default
|
2015-11-30 16:11:13 -05:00 |
|
Al
|
839a12b212
|
[osm/formatting] Changing drop probabilities and doing it in random order
|
2015-11-30 15:27:35 -05:00 |
|
Al
|
89677d94a3
|
[parsing] Initial commit of the address parser, training/testing, feature function, I/O
|
2015-11-30 14:48:13 -05:00 |
|
Al
|
9a8ba14887
|
[osm/formatting] Adding per-field drop probabilities to OSM training data to make some fields more likely to be dropped, although it might create more training data
|
2015-11-30 11:10:12 -05:00 |
|
Al
|
c8e4602d4c
|
[fix] Neighborhoods reverse geocoder discriminates between OSM matched with Zetashapes and OSM matched with Quattroshapes
|
2015-11-30 10:59:50 -05:00 |
|
Al
|
15d9e00121
|
[osm/formatting] Adding in more ISO alpha-3 codes for countries in the training data
|
2015-11-28 14:08:07 -05:00 |
|
Al
|
66778737ff
|
[fix] non-local language states
|
2015-11-28 13:48:59 -05:00 |
|
Al
|
69ba631dc9
|
[docs] updating params in OSM training data docs
|
2015-11-28 01:09:14 -05:00 |
|
Al
|
3cd1fee89d
|
[fix] KeyError
|
2015-11-27 14:40:11 -05:00 |
|
Al
|
a77bc03977
|
[fix] language
|
2015-11-27 14:24:32 -05:00 |
|
Al
|
38d4e2d67a
|
[fix] cities
|
2015-11-27 14:05:53 -05:00 |
|
Al
|
3cf98770e3
|
[fix] var name
|
2015-11-27 13:54:38 -05:00 |
|
Al
|
2e0f35b13a
|
[fix] key checks for Quattroshapes cities, removing city in non-local language case
|
2015-11-27 13:45:51 -05:00 |
|
Al
|
105ba313c5
|
[fix] var name
|
2015-11-27 12:00:11 -05:00 |
|
Al
|
3eea355352
|
[fix] argument order
|
2015-11-27 11:47:39 -05:00 |
|
Al
|
51f6a82727
|
[fix] import again
|
2015-11-27 11:38:40 -05:00 |
|
Al
|
644eeb74c6
|
[fix] import
|
2015-11-27 11:17:53 -05:00 |
|
Al
|
2830986073
|
[osm/formatting] Adding in cities from Quattroshapes/GeoNames in the case of non-local languages or in general with a small random probability
|
2015-11-27 11:09:12 -05:00 |
|
Al
|
b0667d0032
|
[fix] only care about levels in Quattroshapes index, not Zetashapes
|
2015-11-26 23:45:50 -05:00 |
|
Al
|
0eb0042826
|
[fix] Same in neighborhoods reverse geocoder lookups
|
2015-11-26 14:17:17 -05:00 |
|
Al
|
4170f6e9e3
|
[fix] same options for geohash-based index
|
2015-11-26 14:14:53 -05:00 |
|
Al
|
4cff1f8a9d
|
[fix] Quattroshapes neighborhoods index uses geohashes for slightly better coverage
|
2015-11-26 12:45:54 -05:00 |
|
Al
|
98d8054a2b
|
[polygons/quattroshapes] Converting Quattroshapes lookups to an R-tree index
|
2015-11-25 19:37:57 -05:00 |
|