Al
|
8030b235e6
|
[languages] Changing the definition in script languages so only languages that appear on street signs will be used
|
2016-01-17 22:03:41 -05:00 |
|
Al
|
3d7dd8966e
|
[languages] Using unicode script in language disambiguation in addition to dictionaries. Eliminating dependency on address_normalizer
|
2016-01-17 18:28:28 -05:00 |
|
Al
|
fa32eacdd1
|
[phrases] Adding Python phrase filter from address_normalizer until a Python wrapper around libpostal's trie_search is available
|
2016-01-17 15:45:02 -05:00 |
|
Al
|
f79a3c5bf4
|
[osm/polygons] Allowing polygons that GEOS claims are invalid in OSM polygon index (there were some glaring omissions from the index like the polygons for the UK or Berlin). For some reason .buffer(0) creates weird multipolygons that no longer contain their centroids, etc. and aren't useful in reverse geocoding
|
2016-01-17 15:43:21 -05:00 |
|
Al
|
04f251c1cc
|
[polygons] Don't call fix_polygon (force polygon validity) by default
|
2016-01-16 21:21:27 -05:00 |
|
Al
|
19a5541a85
|
[polygons/osm] append polygon nodes by vertices that connect to each other
|
2016-01-16 21:20:49 -05:00 |
|
Al
|
58e53cab1c
|
[scripts] Adding the tokenize/normalize wrappers directly into the internal geodata package so pypostal can be maintained in an independent repo
|
2016-01-12 13:29:31 -05:00 |
|
Al
|
e9e05bb929
|
[transliteration] Distinguishing between variables with numbers and backreferences in transliteration rules
|
2015-12-23 13:07:44 -05:00 |
|
Al
|
e55ff54be1
|
[fix] Adding Korean-Latin-BGN to excluded transliterators
|
2015-12-21 16:24:50 -05:00 |
|
Al
|
682c316775
|
[transliteration] Removing Korean-Latin-BGN, not a great transliterator and AFAICT, ICU doesn't use it either
|
2015-12-21 12:45:45 -05:00 |
|
Al
|
ccf509edb1
|
[fix] update to control characters for generating the transliteration rules
|
2015-12-20 15:40:38 -05:00 |
|
Al
|
b2a944830a
|
[transliteration] Making sure the Python script to generate transliteration data works on the new CLDR format
|
2015-12-19 00:34:30 -05:00 |
|
Al
|
1d288954d7
|
[osm] Fixing an issue in the training data with house numbers in OSM (seen mostly in Uruguay) where a comma-separated list of house numbers is entered.
|
2015-12-10 18:46:28 -05:00 |
|
Al
|
779298360c
|
[osm] In cases with more than one official language and where the address language can be determined, use it for looking up language-specific OSM polygons
|
2015-12-09 01:00:59 -05:00 |
|
Al
|
aeb72d7d26
|
[osm] Randomly select up to n components for state_district OSM boundaries. For all other fields select one name at random
|
2015-12-09 00:20:20 -05:00 |
|
Al
|
69a469d9d3
|
[osm] Choosing a language at random in countries with multilingual addresses for the parser training data so we get some monolingual examples
|
2015-12-08 20:38:32 -05:00 |
|
Al
|
35db855819
|
[fix] canonical index in address expansion data, should be -1 for all canonical phrases
|
2015-12-08 15:09:51 -05:00 |
|
Al
|
f8a3081d0f
|
[fix] city name in OSM formatting
|
2015-12-07 02:33:12 -05:00 |
|
Al
|
b25a738000
|
[osm] Doing more deduping in the OSM training data to avoid confusing the parser when city, state, district all have the same name
|
2015-12-06 16:14:02 -05:00 |
|
Al
|
dd8f8b4d7b
|
[fix] prefix/suffix regexes
|
2015-12-05 18:41:22 -05:00 |
|
Al
|
5fcb6d2c30
|
[fix] typo
|
2015-12-05 16:23:58 -05:00 |
|
Al
|
3a7ba0288f
|
[fix] .get
|
2015-12-05 16:13:15 -05:00 |
|
Al
|
c92a6de477
|
[fix] name
|
2015-12-05 15:49:50 -05:00 |
|
Al
|
2a4210f93f
|
[osm] Stripping standard city prefixes/suffixes e.g. Township of
|
2015-12-05 15:42:22 -05:00 |
|
Al
|
f41158b8b3
|
[osm] Avoid using the alternate name (e.g. Brooklyn instead of Kings County) when it is the same as city
|
2015-12-05 14:21:07 -05:00 |
|
Al
|
7c26317903
|
[fix] osm components
|
2015-12-03 19:30:15 -05:00 |
|
Al
|
42a8890652
|
[osm] Only removing local language city if there are prior components from OSM
|
2015-12-03 19:11:03 -05:00 |
|
Al
|
ab0a4e622d
|
[formatting] Switching back over to OpenCageData
|
2015-12-03 18:03:21 -05:00 |
|
Al
|
5af95ee613
|
[osm] Adding GeoNames abbreviated city names in a small percentage of cases to get variations like NYC, BK, SF, etc. in the training data
|
2015-12-03 18:00:05 -05:00 |
|
Al
|
218361f43f
|
[osm] Removing multilinestring boundaries from OSM polygon index (often partial boundaries e.g. France-Germany)
|
2015-12-03 00:51:09 -05:00 |
|
Al
|
8484d4fffd
|
[fix] venue names should be removed probabilistically in the training data, giving neighborhoods a slightly better chance of being included
|
2015-11-30 23:28:12 -05:00 |
|
Al
|
6ef40c1769
|
[fix] dupe checking
|
2015-11-30 18:43:11 -05:00 |
|
Al
|
af170de019
|
[fix] Smaller probabilities on adding neighborhoods and admin polygons, eliminating duplicates on the row level
|
2015-11-30 18:35:31 -05:00 |
|
Al
|
621fd79002
|
[fix] var
|
2015-11-30 18:20:26 -05:00 |
|
Al
|
b430fb7657
|
[osm/formatting] Adding pick random name logic to neighborhoods as well, getting rid of drop probabilities as they're covered elsewhere, adding several forms of venue names to the training data
|
2015-11-30 18:10:18 -05:00 |
|
Al
|
d4b6450f19
|
[formatting] Not applying template replacements from address formatting by default
|
2015-11-30 16:11:13 -05:00 |
|
Al
|
839a12b212
|
[osm/formatting] Changing drop probabilities and doing it in random order
|
2015-11-30 15:27:35 -05:00 |
|
Al
|
89677d94a3
|
[parsing] Initial commit of the address parser, training/testing, feature function, I/O
|
2015-11-30 14:48:13 -05:00 |
|
Al
|
9a8ba14887
|
[osm/formatting] Adding per-field drop probabilities to OSM training data to make some fields more likely to be dropped, although it might create more training data
|
2015-11-30 11:10:12 -05:00 |
|
Al
|
c8e4602d4c
|
[fix] Neighborhoods reverse geocoder discriminates between OSM matched with Zetashapes and OSM matched with Quattroshapes
|
2015-11-30 10:59:50 -05:00 |
|
Al
|
15d9e00121
|
[osm/formatting] Adding in more ISO alpha-3 codes for countries in the training data
|
2015-11-28 14:08:07 -05:00 |
|
Al
|
66778737ff
|
[fix] non-local language states
|
2015-11-28 13:48:59 -05:00 |
|
Al
|
69ba631dc9
|
[docs] updating params in OSM training data docs
|
2015-11-28 01:09:14 -05:00 |
|
Al
|
3cd1fee89d
|
[fix] KeyError
|
2015-11-27 14:40:11 -05:00 |
|
Al
|
a77bc03977
|
[fix] language
|
2015-11-27 14:24:32 -05:00 |
|
Al
|
38d4e2d67a
|
[fix] cities
|
2015-11-27 14:05:53 -05:00 |
|
Al
|
3cf98770e3
|
[fix] var name
|
2015-11-27 13:54:38 -05:00 |
|
Al
|
2e0f35b13a
|
[fix] key checks for Quattroshapes cities, removing city in non-local language case
|
2015-11-27 13:45:51 -05:00 |
|
Al
|
105ba313c5
|
[fix] var name
|
2015-11-27 12:00:11 -05:00 |
|
Al
|
3eea355352
|
[fix] argument order
|
2015-11-27 11:47:39 -05:00 |
|