4641 Commits

Author SHA1 Message Date
Al
9a93e95938 [api] removing geodb from setup functions 2017-02-10 01:02:52 -05:00
Al
ff245d74f8 [parser] building an index of postal codes and their valid admin contexts (city, state, country, etc.) during training e.g. "11216" => ["brooklyn", "ny"]. Postal code phrases like CP in Spanish are removed when constructing the index. 2017-02-10 00:50:48 -05:00
Al
40d1b26e12 [openaddresses] Henderson County, KY 2017-02-09 16:23:43 -05:00
Al
d18c68918d [openaddresses] Kern County, CA 2017-02-09 15:23:16 -05:00
Al
598f15cad8 [openaddresses] city of Vilnius, Lithuania 2017-02-09 15:19:34 -05:00
Al
fa3405fe4d [openaddresses] Scott County, KY 2017-02-09 15:17:20 -05:00
Al
f00625029b [openaddresses] Ajax, ON 2017-02-09 15:16:39 -05:00
Al
ce5826928b [openaddresses] add Saskatoon, SK 2017-02-09 15:11:56 -05:00
Al
1aacb5bccc Merge branch 'master' into parser-data 2017-02-09 15:09:28 -05:00
Al
ea168279bd [fix] free json-encoded string in parser client output 2017-02-09 14:34:15 -05:00
Al
38c6c26146 [fix] freeing normalized string in address_parser_parse 2017-02-09 14:33:13 -05:00
Al
8aa3749cfb [utils] some convenience functions for generic hashtables (incr, get, etc) 2017-02-08 19:01:13 -05:00
Al
a6844c8ec1 [parser] structural changes for postal codes index 2017-02-08 18:52:45 -05:00
Al
7a360f4211 [osm] addr:postcode can be all over the place in OSM. Start with postcodes containing commas or semicolons. If addr:postcode (on address of building) contains either, iterate over the values and pick the first one that matches a postcode validation regex for that country 2017-02-08 16:13:29 -05:00
Al
97ccbef807 [openaddresses] adding Lincoln County, WY 2017-02-08 15:54:11 -05:00
Al
30fba16141 [openaddresses] adding Wuppertal, Germany, Marietta GA, Salem OR, and Atlanta 2017-02-08 15:41:17 -05:00
Al
6e4f641743 [phrases] adding token_phrase_memberships to trie_search for reuse 2017-02-08 01:59:39 -05:00
Al
ae35da8d17 [fix] uninitialized var 2017-02-08 01:58:53 -05:00
Al
3a95af104b [openaddresses] remove add_osm_boundaries from City of Anaheim 2017-02-07 14:37:15 -05:00
Al
dbf7242ea0 [fix] /cls/self/ 2017-02-04 19:12:49 -05:00
Al
35effe4b0b [openaddresses] adding state of Thüringen, Germany 2017-02-04 18:00:13 -05:00
Al
af06270896 [openaddresses] adding ignore regexes for US counties where we use the unit, using non_numeric_units in every case 2017-02-04 15:48:00 -05:00
Al
c600f05f06 [openaddresses] adding Czech Republic to the street not required set 2017-02-04 15:30:46 -05:00
Al
0169448a4d [addresses] adding Central European city district regexes (e.g. Praha 1, Budapest IV, etc.) to country-specific cleanup 2017-02-03 20:54:23 -05:00
Al
1b6263a6e7 [openaddresses] add postcode field to NY statewide 2017-02-02 15:00:34 -05:00
Al
990ce176aa [openaddresses] add language and modern city name to Dnipro, Ukraine 2017-02-02 14:02:59 -05:00
Al
c95d5db290 [openaddresses] ignore US postcodes that are 1-4 digits, usually typos. Reformat where needed 2017-02-02 14:02:06 -05:00
Al
85f03184d5 [openaddresses] moving postcode fixes before validation. Adding regex for validating Russian house numbers in the Ukraine 2017-02-02 11:21:00 -05:00
Al
f6e9c5f709 [openaddresses] adding postcode length=5 (so leading zeros get captured if the field was an integer) for Germany, France, Italy, and Mexico. Adding validation to Volgograd oblast 2017-02-02 11:16:38 -05:00
Al
4fbd99d2c8 [openaddresses] add Tacoma, WA 2017-02-01 20:30:21 -05:00
Al
d4d3407f2c [openaddresses] adding Vernon BC, Churchill County NV, and some of the new Georgia sources 2017-01-31 16:01:39 -05:00
Al
12146b6eeb [openaddresses] adding Nacka, Sweden 2017-01-30 02:41:30 -05:00
Al
0380f565d2 [parser] shorter first word feature 2017-01-29 22:10:28 -05:00
Al
1fbdd964b3 [openaddresses] add languages for China and Russia data sets so the validators kick in 2017-01-28 02:15:00 -05:00
Al
12bc18f74b [openaddresses] fix Chinese house number validation 2017-01-28 02:03:19 -05:00
Al
2b349ef8a8 [fix] nevermind, needed to do the Spanish-language street names before validation (simple numeric names like "8" needs to be prefixed with "Calle" or they'll fail validation) 2017-01-28 01:08:10 -05:00
Al
dcacbece8f [openaddresses] adding city_district for Wuhan, China 2017-01-28 01:03:11 -05:00
Al
2953759321 [openaddresses] formatting Chinese house number (with annex adding a second number potentially) and adding Spanish street names after the language is known by reverse geocoding 2017-01-28 01:01:26 -05:00
Al
c9417436f7 [openaddresses] allowing a single character boundary name in ideographic languages 2017-01-27 23:38:03 -05:00
Al
c798f4a83b [places] always include suburb in Japan as it functions as the street 2017-01-27 21:22:12 -05:00
Al
72881ad315 [fix] conditional + var name 2017-01-27 19:20:41 -05:00
Al
987609ee8e [fix] var name 2017-01-27 18:46:58 -05:00
Al
cd1875d077 [fix] import 2017-01-27 18:35:43 -05:00
Al
01d6d47b08 [osm] removing addr:place mapping to road as it's usually a village in post-Soviet states, etc. Can handle it down the road 2017-01-27 13:54:08 -05:00
Al
11345bf2bf [osm] using new constants in OSM formatting as well 2017-01-27 13:53:00 -05:00
Al
b25f5f26ae [openaddresses] not requiring street name in former Soviet countries (may be village + house_number). Only allowing address-only if street is present 2017-01-27 13:17:07 -05:00
Al
82fb5c1dca [countries] moving country constants to a separate module 2017-01-27 13:15:36 -05:00
Al
92163eeac5 Merge branch 'parser-data' of https://github.com/openvenues/libpostal into parser-data 2017-01-27 03:59:07 -05:00
Al
4d5415ee49 [openaddresses] adding Wuhan, China 2017-01-27 03:08:39 -05:00
Al
aefa8dbd25 [openaddresses] adding Venice 2017-01-27 03:04:56 -05:00