825 Commits

Author SHA1 Message Date
Al
1e2894a665 [fix] normalize place names before adding OSM components, modify components in place, delete keys and use the boundary components if the component is ambiguous 2016-07-21 17:04:57 -04:00
Al
cec0d6f6df [fix] tuple 2016-07-21 17:04:57 -04:00
Al
e198bbf23e [fix] whitespace 2016-07-21 17:04:57 -04:00
Al
b190c88cc1 [fix] state 2016-07-21 17:04:57 -04:00
Al
3c6691d295 [fix] kwargs 2016-07-21 17:04:57 -04:00
Al
4909fa7ee1 [fix] deriving whitespace and state in normalized_place_name, adding all candidate languages to arguments 2016-07-21 17:04:57 -04:00
Al
330394ff51 [fix] raw OSM reverse geocoded components vs. normalized version 2016-07-21 17:04:57 -04:00
Al
f7697cf20d [fix] import 2016-07-21 17:04:57 -04:00
Al
e5fdd915d0 [fix] check the first phrase for components and bail if it matches something other than the specified tag 2016-07-21 17:04:57 -04:00
Al
8370a41ec0 [fix] import 2016-07-21 17:04:57 -04:00
Al
651bc32650 [addresses] more thoroughly solving the addr:city='Harlem' issue 2016-07-21 17:04:57 -04:00
Al
5a31b60cbe [addresses] Adding normalized_place_name, a method for separating compound fields like addr:city='New York NY' into simply 'New York', solving the compound phrase problem. Also solves the mislabeled place name problem, causing the system to ignore the user tag and fall back on reverse geocoded components in cases e.g. where addr:city='Harlem', which is a known neighborhood but not a city when reverse geocoded. A few other refactors for expanded address components 2016-07-21 17:04:57 -04:00
Al
52246e0cd0 [formatting] Defining some of the new tag names in AddressFormatter as well as insert_component, which reparses the address formatter template and inserts a given component, removing it from an existing block if necessary 2016-07-21 17:04:57 -04:00
Al
b22fb669b9 [aliases] Adding get method for aliases 2016-07-21 17:04:57 -04:00
Al
c84f50e227 [aliases] packaging up field aliasing 2016-07-21 17:04:57 -04:00
Al
b9ee3be806 [phrases] Using simple string encoding/decoding for default serialize/deserialize in PhraseFilter base class 2016-07-21 17:04:57 -04:00
Al
f00e425891 [osm] Adding parse_osm_number_range for addr:flats and addr:unit 2016-07-21 17:04:57 -04:00
Al
5160a05c5d [fix] typo 2016-07-21 17:04:57 -04:00
Al
8a83634997 [states] Moving state abbreviations config to YAML 2016-07-21 17:04:57 -04:00
Al
771a360a85 [phrases] Using safe_encode/safe_decode as default trie serializer/deserializer 2016-07-21 17:04:57 -04:00
Al
8ae524005a [fix] import 2016-07-21 17:04:57 -04:00
Al
55717f1060 [fix] file encoding 2016-07-21 17:04:57 -04:00
Al
68b70c351b [fix] /postal.text.normalize/geodata.text.normalize/ 2016-07-21 17:04:57 -04:00
Al
f4e6a405e1 [polygons] Moving neighborhoods reverse geocoder to match the naming convention, adding coding: utf-8 2016-07-21 17:04:57 -04:00
Al
4a2d266230 [phrases] adding __init__ to base PhraseFilter 2016-07-21 17:04:57 -04:00
Al
c8ea12e1eb [osm] Adding place=city/town/village/hamlet/municipality to admin borders data set 2016-07-21 17:04:57 -04:00
Al
3b4d3090cd [fix] polygons crossing the international date line 2016-07-21 17:04:57 -04:00
Al
99e634aaba [fix] some weirdness with the dateline and polygons that have a longitude of exactly 180.0 2016-07-21 17:04:57 -04:00
Al
a24fe03b81 [categories] Category query fragment generator. Given a language, key and value, and a flag for plurals, returns a tuple of (category_phrase, preposition, add_place_name) 2016-07-21 17:04:57 -04:00
Al
4933303cec [categories] Config for looking up category-related phrases given a language + OSM key and value (amenity=restaurant, natural=waterfall, etc.) 2016-07-21 17:04:57 -04:00
Al
fa99b4ce77 [addresses] wrapping up some of the functionality from OSM formatter to be used on an arbitrary address component dictionary 2016-07-21 17:04:57 -04:00
Al
a94debc4ed [osm] addr:place can be used for street name, expanded building polygon definitions, fixing boundary polygons 2016-07-21 17:04:57 -04:00
Al
35c2fee3e9 [fix] file encoding 2016-07-21 17:04:57 -04:00
Al
3e9206f223 [fix] __init__.py 2016-07-21 17:04:57 -04:00
Al
0631d0a27d [points] Adding single typed array for points index 2016-07-21 17:04:57 -04:00
Al
cdf8829942 [fix] no longer requiring argv for unicode_properties script 2016-07-21 17:04:57 -04:00
Al
a0e6a828c9 [languages] Adding country_and_languages to the language rtree itself 2016-07-21 17:04:57 -04:00
Al
6703da8fc3 [fix] languages and disambiguation do initialization by default 2016-07-21 17:04:57 -04:00
Al
ee1aa564c4 [normalization] normalize tokens should not replace digits by default 2016-07-21 17:04:57 -04:00
Al
3a9ac9d96f [fix] six.u 2016-07-21 17:04:57 -04:00
Al
49ac3dc553 [disambiguation] Adding best_country_and_language 2016-07-21 17:04:57 -04:00
Al
7b42e52c6a [fix] token_types.PHRASE 2016-07-21 17:04:57 -04:00
Al
e21b793b03 [polygons] Adding ISO3166 alpha 2/3 codes to OSM polygons index 2016-07-21 17:04:57 -04:00
Al
7e5ecb30cf [addresses] sample_alphabet (Zipfian) in PO box rather than a uniform choice 2016-07-21 17:04:57 -04:00
Al
3845c58ca3 [points] Adding load method for point reverse geocoding 2016-07-21 17:04:57 -04:00
Al
c506649252 [fix] languages_intialized 2016-07-21 17:04:57 -04:00
Al
1fd4fbb7a2 [normalization] Adding default token options for numbers so we split alpha from numeric tokens and don't normalize digits 2016-07-21 17:04:57 -04:00
Al
3d765e9eca [addresses] Fixing direction_probability, adding ability to have phrases which only apply to numbers, adding the possibility of null phrases to non-numeric "numbers" e.g. A-Z, etc. 2016-07-21 17:04:57 -04:00
Al
ac00f294c0 [requirements] Adding numpy to Python repo's requirements (only needed for building libpostal, not for using it) 2016-07-21 17:04:57 -04:00
Al
03704fff6a [intersections] Lower memory version of intersection freader 2016-07-21 17:04:57 -04:00
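
Note on 5a31b60cbe: the compound-field case described in that commit (addr:city='New York NY' reduced to 'New York') can be illustrated with a minimal sketch. This is not the repository's normalized_place_name; the function name strip_trailing_state and the small STATE_ABBREVIATIONS table are hypothetical, and the sketch only handles a trailing state abbreviation. It does not attempt the reverse-geocoding fallback used for cases like addr:city='Harlem'.

```python
# Hypothetical lookup table for illustration only; the commit log says the
# project keeps its state abbreviations in a YAML config instead.
STATE_ABBREVIATIONS = {"NY": "New York", "NJ": "New Jersey", "CA": "California"}


def strip_trailing_state(value, abbreviations=STATE_ABBREVIATIONS):
    """Drop a trailing state abbreviation from a compound place name.

    A value that consists of a single token is returned unchanged, so a
    bare 'NY' is not reduced to an empty string.
    """
    # Treat commas as separators so 'New York, NY' and 'New York NY' match.
    tokens = value.replace(",", " ").split()
    if len(tokens) > 1 and tokens[-1].upper() in abbreviations:
        tokens = tokens[:-1]
    return " ".join(tokens)


if __name__ == "__main__":
    for raw in ("New York NY", "New York, NY", "Harlem"):
        print(repr(raw), "->", repr(strip_trailing_state(raw)))
```

Only the last token is compared against the table, so a city whose own name ends in something other than an abbreviation passes through untouched.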