Commit Graph

664 Commits

Author SHA1 Message Date
Al
5a31b60cbe [addresses] Adding normalized_place_name, a method for separating compound fields like addr:city='New York NY' into simply 'New York', solving the compound phrase problem. Also solves the mislabeled place name problem: the system now ignores the user tag and falls back on reverse geocoded components in cases such as addr:city='Harlem', which is a known neighborhood but not a city when reverse geocoded. Also includes a few other refactors for expanded address components 2016-07-21 17:04:57 -04:00
Al
52246e0cd0 [formatting] Defining some of the new tag names in AddressFormatter as well as insert_component, which reparses the address formatter template and inserts a given component, removing it from an existing block if necessary 2016-07-21 17:04:57 -04:00
Al
b22fb669b9 [aliases] Adding get method for aliases 2016-07-21 17:04:57 -04:00
Al
c84f50e227 [aliases] packaging up field aliasing 2016-07-21 17:04:57 -04:00
Al
b9ee3be806 [phrases] Using simple string encoding/decoding for default serialize/deserialize in PhraseFilter base class 2016-07-21 17:04:57 -04:00
Al
f00e425891 [osm] Adding parse_osm_number_range for addr:flats and addr:unit 2016-07-21 17:04:57 -04:00
Al
5160a05c5d [fix] typo 2016-07-21 17:04:57 -04:00
Al
8a83634997 [states] Moving state abbreviations config to YAML 2016-07-21 17:04:57 -04:00
Al
771a360a85 [phrases] Using safe_encode/safe_decode as default trie serializer/deserializer 2016-07-21 17:04:57 -04:00
Al
8ae524005a [fix] import 2016-07-21 17:04:57 -04:00
Al
55717f1060 [fix] file encoding 2016-07-21 17:04:57 -04:00
Al
68b70c351b [fix] /postal.text.normalize/geodata.text.normalize/ 2016-07-21 17:04:57 -04:00
Al
f4e6a405e1 [polygons] Moving neighborhoods reverse geocoder to match the naming convention, adding coding: utf-8 2016-07-21 17:04:57 -04:00
Al
4a2d266230 [phrases] adding __init__ to base PhraseFilter 2016-07-21 17:04:57 -04:00
Al
c8ea12e1eb [osm] Adding place=city/town/village/hamlet/municipality to admin borders data set 2016-07-21 17:04:57 -04:00
Al
3b4d3090cd [fix] polygons crossing the international date line 2016-07-21 17:04:57 -04:00
Al
99e634aaba [fix] some weirdness with the dateline and polygons that have a longitude of exactly 180.0 2016-07-21 17:04:57 -04:00
Al
a24fe03b81 [categories] Category query fragment generator. Given a language, key and value, and a flag for plurals, returns a tuple of (category_phrase, preposition, add_place_name) 2016-07-21 17:04:57 -04:00
Al
4933303cec [categories] Config for looking up category-related phrases given a language + OSM key and value (amenity=restaurant, natural=waterfall, etc.) 2016-07-21 17:04:57 -04:00
Al
fa99b4ce77 [addresses] wrapping up some of the functionality from OSM formatter to be used on an arbitrary address component dictionary 2016-07-21 17:04:57 -04:00
Al
a94debc4ed [osm] addr:place can be used for street name, expanded building polygon definitions, fixing boundary polygons 2016-07-21 17:04:57 -04:00
Al
35c2fee3e9 [fix] file encoding 2016-07-21 17:04:57 -04:00
Al
3e9206f223 [fix] __init__.py 2016-07-21 17:04:57 -04:00
Al
0631d0a27d [points] Adding single typed array for points index 2016-07-21 17:04:57 -04:00
Al
cdf8829942 [fix] no longer requiring argv for unicode_properties script 2016-07-21 17:04:57 -04:00
Al
a0e6a828c9 [languages] Adding country_and_languages to the language rtree itself 2016-07-21 17:04:57 -04:00
Al
6703da8fc3 [fix] languages and disambiguation do initialization by default 2016-07-21 17:04:57 -04:00
Al
ee1aa564c4 [normalization] normalize tokens should not replace digits by default 2016-07-21 17:04:57 -04:00
Al
3a9ac9d96f [fix] six.u 2016-07-21 17:04:57 -04:00
Al
49ac3dc553 [disambiguation] Adding best_country_and_language 2016-07-21 17:04:57 -04:00
Al
7b42e52c6a [fix] token_types.PHRASE 2016-07-21 17:04:57 -04:00
Al
e21b793b03 [polygons] Adding ISO3166 alpha 2/3 codes to OSM polygons index 2016-07-21 17:04:57 -04:00
Al
7e5ecb30cf [addresses] sample_alphabet (Zipfian) in PO box rather than a uniform choice 2016-07-21 17:04:57 -04:00
Al
3845c58ca3 [points] Adding load method for point reverse geocoding 2016-07-21 17:04:57 -04:00
Al
c506649252 [fix] languages_intialized 2016-07-21 17:04:57 -04:00
Al
1fd4fbb7a2 [normalization] Adding default token options for numbers so we split alpha from numeric tokens and don't normalize digits 2016-07-21 17:04:57 -04:00
Al
3d765e9eca [addresses] Fixing direction_probability, adding ability to have phrases which only apply to numbers, adding the possibility of null phrases to non-numeric "numbers" e.g. A-Z, etc. 2016-07-21 17:04:57 -04:00
Al
ac00f294c0 [requirements] Adding numpy to Python repo's requirements (only needed for building libpostal, not for using it) 2016-07-21 17:04:57 -04:00
Al
03704fff6a [intersections] Lower memory version of intersection reader 2016-07-21 17:04:57 -04:00
Al
620f0594aa [points] haversine distance in a different method 2016-07-21 17:04:57 -04:00
Al
d5dc34ec1d [gazetteers] moving PHRASE to a token type 2016-07-21 17:04:57 -04:00
Al
04a5a9e611 [fix] Removing YAML inheritance as it doesn't merge nested dictionaries 2016-07-21 17:04:57 -04:00
Al
f3bbe2ee74 [fix] file rename 2016-07-21 17:04:57 -04:00
Al
9f37a26a6d [points] Adding point reverse geocoding index 2016-07-21 17:04:57 -04:00
Al
9977a7a254 [mv] Moving osm_admin_boundaries to just admin_boundaries 2016-07-21 17:04:57 -04:00
Al
37747709ee [addresses] Using YAML inheritance instead of baking it into the config parser 2016-07-21 17:04:57 -04:00
Al
cd10951afb [addresses] Generalizing the functions used for address configs so they can be reused for per-country OSM configs, etc. 2016-07-21 17:04:57 -04:00
Al
79368f3f02 [intersections] Intersections generator for OSM 2016-07-21 17:04:57 -04:00
Al
799bbe4912 [neighborhoods] Moving neighborhoods index to its own package 2016-07-21 17:04:57 -04:00
Al
8aac200d74 [addresses] config for phrases around postcodes like CP in Spanish 2016-07-21 17:04:57 -04:00