Al
|
5cabd9b4f7
|
[fix] country languages in OpenAddresses
|
2016-10-24 17:35:39 -04:00 |
|
Al
|
35d3d8cc73
|
[openaddresses] countries are known a priori, so if the boundaries don't quite line up with OSM, use the country from the path
|
2016-10-23 19:50:54 -04:00 |
|
Al
|
f429bea15b
|
[fix] subtract abs value
|
2016-10-23 01:11:09 -04:00 |
|
Al
|
1658c425c5
|
[fix] clear country cache only at each new country, not each file
|
2016-10-23 00:57:52 -04:00 |
|
Al
|
7199ff17e0
|
[fix] truncate postcodes that are longer than specified length
|
2016-10-23 00:52:24 -04:00 |
|
Al
|
889e914dfc
|
[openaddresses] clear all polygon caches
|
2016-10-23 00:11:54 -04:00 |
|
Al
|
0fd431a9d2
|
[fix] abs
|
2016-10-22 23:55:30 -04:00 |
|
Al
|
ec54d3de35
|
[fix] don't convert number to int/float in numeric_phrase (chops leading zeros)
|
2016-10-22 23:49:58 -04:00 |
|
Al
|
63edd53fb3
|
[openaddresses] adding clear_cache method to clear the LRU cache for point-in-polygon indices and using it in OpenAddresses import since it heavily reuses polygons and only for the current file
|
2016-10-22 20:28:59 -04:00 |
|
Al
|
d51a1d6196
|
[addresses] doing hyphenation for existing components in component expansion (i.e. OSM training data)
|
2016-10-21 22:02:19 -04:00 |
|
Al
|
2a355b2cf8
|
[openaddresses] adding address only 10% of the time in OpenAddresses
|
2016-10-20 23:57:30 -04:00 |
|
Al
|
d965ea9371
|
[openaddresses] adding hyphenation/dehyphenation to the OpenAddresses formatter
|
2016-10-20 20:55:17 -04:00 |
|
Al
|
00ebdfed7f
|
[osm] adding alt_place_names to the shared formatting class AddressComponents and making them classmethods
|
2016-10-20 20:41:22 -04:00 |
|
Al
|
d9bc465c82
|
[osm] parsing out semicolon-delimited postal codes from OSM in countries like Poland that use hyphen delimited postcodes without treating them as number ranges
|
2016-10-19 17:46:42 -04:00 |
|
Al
|
ec77a247fa
|
[fix] just ignore records without the "name" tag
|
2016-10-19 13:36:15 -04:00 |
|
Al
|
61078eded9
|
[fix] checking for dictionary key
|
2016-10-19 13:34:13 -04:00 |
|
Al
|
c2b73307de
|
[fix] parens
|
2016-10-19 13:29:56 -04:00 |
|
Al
|
f639151698
|
[osm] checking for non-admin_center nodes which are part of a lower admin level polygon with the same name
|
2016-10-19 13:27:38 -04:00 |
|
Al
|
e380567ac4
|
[osm] adding alt_place_names method which does hyphenation, de-hyphenation and abbreviated toponyms with/without hyphens
|
2016-10-19 02:19:09 -04:00 |
|
Al
|
51afc2619b
|
[fix] only replace whitespace between words, not for instance whitespace around an existing hyphen, and reducing to one space for spaced hyphens
|
2016-10-19 01:24:54 -04:00 |
|
Al
|
e8899eafd6
|
[osm] adding hyphenation/de-hyphenation to OSM admin components
|
2016-10-19 01:00:29 -04:00 |
|
Al
|
98ac232eea
|
[osm] hyphenating and de-hyphenating place names in places training data
|
2016-10-19 00:33:10 -04:00 |
|
Al
|
72e7d3ff5b
|
[addresses/hyphens] adding some methods to hyphenate/dehyphenate place names at random
|
2016-10-18 19:10:31 -04:00 |
|
Al
|
7e007a49ab
|
[osm] removing place=district mapping globally (means city_district in Hungary) and mapping it specifically to state_district/city_district in the places where it's needed
|
2016-10-18 19:02:36 -04:00 |
|
Al
|
d34faf42b8
|
[osm] fix names with pipes in them
|
2016-10-17 02:32:25 -04:00 |
|
Al
|
a796b41d90
|
[geonames] admin codes on geonames/postal_codes tables
|
2016-10-17 00:21:33 -04:00 |
|
Al
|
ff27ee14bb
|
[osm] only add label props if the name property is identical (counterexample, Nottinghamshire's label is listed as West Bridgford, which is really its admin_center)
|
2016-10-16 22:18:52 -04:00 |
|
Al
|
9fb936019a
|
[geoplanet] script to create GeoPlanet postal codes training data
|
2016-10-12 15:05:45 -04:00 |
|
Al
|
1e6a00c573
|
[fix] place in UK that was parented by a postal_code
|
2016-10-12 15:00:33 -04:00 |
|
Al
|
1d25f08b52
|
[expand] adding a function to check if two place names/addresses are equivalent after token normalization (replacing hyphens, deleting final periods, lowercasing, simple transliteration, etc.) and taking into account abbreviations from any specified libpostal dictionaries. In conjunction with place name affixes, useful in data sets like GeoPlanet or GeoNames to determine if a name variant is related to the original or not
|
2016-10-12 14:55:59 -04:00 |
|
Al
|
f8664b0deb
|
[formatting] making regex-based tests during insert_component optional. If exact_order=True, insert the given component directly before/after the reference component; otherwise, for components that already exist in the template, only the relative position matters. Adding a method to determine if template language is important for a particular country/language pair.
|
2016-10-12 14:42:34 -04:00 |
|
Al
|
2663b81670
|
[address_formatting] caching parsed templates from pystache yields about a 2.5x speedup per call, should shave off several hours of CPU time for large training sets
|
2016-10-11 15:36:49 -04:00 |
|
Al
|
2314acef1b
|
[geoplanet] bypassing Québec as a county (just city and state)
|
2016-10-11 02:33:27 -04:00 |
|
Al
|
02fc172b5c
|
[geoplanet] abbreviations for UK and NYC, fixing country codes for IM, GG and JE
|
2016-10-11 02:11:26 -04:00 |
|
Al
|
6ff1024c02
|
[fix] null candidate languages
|
2016-10-07 19:49:32 -04:00 |
|
Al
|
30074524d8
|
[fix] return empty list for languages in country_and_languages
|
2016-10-07 18:57:22 -04:00 |
|
Al
|
ff7fec6ed1
|
[osm/polygons] need to include id/type in polygon properties now that they're getting added earlier in the pipeline
|
2016-10-07 01:21:02 -04:00 |
|
Al
|
169a3c3d70
|
[osm] drop postcode as well for address-only format
|
2016-10-07 01:10:16 -04:00 |
|
Al
|
4ff3f50e01
|
[fix] Dublin postcode formatting
|
2016-10-07 01:06:37 -04:00 |
|
Al
|
2e8b6e6a29
|
[fix] args
|
2016-10-07 01:03:22 -04:00 |
|
Al
|
0401a04adb
|
[osm] add address-only formats (sans place tags) for every address as well to better handle incomplete queries where location is expected to be inferred by the geocoder, etc.
|
2016-10-07 00:59:52 -04:00 |
|
Al
|
ed26d8e398
|
[geoplanet] a few more GeoPlanet fixes for LocalAdmins in LU and CH
|
2016-10-07 00:34:57 -04:00 |
|
Al
|
ecd71ee10d
|
[fix] var name
|
2016-10-06 15:36:51 -04:00 |
|
Al
|
c44e6280b4
|
[geoplanet] Setting postal codes connected to non-admin features to parent/grandparent features. Setting postal codes connected to unitary authorities in the UK to their respective towns
|
2016-10-06 14:07:01 -04:00 |
|
Al
|
aff12106c4
|
[geoplanet] adding Island place_type
|
2016-10-06 14:04:28 -04:00 |
|
Al
|
b1f386cb11
|
[fix] typo
|
2016-10-06 01:37:42 -04:00 |
|
Al
|
7d5ef87348
|
[fix] geoplanet zip file
|
2016-10-06 01:37:30 -04:00 |
|
Al
|
a67efcffe4
|
[addresses] add new option to use city population to determine whether components should be dropped out
|
2016-10-05 18:16:25 -04:00 |
|
Al
|
66af532850
|
[osm] adding country-specific cleanups to OSM place training data
|
2016-10-05 17:13:13 -04:00 |
|
Al
|
6b0186782d
|
[openaddresses] doing country-specific cleanups in OpenAddresses
|
2016-10-05 17:07:29 -04:00 |
|