Commit Graph

90 Commits

Author SHA1 Message Date
Al
373708b595 [openaddresses] replace name affixes (remove things like "city of"), prune duplicate names, remove numeric boundary names, cleanup boundary names, and add house number + postcode phrases where appropriate 2016-09-22 00:57:11 -04:00
Al
d667039397 [openaddresses] for configs with add_osm_boundaries=true, skip adding boundary fields from the OA file altogether when they're specified 2016-09-16 01:55:36 -04:00
Al
95cf6ad0fa [fix] default again 2016-09-16 01:11:59 -04:00
Al
d5a5104de9 [fix] default 2016-09-16 01:10:19 -04:00
Al
32ad1d7bd0 [fix] var name 2016-09-16 01:07:10 -04:00
Al
b618d1eaf2 [fix] var name 2016-09-16 01:02:47 -04:00
Al
9b250a9393 [openaddresses] adding zero-padding option for postcodes and using in Puerto Rico 2016-09-15 11:22:55 -04:00
Al
551cce8cb1 [fix] making a separate gazetteer for toponym abbreviations 2016-09-10 01:08:58 -04:00
Al
bcde9e2fe7 [fix] toponym abbreviations after country name, may want to use it 2016-09-10 00:49:31 -04:00
Al
bbc5131cb6 [fix] toponym abbreviations 2016-09-10 00:48:31 -04:00
Al
19a044f7f3 [fix] imports 2016-09-10 00:09:11 -04:00
Al
ae02b0769d [openaddresses] abbreviating boundary components for OpenAddresses 2016-09-10 00:04:11 -04:00
Al
5d26ab41e7 [openaddresses] removing OpenAddresses hacks now that upstream changes are merged 2016-09-09 09:40:45 -04:00
Al
170e8d74d8 [fix] checking for components 2016-09-08 03:19:10 -04:00
Al
769a65b808 [openaddresses] adding place-only and place+postcode probability to OpenAddresses to capture more place names not in OSM as standalone queries 2016-09-08 03:17:21 -04:00
Al
7e7ee7462a [fix] dutch house number formatting, strip spaces 2016-09-02 14:47:52 -04:00
Al
95384e5a2c [openaddresses] adding hack for Honolulu until join function can handle null in OpenAddresses 2016-09-02 14:29:40 -04:00
Al
4e9f88594b [fix] /safe_encode/safe_decode/ 2016-09-02 13:50:48 -04:00
Al
8fd69b5e4a [fix] args 2016-09-02 12:03:24 -04:00
Al
df8e781e02 [openaddresses] adding hack for Italy until machine's join function handles null fields 2016-09-02 12:01:04 -04:00
Al
5957f45f40 [fix] strings 2016-09-02 05:00:39 -04:00
Al
4ab749d962 [fix] format_address with minimal_only=False 2016-09-02 04:59:03 -04:00
Al
d70662e6d7 [fix] postcodes 2016-09-02 04:42:34 -04:00
Al
bb1c071623 [fix] config move 2016-09-02 04:16:59 -04:00
Al
552ebf2bcf [fix] var name 2016-09-02 04:03:47 -04:00
Al
a4a09fcb3e [openaddresses] don't allow postcodes that are all zeroes with a dash (Poland, US ZIP+4) 2016-09-02 03:39:28 -04:00
Al
3f7bfca1ad [openaddresses] allowing house numbers with slashes as well as number + specific fractions separated by space 2016-09-02 02:53:26 -04:00
Al
cdfa9e11bf [openaddresses] excluding all streets with "unknown" in the name. Though possibly excluding one or two valid addresses, the gains far outweigh the costs 2016-09-01 17:45:12 -04:00
Al
3aef7e5b8b [openaddresses] making a few methods classmethods so they're easier to test 2016-09-01 17:42:07 -04:00
Al
c3c949a147 [openaddresses] adding the Netherlands with some hacks for house number until the new format function is deployed in OpenAddresses 2016-09-01 17:41:27 -04:00
Al
e98cf67f0e [openaddresses] also allowing house numbers like "37/A" 2016-08-29 22:56:36 -04:00
Al
78a210c409 [openaddresses] replacing backticks with apostrophe, comes up in several countries 2016-08-29 21:42:10 -04:00
Al
3f5b3dcb1d [openaddresses] Allowing slashes in house numbers in OpenAddresses 2016-08-29 21:26:33 -04:00
Al
ebb34bcc2f [openaddresses] config option to skip rows missing specific fields 2016-08-29 19:19:32 -04:00
Al
f5b2b6327e [openaddresses] Using a download script to download the individual OA files of interest rather than the collected file with expansions applied 2016-08-29 00:34:39 -04:00
Al
a0cf6ff225 [openaddresses] Allowing house numbers like "11 C" 2016-08-28 19:11:41 -04:00
Al
ac403bbe49 [openaddresses] Adding sin numero validator (sem numero in this case) for Portuguese 2016-08-28 18:39:19 -04:00
Al
27c5c8536a [openaddresses] adding debug argument to OpenAddresses training data 2016-08-28 17:58:41 -04:00
Al
6740e5a1c6 [fix] var name 2016-08-28 17:55:10 -04:00
Al
7ea47126ba [fix] logging 2016-08-28 15:54:55 -04:00
Al
a58194ca2e [fix] add_admin_boundaries and adding cleaned up house number 2016-08-28 15:15:57 -04:00
Al
51590825ee [fix] do component dropout anyway 2016-08-28 14:07:49 -04:00
Al
f69e63e311 [openaddresses] Place component dropout. Obtain population from OSM components when we have them but otherwise assume it's actually 0 (not unknown), that way the more conservative probabilities will be used i.e. state names will be included more often rather than unqualified cities 2016-08-28 13:59:28 -04:00
Al
dea5fbbf2e [logging] printing off filenames in constructing OpenAddresses training data 2016-08-28 12:11:53 -04:00
Al
3da80b0706 [fix] typo 2016-08-28 11:55:40 -04:00
Al
aa62b8e8b4 [fix] indentation 2016-08-28 11:48:27 -04:00
Al
b8b1ac1261 [openaddresses] Handling validation after cleanup, adding per-field regex replacements 2016-08-28 11:47:30 -04:00
Al
3ae7a15960 [openaddresses] Adding a few special cases for Spanish. Rewrite simple numeric street names to include the oft-omitted Calle (e.g. 27 => Calle 27), which is uniformly omitted in the Spanish-language data in OpenAddresses while still being valid for grid-based cities like Mérida. Humans and signs usually add Calle for numeric streets while it may be omitted for named streets 2016-08-27 15:03:23 -04:00
Al
15f9817933 [openaddresses] Replacing number sign in house number 2016-08-27 02:42:06 -04:00
Al
01ac1371b5 [openaddresses] Cleaning up house numbers as well, which can sometimes be stored as floats 2016-08-27 01:50:05 -04:00
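
Several of the commits above describe small field-level cleanups: house numbers stored as floats, number signs in house numbers, numeric Spanish street names rewritten as "Calle 27", zero-padded postcodes for Puerto Rico, and rejecting all-zero postcodes that contain a dash. The following is a minimal sketch of what such cleanups could look like. Every function name, parameter, and regex here is hypothetical and written for illustration only; none of it is taken from the repository's actual code.

```python
import re


def clean_house_number(value):
    """Hypothetical helper: normalize a raw house number field."""
    s = str(value).strip()
    # Values sometimes arrive as floats, e.g. 12.0 -> "12"
    if re.fullmatch(r"[0-9]+\.0+", s):
        s = s.split(".", 1)[0]
    # Drop the number sign, e.g. "#12" -> "12"; slashes such as "37/A" are kept
    return s.replace("#", "").strip()


def add_calle(street):
    """Hypothetical helper: rewrite a purely numeric street name, 27 => Calle 27."""
    s = street.strip()
    if re.fullmatch(r"[0-9]+", s):
        return "Calle " + s
    return s


def valid_postcode(postcode):
    """Hypothetical validator: reject all-zero postcodes containing a dash."""
    return re.fullmatch(r"0+(-0+)+", postcode.strip()) is None


def pad_postcode(postcode, width=5):
    """Hypothetical helper: left-pad a postcode with zeros (width is an assumption)."""
    return postcode.strip().zfill(width)
```

Keeping each rule as a small pure function makes it straightforward to test in isolation, which is in the same spirit as the commit that turns a few methods into classmethods so they are easier to test.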