| Al | 4ab749d962 | [fix] format_address with minimal_only=False | 2016-09-02 04:59:03 -04:00 |
| Al | d70662e6d7 | [fix] postcodes | 2016-09-02 04:42:34 -04:00 |
| Al | bb1c071623 | [fix] config move | 2016-09-02 04:16:59 -04:00 |
| Al | 552ebf2bcf | [fix] var name | 2016-09-02 04:03:47 -04:00 |
| Al | a4a09fcb3e | [openaddresses] don't allow postcodes that are all zeroes with a dash (Poland, US ZIP+4) | 2016-09-02 03:39:28 -04:00 |
| Al | 3f7bfca1ad | [openaddresses] allowing house numbers with slashes as well as number + specific fractions separated by space | 2016-09-02 02:53:26 -04:00 |
| Al | cdfa9e11bf | [openaddresses] excluding all streets with "unknown" in the name. Though possibly excluding one or two valid addresses, the gains far outweigh the costs | 2016-09-01 17:45:12 -04:00 |
| Al | 3aef7e5b8b | [openaddresses] making a few methods classmethods so they're easier to test | 2016-09-01 17:42:07 -04:00 |
| Al | c3c949a147 | [openaddresses] adding the Netherlands with some hacks for house number until the new format function is deployed in OpenAddresses | 2016-09-01 17:41:27 -04:00 |
| Al | d7dab92f7b | [fix] var name | 2016-08-31 17:45:50 -04:00 |
| Al | be6c01f5fd | [fix] csv | 2016-08-31 17:45:04 -04:00 |
| Al | d3da513375 | [fix] import | 2016-08-31 17:44:16 -04:00 |
| Al | 4ed362d5f8 | [openaddresses] adding script option to download all completed OA files instead of just what's in the config | 2016-08-31 17:43:07 -04:00 |
| Al | e98cf67f0e | [openaddresses] also allowing house numbers like "37/A" | 2016-08-29 22:56:36 -04:00 |
| Al | 78a210c409 | [openaddresses] replacing backticks with apostrophe, comes up in several countries | 2016-08-29 21:42:10 -04:00 |
| Al | 3f5b3dcb1d | [openaddresses] Allowing slashes in house numbers in OpenAddresses | 2016-08-29 21:26:33 -04:00 |
| Al | ebb34bcc2f | [openaddresses] config option to skip rows missing specific fields | 2016-08-29 19:19:32 -04:00 |
| Al | 9b9036243c | [fix] overwrite on unzip, logging | 2016-08-29 00:40:11 -04:00 |
| Al | 5b5af04a44 | [fix] redundant line | 2016-08-29 00:37:17 -04:00 |
| Al | 6284ec39db | [fix] name | 2016-08-29 00:36:45 -04:00 |
| Al | 75ece5f5e9 | [fix] import | 2016-08-29 00:36:22 -04:00 |
| Al | f5b2b6327e | [openaddresses] Using a download script to download the individual OA files of interest rather than the collected file with expansions applied | 2016-08-29 00:34:39 -04:00 |
| Al | a0cf6ff225 | [openaddresses] Allowing house numbers like "11 C" | 2016-08-28 19:11:41 -04:00 |
| Al | ac403bbe49 | [openaddresses] Adding sin numero validator (sem numero in this case) for Portuguese | 2016-08-28 18:39:19 -04:00 |
| Al | 27c5c8536a | [openaddresses] adding debug argument to OpenAddresses training data | 2016-08-28 17:58:41 -04:00 |
| Al | 6740e5a1c6 | [fix] var name | 2016-08-28 17:55:10 -04:00 |
| Al | 7ea47126ba | [fix] logging | 2016-08-28 15:54:55 -04:00 |
| Al | a58194ca2e | [fix] add_admin_boundaries and adding cleaned up house number | 2016-08-28 15:15:57 -04:00 |
| Al | 51590825ee | [fix] do component dropout anyway | 2016-08-28 14:07:49 -04:00 |
| Al | f69e63e311 | [openaddresses] Place component dropout. Obtain population from OSM components when we have them but otherwise assume it's actually 0 (not unknown), that way the more conservative probabilities will be used i.e. state names will be included more often rather than unqualified cities | 2016-08-28 13:59:28 -04:00 |
| Al | dea5fbbf2e | [logging] printing off filenames in constructing OpenAddresses training data | 2016-08-28 12:11:53 -04:00 |
| Al | 3da80b0706 | [fix] typo | 2016-08-28 11:55:40 -04:00 |
| Al | aa62b8e8b4 | [fix] indentation | 2016-08-28 11:48:27 -04:00 |
| Al | b8b1ac1261 | [openaddresses] Handling validation after cleanup, adding per-field regex replacements | 2016-08-28 11:47:30 -04:00 |
| Al | 3ae7a15960 | [openaddresses] Adding a few special cases for Spanish. Rewrite simple numeric street names to include the oft-omitted Calle (e.g. 27 => Calle 27), which is uniformly omitted in the Spanish-language data in OpenAddresses while still being valid for grid-based cities like Mérida. Humans and signs usually add Calle for numeric streets while it may be omitted for named streets | 2016-08-27 15:03:23 -04:00 |
| Al | 15f9817933 | [openaddresses] Replacing number sign in house number | 2016-08-27 02:42:06 -04:00 |
| Al | 01ac1371b5 | [openaddresses] Cleaning up house numbers as well, which can sometimes be stored as floats | 2016-08-27 01:50:05 -04:00 |
| Al | 4ed394cc1c | [openaddresses] Omitting fields with the value "unknown" | 2016-08-27 00:46:21 -04:00 |
| Al | 6723fff9b4 | [fix] unit phrases | 2016-08-27 00:23:51 -04:00 |
| Al | d29e4f3b2e | [openaddresses] Adding optional hyphen between unit number | 2016-08-26 23:46:19 -04:00 |
| Al | 8c6a4c763c | [openaddresses] Increasing limit to 3 characters for unit abbreviations in case anything clashes (not a huge issue if a few units are tacked on, but this seems more common in OpenAddresses than OSM) | 2016-08-26 23:43:53 -04:00 |
| Al | 12d429b63d | [openaddresses] Simple regex-based method to strip unit phrases tacked onto the end of a street | 2016-08-26 22:39:13 -04:00 |
| Al | 318ad2a0c4 | [openaddresses] Removing <Null> tag from values in OpenAddresses, seeing it in Colorado county files | 2016-08-26 21:42:00 -04:00 |
| Al | 0f9e8ee95d | [openaddresses] Better handling of float postcodes | 2016-08-26 20:16:04 -04:00 |
| Al | 56329439af | [openaddresses] some postcodes in OpenAddresses are stored as floats, convert to int and then to string if that's the case | 2016-08-26 19:12:48 -04:00 |
| Al | 2b9d58dcbe | [openaddresses] Ignoring fields with null-like values as well (there appear to be no valid places named Null or None...yet) | 2016-08-26 15:49:36 -04:00 |
| Al | 2654683af4 | [openaddresses] Adding quick-and-dirty regex-based exclusion list for fields containing various patterns in OpenAddresses, to be used sparingly | 2016-08-26 15:35:51 -04:00 |
| Al | 4e9f9e8957 | [openaddresses] Replace multiple spaces with single space | 2016-08-26 12:45:49 -04:00 |
| Al | 9e89147c83 | [openaddresses] removing spaces in numeric ranges in OpenAddresses, sometimes see things like '12 -23' | 2016-08-26 12:30:15 -04:00 |
| Al | 3b2c86d240 | [fix] strip values in OpenAddresses components | 2016-08-26 10:24:34 -04:00 |
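
Several of the commits listed above describe small value-cleanup rules for OpenAddresses fields: stripping values, collapsing repeated spaces, swapping backticks for apostrophes, dropping null-like values, converting float-formatted postcodes and house numbers, closing up numeric ranges such as '12 -23', and rejecting all-zero postcodes. A minimal sketch of how such rules might be combined is shown below. It is not the repository's code: the function names, the regular expressions, and the exact set of null-like strings are assumptions made for illustration only.

```python
import re

# Hypothetical set of null-like strings, based on the values named in the log.
NULL_LIKE = {"null", "none", "<null>", "unknown"}

MULTI_SPACE = re.compile(r"\s+")
# Optional whitespace around a hyphen that sits between two digits, e.g. "12 -23".
RANGE_SPACES = re.compile(r"(?<=\d)\s*-\s*(?=\d)")
# An integer written as a float, e.g. "90210.0".
FLOAT_INT = re.compile(r"^(\d+)\.0+$")
# Zeroes only, optionally in dash-separated groups (assumed rule):
# "00000", "00-000", "00000-0000".
ALL_ZERO_POSTCODE = re.compile(r"^0+(?:-0+)*$")


def clean_value(value):
    """Strip, collapse whitespace, replace backticks; return None if null-like."""
    if value is None:
        return None
    value = MULTI_SPACE.sub(" ", str(value).strip())
    value = value.replace("`", "'")
    if not value or value.lower() in NULL_LIKE:
        return None
    return value


def clean_number_field(value):
    """Clean a house number or postcode that may arrive as a float or a spaced range."""
    value = clean_value(value)
    if value is None:
        return None
    match = FLOAT_INT.match(value)
    if match:
        # Keep only the integer part. A leading zero already lost upstream
        # (e.g. 2139.0 for "02139") cannot be recovered here.
        value = match.group(1)
    return RANGE_SPACES.sub("-", value)


def clean_postcode(value):
    """Return a cleaned postcode, or None if it is empty or made up only of zeroes."""
    value = clean_number_field(value)
    if value is None or ALL_ZERO_POSTCODE.match(value):
        return None
    return value
```

For example, `clean_number_field("12 -23")` yields `"12-23"`, and `clean_postcode("00000-0000")` yields `None`. Cleaning comes before the validity check here, following the ordering mentioned in the "Handling validation after cleanup" commit.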