Al | 78a210c409 | [openaddresses] replacing backticks with apostrophe, comes up in several countries | 2016-08-29 21:42:10 -04:00 |
Al | 3f5b3dcb1d | [openaddresses] Allowing slashes in house numbers in OpenAddresses | 2016-08-29 21:26:33 -04:00 |
Al | ebb34bcc2f | [openaddresses] config option to skip rows missing specific fields | 2016-08-29 19:19:32 -04:00 |
Al | 9b9036243c | [fix] overwrite on unzip, logging | 2016-08-29 00:40:11 -04:00 |
Al | 5b5af04a44 | [fix] redundant line | 2016-08-29 00:37:17 -04:00 |
Al | 6284ec39db | [fix] name | 2016-08-29 00:36:45 -04:00 |
Al | 75ece5f5e9 | [fix] import | 2016-08-29 00:36:22 -04:00 |
Al | f5b2b6327e | [openaddresses] Using a download script to download the individual OA files of interest rather than the collected file with expansions applied | 2016-08-29 00:34:39 -04:00 |
Al | 4d36e2553a | [utils] Using curl with redirects and retries for download_file | 2016-08-29 00:32:29 -04:00 |
Al | a0cf6ff225 | [openaddresses] Allowing house numbers like "11 C" | 2016-08-28 19:11:41 -04:00 |
Al | ac403bbe49 | [openaddresses] Adding sin numero validator (sem numero in this case) for Portuguese | 2016-08-28 18:39:19 -04:00 |
Al | 27c5c8536a | [openaddresses] adding debug argument to OpenAddresses training data | 2016-08-28 17:58:41 -04:00 |
Al | 6740e5a1c6 | [fix] var name | 2016-08-28 17:55:10 -04:00 |
Al | 7ea47126ba | [fix] logging | 2016-08-28 15:54:55 -04:00 |
Al | a58194ca2e | [fix] add_admin_boundaries and adding cleaned up house number | 2016-08-28 15:15:57 -04:00 |
Al | bae04eb543 | [fix] int | 2016-08-28 14:11:25 -04:00 |
Al | de0a7bfe4f | [fix] /or/and/ | 2016-08-28 14:09:30 -04:00 |
Al | 51590825ee | [fix] do component dropout anyway | 2016-08-28 14:07:49 -04:00 |
Al | 44e59e8daf | [fix] return the original for already abbreviated tokens | 2016-08-28 14:05:58 -04:00 |
Al | f69e63e311 | [openaddresses] Place component dropout. Obtain population from OSM components when we have them but otherwise assume it's actually 0 (not unknown), that way the more conservative probabilities will be used i.e. state names will be included more often rather than unqualified cities | 2016-08-28 13:59:28 -04:00 |
Al | dea5fbbf2e | [logging] printing off filenames in constructing OpenAddresses training data | 2016-08-28 12:11:53 -04:00 |
Al | 3cf3e401db | [fix] abbreviation recasing | 2016-08-28 12:04:36 -04:00 |
Al | 3da80b0706 | [fix] typo | 2016-08-28 11:55:40 -04:00 |
Al | aa62b8e8b4 | [fix] indentation | 2016-08-28 11:48:27 -04:00 |
Al | b8b1ac1261 | [openaddresses] Handling validation after cleanup, adding per-field regex replacements | 2016-08-28 11:47:30 -04:00 |
Al | 3ae7a15960 | [openaddresses] Adding a few special cases for Spanish. Rewrite simple numeric street names to include the oft-omitted Calle (e.g. 27 => Calle 27), which is uniformly omitted in the Spanish-language data in OpenAddresses while still being valid for grid-based cities like Mérida. Humans and signs usually add Calle for numeric streets while it may be omitted for named streets | 2016-08-27 15:03:23 -04:00 |
Al | 15f9817933 | [openaddresses] Replacing number sign in house number | 2016-08-27 02:42:06 -04:00 |
Al | 01ac1371b5 | [openaddresses] Cleaning up house numbers as well, which can sometimes be stored as floats | 2016-08-27 01:50:05 -04:00 |
Al | 4ed394cc1c | [openaddresses] Omitting fields with the value "unknown" | 2016-08-27 00:46:21 -04:00 |
Al | 6723fff9b4 | [fix] unit phrases | 2016-08-27 00:23:51 -04:00 |
Al | d29e4f3b2e | [openaddresses] Adding optional hyphen between unit number | 2016-08-26 23:46:19 -04:00 |
Al | 8c6a4c763c | [openaddresses] Increasing limit to 3 characters for unit abbreviations in case anything clashes (not a huge issue if a few units are tacked on, but this seems more common in OpenAddresses than OSM) | 2016-08-26 23:43:53 -04:00 |
Al | 12d429b63d | [openaddresses] Simple regex-based method to strip unit phrases tacked onto the end of a street | 2016-08-26 22:39:13 -04:00 |
Al | 318ad2a0c4 | [openaddresses] Removing <Null> tag from values in OpenAddresses, seeing it in Colorado county files | 2016-08-26 21:42:00 -04:00 |
Al | 0f9e8ee95d | [openaddresses] Better handling of float postcodes | 2016-08-26 20:16:04 -04:00 |
Al | 56329439af | [openaddresses] some postcodes in OpenAddresses are stored as floats, convert to int and then to string if that's the case | 2016-08-26 19:12:48 -04:00 |
Al | 2b9d58dcbe | [openaddresses] Ignoring fields with null-like values as well (there appear to be no valid places named Null or None...yet) | 2016-08-26 15:49:36 -04:00 |
Al | 2654683af4 | [openaddresses] Adding quick-and-dirty regex-based exclusion list for fields containing various patterns in OpenAddresses, to be used sparingly | 2016-08-26 15:35:51 -04:00 |
Al | 4e9f9e8957 | [openaddresses] Replace multiple spaces with single space | 2016-08-26 12:45:49 -04:00 |
Al | 9e89147c83 | [openaddresses] removing spaces in numeric ranges in OpenAddresses, sometimes see things like '12 -23' | 2016-08-26 12:30:15 -04:00 |
Al | 3b2c86d240 | [fix] strip values in OpenAddresses components | 2016-08-26 10:24:34 -04:00 |
Al | b2f8180d19 | [openaddresses] Ignore any fields in OpenAddresses which have N/A as a value | 2016-08-25 23:58:38 -04:00 |
Al | c23a7a4030 | [openaddresses] Ditto for numeric boundary names | 2016-08-25 22:58:52 -04:00 |
Al | 34b01e203d | [openaddresses] Don't allow single-letter boundary names as they're probably just typos | 2016-08-25 22:58:26 -04:00 |
Al | 859868aea2 | [openaddresses] Adding option to strip non-digits from postcode, addresses with a postcode and no house_number+street may still be useful, keeping them around as place queries to help with postcode contexts | 2016-08-25 16:36:18 -04:00 |
Al | da619e3cf4 | [osm] Adding border_type=city to override tags | 2016-08-25 15:21:33 -04:00 |
Al | dd0ca5e008 | [addresses] Adding admin_center properties to place components in add_admin_boundaries (only overriding for specified areas where the boundary may otherwise not have all the properties) | 2016-08-25 01:20:06 -04:00 |
Al | 2e7f8f1ae7 | [abbreviations] Adding toponyms gazetteer for probabilistically abbreviating things like Mount=>Mt, Saint=>St, Fort=>Ft in place names | 2016-08-24 18:52:00 -04:00 |
Al | dfa5c8e0a6 | [abbreviations] Adding ability to abbreviate within hyphenated phrases e.g. Sint-Maarten => St.-Maarten | 2016-08-24 18:50:24 -04:00 |
Al | a6dad74a2b | [openaddresses] cleaning comma-delimited boundary components in OpenAddresses data sets | 2016-08-24 15:06:04 -04:00 |