Al | ebb34bcc2f | [openaddresses] config option to skip rows missing specific fields | 2016-08-29 19:19:32 -04:00 |
Al | f5b2b6327e | [openaddresses] Using a download script to download the individual OA files of interest rather than the collected file with expansions applied | 2016-08-29 00:34:39 -04:00 |
Al | a0cf6ff225 | [openaddresses] Allowing house numbers like "11 C" | 2016-08-28 19:11:41 -04:00 |
Al | ac403bbe49 | [openaddresses] Adding sin numero validator (sem numero in this case) for Portuguese | 2016-08-28 18:39:19 -04:00 |
Al | 27c5c8536a | [openaddresses] adding debug argument to OpenAddresses training data | 2016-08-28 17:58:41 -04:00 |
Al | 6740e5a1c6 | [fix] var name | 2016-08-28 17:55:10 -04:00 |
Al | 7ea47126ba | [fix] logging | 2016-08-28 15:54:55 -04:00 |
Al | a58194ca2e | [fix] add_admin_boundaries and adding cleaned up house number | 2016-08-28 15:15:57 -04:00 |
Al | 51590825ee | [fix] do component dropout anyway | 2016-08-28 14:07:49 -04:00 |
Al | f69e63e311 | [openaddresses] Place component dropout. Obtain population from OSM components when we have them but otherwise assume it's actually 0 (not unknown), that way the more conservative probabilities will be used i.e. state names will be included more often rather than unqualified cities | 2016-08-28 13:59:28 -04:00 |
Al | dea5fbbf2e | [logging] printing off filenames in constructing OpenAddresses training data | 2016-08-28 12:11:53 -04:00 |
Al | 3da80b0706 | [fix] typo | 2016-08-28 11:55:40 -04:00 |
Al | aa62b8e8b4 | [fix] indentation | 2016-08-28 11:48:27 -04:00 |
Al | b8b1ac1261 | [openaddresses] Handling validation after cleanup, adding per-field regex replacements | 2016-08-28 11:47:30 -04:00 |
Al | 3ae7a15960 | [openaddresses] Adding a few special cases for Spanish. Rewrite simple numeric street names to include the oft-omitted Calle (e.g. 27 => Calle 27), which is uniformly omitted in the Spanish-language data in OpenAddresses while still being valid for grid-based cities like Mérida. Humans and signs usually add Calle for numeric streets while it may be omitted for named streets | 2016-08-27 15:03:23 -04:00 |
Al | 15f9817933 | [openaddresses] Replacing number sign in house number | 2016-08-27 02:42:06 -04:00 |
Al | 01ac1371b5 | [openaddresses] Cleaning up house numbers as well, which can sometimes be stored as floats | 2016-08-27 01:50:05 -04:00 |
Al | 4ed394cc1c | [openaddresses] Omitting fields with the value "unknown" | 2016-08-27 00:46:21 -04:00 |
Al | 6723fff9b4 | [fix] unit phrases | 2016-08-27 00:23:51 -04:00 |
Al | d29e4f3b2e | [openaddresses] Adding optional hyphen between unit number | 2016-08-26 23:46:19 -04:00 |
Al | 8c6a4c763c | [openaddresses] Increasing limit to 3 characters for unit abbreviations in case anything clashes (not a huge issue if a few units are tacked on, but this seems more common in OpenAddresses than OSM) | 2016-08-26 23:43:53 -04:00 |
Al | 12d429b63d | [openaddresses] Simple regex-based method to strip unit phrases tacked onto the end of a street | 2016-08-26 22:39:13 -04:00 |
Al | 318ad2a0c4 | [openaddresses] Removing <Null> tag from values in OpenAddresses, seeing it in Colorado county files | 2016-08-26 21:42:00 -04:00 |
Al | 0f9e8ee95d | [openaddresses] Better handling of float postcodes | 2016-08-26 20:16:04 -04:00 |
Al | 56329439af | [openaddresses] some postcodes in OpenAddresses are stored as floats, convert to int and then to string if that's the case | 2016-08-26 19:12:48 -04:00 |
Al | 2b9d58dcbe | [openaddresses] Ignoring fields with null-like values as well (there appear to be no valid places named Null or None...yet) | 2016-08-26 15:49:36 -04:00 |
Al | 2654683af4 | [openaddresses] Adding quick-and-dirty regex-based exclusion list for fields containing various patterns in OpenAddresses, to be used sparingly | 2016-08-26 15:35:51 -04:00 |
Al | 4e9f9e8957 | [openaddresses] Replace multiple spaces with single space | 2016-08-26 12:45:49 -04:00 |
Al | 9e89147c83 | [openaddresses] removing spaces in numeric ranges in OpenAddresses, sometimes see things like '12 -23' | 2016-08-26 12:30:15 -04:00 |
Al | 3b2c86d240 | [fix] strip values in OpenAddresses components | 2016-08-26 10:24:34 -04:00 |
Al | b2f8180d19 | [openaddresses] Ignore any fields in OpenAddresses which have N/A as a value | 2016-08-25 23:58:38 -04:00 |
Al | c23a7a4030 | [openaddresses] Ditto for numeric boundary names | 2016-08-25 22:58:52 -04:00 |
Al | 34b01e203d | [openaddresses] Don't allow single-letter boundary names as they're probably just typos | 2016-08-25 22:58:26 -04:00 |
Al | 859868aea2 | [openaddresses] Adding option to strip non-digits from postcode, addresses with a postcode and no house_number+street may still be useful, keeping them around as place queries to help with postcode contexts | 2016-08-25 16:36:18 -04:00 |
Al | da619e3cf4 | [osm] Adding border_type=city to override tags | 2016-08-25 15:21:33 -04:00 |
Al | a6dad74a2b | [openaddresses] cleaning comma-delimited boundary components in OpenAddresses data sets | 2016-08-24 15:06:04 -04:00 |
Al | d250f58293 | [openaddresses] Also skipping addresses where street == unit | 2016-08-24 14:10:41 -04:00 |
Al | 7c3ad708d8 | [openaddresses] Ensuring integer house numbers are > 0, street is not simply a numeric token (usually a copy of the house number) and that street != house number generally | 2016-08-24 13:46:56 -04:00 |
Al | b7c600e496 | [openaddresses] adding numeric_postcodes_only and add_osm_neighborhoods options | 2016-08-23 02:11:21 -04:00 |
Al | ed0b49884e | [openaddresses] Changes to OA config utilizing some of the new cleanup options. Adding language to brussels-fr and brussels-nl, adding New York and New Jersey statewide with the understanding that OSM components will be added in NJ and postcodes will be stripped of letters in NY | 2016-08-23 00:38:43 -04:00 |
Al | 8ec288d8f8 | [openaddresses] Adding ability to specify language of a particular OpenAddresses CSV a priori. Unless otherwise specified, non-numeric unit fields will be discarded and phrases will be added randomly for numeric unit fields. | 2016-08-23 00:29:09 -04:00 |
Al | 23be122d2e | [openaddresses] Adding ability to use OSM boundaries for OpenAddresses (not turned on by default), cleaning up street names, requiring at least house number and street, validating house number to provide some assurance that it's not a badly-formatted NULL value, adding ability to strip letters from postcode for data sets like New York's statewide where there are some codes attached. | 2016-08-22 22:09:00 -04:00 |
Al | cec4914233 | [openaddresses] In some OpenAddresses data sets, the house number is just a copy of the street name, so eliminate non-numeric house numbers to be safe | 2016-07-31 01:12:04 -04:00 |
Al | 0bbced4966 | [fix] subdir config in OpenAddresses formatter | 2016-07-21 17:04:57 -04:00 |
Al | 77a4476b8e | [openaddresses] CLDR country names for OpenAddresses training set | 2016-07-21 17:04:57 -04:00 |
Al | 584a4e0ee8 | [openaddresses] Added components via OA config | 2016-07-21 17:04:57 -04:00 |
Al | 55d66af422 | [openaddresses] Adding abbreviated unit | 2016-07-21 17:04:57 -04:00 |
Al | d910c6ca94 | [fix] OpenAddresses formatting | 2016-07-21 17:04:57 -04:00 |
Al | 802a5ee534 | [fix] condition | 2016-07-21 17:04:57 -04:00 |
Al | e6a1d11324 | [fix] validators | 2016-07-21 17:04:57 -04:00 |
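
Several of the commit messages in this history describe small field-cleanup rules for OpenAddresses rows: stripping values, collapsing repeated spaces, dropping null-like values, undoing float-formatted postcodes and house numbers, removing the number sign, closing up numeric ranges such as '12 -23', and prefixing numeric Spanish street names with Calle. The following is a minimal Python sketch of how such rules might look. It is not the repository's code: every name here (`NULL_LIKE`, `clean_value`, `clean_number`, `spanish_street`) is hypothetical, and the set of null-like markers is only what the messages mention. The float handling works on the string rather than converting through `int`, an assumption made here so that leading zeros survive.

```python
import re

# Hypothetical set of null-like markers, collected from the commit messages.
NULL_LIKE = {'null', '<null>', 'none', 'n/a', 'unknown'}

MULTI_SPACE = re.compile(r'\s+')
# Spaces around the hyphen in a numeric range, e.g. '12 -23'.
RANGE_SPACES = re.compile(r'(\d)\s*-\s*(\d)')
# A whole number that was serialized as a float, e.g. '10001.0'.
FLOAT_INT = re.compile(r'^(\d+)\.0+$')
NUMERIC_STREET = re.compile(r'^\d+$')


def clean_value(value):
    """Strip and collapse whitespace; return None for empty or null-like values."""
    value = MULTI_SPACE.sub(' ', value.strip())
    if not value or value.lower() in NULL_LIKE:
        return None
    return value


def clean_number(value):
    """Clean a house number or postcode field (illustrative only)."""
    value = clean_value(value)
    if value is None:
        return None
    match = FLOAT_INT.match(value)
    if match:
        # Drop the spurious '.0' while keeping the digits as written.
        value = match.group(1)
    value = value.replace('#', '').strip()
    value = RANGE_SPACES.sub(r'\1-\2', value)
    return value or None


def spanish_street(street):
    """Prefix a purely numeric street name with 'Calle' (e.g. 27 => Calle 27)."""
    if NUMERIC_STREET.match(street):
        return 'Calle ' + street
    return street


if __name__ == '__main__':
    for raw in ['10001.0', '12 -23', '# 42', ' <Null> ', 'N/A']:
        print(repr(raw), '->', repr(clean_number(raw)))
    print(spanish_street('27'))
```

Running the cleanup before validation, as the "Handling validation after cleanup" commit suggests, means validators only ever see normalized strings or `None`.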