Al
027fbc5afc
[fix] filename
2018-03-11 03:26:37 -04:00
Al
b4a0c79d64
[fix] adding the tempfiles in the datadir, where the user must have permissions for this to work anyway
2018-03-10 22:24:03 -05:00
Al
15cb5f68ad
[fix] making this work with sh... wondering if supporting the few systems that don't have bash is worth losing array functionality.
2018-03-10 22:12:46 -05:00
Al
0c91379424
[build/fix] using GitHub for the model releases rather than the Mapzen S3 buckets, which are no longer working after the shutdown. It requires a little more effort to get the metadata, but downloads should still be just as fast since GitHub releases are on S3 as well. Note: still need to implement the upload piece, but this at least provides a model endpoint for users.
2018-03-10 19:03:30 -05:00
Al
95e483e3ca
[fix] case-insensitive comparison of content-length header in data download script
2018-01-17 17:31:42 -05:00
Al
669e52b329
[build] adding --no-same-owner explicitly when untarring the data files for #267
2017-11-01 20:05:36 -04:00
Al
d2732922c2
[data] deployed model files and training data to CloudFront for easier downloading around the world and in places like China where the Great Firewall may prevent large downloads from abroad. TTL is set to 0, so it still caches the files themselves but checks with origin for the If-Modified-Since headers, allowing the files to be updated dynamically.
2017-04-17 14:11:44 -04:00
Al
7f7aada32a
[build] add another housekeeping file in the datadir for data_version. Blow away the existing files if that file either doesn't exist or doesn't contain a matching version string, to help with upgrades
2017-04-07 17:40:27 -04:00
Al
5a96be5d5c
[fix][ci skip] S3 upload paths in data upload/download script
2017-04-06 00:37:12 -04:00
Al
267be6c05c
[data] 12 worker pool in data download instead of 10 to download the new parser in one shot
2017-03-31 15:52:17 -04:00
Al
a64c81b45b
[data/models] updating libpostal download script to download new models. The simple data files are stored by libpostal major version, whereas the models are stored by the version of the training data they used. A file called "latest" is stored in S3 to indicate the latest version of the model and checked on make
2017-03-31 13:35:07 -04:00
Al
6d4c7984df
[api] doing this now since we're bumping a major version. Using a libpostal prefix for all public header functions and definitions
2017-03-31 03:35:51 -04:00
Brad Hards
fb68e22bbf
[fix] Use UTC date reference to avoid repeating S3 downloads.
Resolves https://github.com/openvenues/libpostal/issues/143
2016-12-26 12:04:02 +11:00
Al
d575caba8a
[data] using UTC for libpostal data files on the Mac version of the download script as well
2016-12-09 19:43:05 -05:00
Al
c3f3896b48
[fix] update test for date function in data download script
2016-12-09 19:29:00 -05:00
Al
01afbf80ef
[data] Each curl process will retry the chunk up to 3 times
2016-08-25 23:18:39 -04:00
Tom Davis
18c8e90eb3
Use xargs to start workers as soon as possible
2016-07-27 17:46:44 -04:00
Tom Davis
11abf6cb22
Use posix sh for systems without bash
2016-07-26 20:17:18 -04:00
Tom Davis
2991ffd193
Don't call download_multipart for 1 chunk
Previously, where a file was larger than `$LARGE_FILE_SIZE` but smaller
than `$CHUNK_SIZE*2`, `download_multipart` would be called but would
only download one (1) chunk that was the whole file.
This fix keeps the same download performance as before but skips the
unnecessary chunk processing.
2016-07-23 16:41:04 -04:00
Tom Davis
24e0314e71
Remove call to seq which may not exist
2016-07-23 01:03:15 -04:00
Al
ad9dfb46bd
[build] Using a process pool with 64MB chunks (similar to aws cli) for S3 downloads. Setting the max concurrent requests to 10, also the default in aws cli.
2016-07-01 14:37:13 -04:00
Al
a9ba61585b
[fix] Adding set -e to data download script so it fails if any subcommands fail
2016-05-04 23:08:06 -04:00
Al
59e5fcd1b4
[fix] LC_ALL=C in data download script
2016-04-11 12:47:50 -04:00
Al
0d7f9f2032
[data] Using UTC dates for libpostal data file tracking for #38. Also silencing curl when checking if file was updated
2016-03-10 16:44:02 -05:00
Al
c0b548833b
[fix] create data dir if it doesn't exist
2016-01-30 13:40:10 -05:00
Al
789db8f582
[build] Adding language classifier to data file download script. As the current file is rather large, added multipart downloads from S3 to speed things up
2016-01-27 03:31:45 -05:00
Al
2950358697
[build] address_parser client now links to libpostal, adding address_parser to download script with an "all" option
2015-12-12 12:49:50 -05:00
Al
6aaa08c220
[fix] Usage on libpostal_data script
2015-10-27 13:33:03 -04:00
Al
588cf1df86
[build] Changing options to libpostal_data script to allow downloading geodb, uploaded first version to S3
2015-10-11 22:25:37 -05:00
Al
91f4e477ad
[fix] typo
2015-10-06 12:04:07 -04:00
Al
abfa744d59
[build] Adding libpostal_data script for downloading data from S3, Makefile uses that now as part of the all-local target. Can be run periodically after install
2015-09-28 17:26:15 -04:00