[parser] Add a chunked shuffle as a C function: it writes each line to one of n randomly chosen files, runs shuf on each file, and concatenates the results. Also add a version that takes a chunk size instead of a number of parts, and use a 2GB limit for address parser training. Allow gshuf again on Mac, since the only problem there seems to have been running out of memory when testing on a Mac laptop. The new limited-memory version should be fast enough.
@@ -5,5 +5,7 @@
#include <stdbool.h>

bool shuffle_file(char *filename);
bool shuffle_file_chunked(char *filename, size_t parts);
bool shuffle_file_chunked_size(char *filename, size_t chunk_size);

#endif
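The commit message describes the approach only in outline, so the following is a minimal sketch of how the two new declarations might behave, not the code from this commit. It makes several assumptions: the `sketch_` names are hypothetical, the input file is overwritten in place, each part is shuffled in memory with Fisher-Yates rather than by running shuf, the chunk-size variant simply derives the number of parts from the file size, and the caller has seeded `rand()`.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Fisher-Yates shuffle of an array of line pointers. */
static void shuffle_lines(char **lines, size_t n) {
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i; /* slight modulo bias; fine for a sketch */
        char *tmp = lines[i - 1];
        lines[i - 1] = lines[j];
        lines[j] = tmp;
    }
}

/* Load one part into memory, shuffle it, and append it to `out`.
 * Assumption: stands in for "runs shuf on each file". */
static bool shuffle_part_into(FILE *part, FILE *out) {
    char **lines = NULL, *line = NULL;
    size_t n = 0, cap = 0, len = 0;
    bool ok = true;
    rewind(part);
    while (getline(&line, &len, part) != -1) {
        if (n == cap) {
            size_t new_cap = cap ? cap * 2 : 64;
            char **grown = realloc(lines, new_cap * sizeof(*grown));
            if (grown == NULL) { ok = false; break; }
            lines = grown;
            cap = new_cap;
        }
        lines[n++] = line; /* keep this buffer */
        line = NULL;       /* so getline allocates a fresh one */
        len = 0;
    }
    free(line);
    if (ok) shuffle_lines(lines, n);
    for (size_t i = 0; ok && i < n; i++) ok = fputs(lines[i], out) != EOF;
    for (size_t i = 0; i < n; i++) free(lines[i]);
    free(lines);
    return ok;
}

/* Hypothetical counterpart of shuffle_file_chunked(filename, parts). */
bool sketch_shuffle_file_chunked(const char *filename, size_t parts) {
    assert(filename != NULL);
    if (parts == 0) return false;
    FILE *in = fopen(filename, "r");
    if (in == NULL) return false;
    FILE **tmp = calloc(parts, sizeof(*tmp));
    bool ok = tmp != NULL;
    for (size_t i = 0; ok && i < parts; i++) ok = (tmp[i] = tmpfile()) != NULL;

    /* Scatter: each line goes to one randomly chosen part. */
    char *line = NULL;
    size_t len = 0;
    ssize_t got;
    while (ok && (got = getline(&line, &len, in)) != -1) {
        FILE *dst = tmp[(size_t)rand() % parts];
        ok = fwrite(line, 1, (size_t)got, dst) == (size_t)got;
        /* Terminate a final unterminated line so parts concatenate cleanly. */
        if (ok && got > 0 && line[got - 1] != '\n') ok = fputc('\n', dst) != EOF;
    }
    free(line);
    fclose(in);

    /* Gather: overwrite the input with the shuffled parts, in order.
     * Assumption: in-place output; a failure here can lose data. */
    FILE *out = ok ? fopen(filename, "w") : NULL;
    ok = ok && out != NULL;
    for (size_t i = 0; ok && i < parts; i++) ok = shuffle_part_into(tmp[i], out);
    if (out != NULL) ok = (fclose(out) == 0) && ok;
    for (size_t i = 0; tmp != NULL && i < parts; i++)
        if (tmp[i] != NULL) fclose(tmp[i]);
    free(tmp);
    return ok;
}

/* Hypothetical counterpart of shuffle_file_chunked_size(filename, chunk_size):
 * pick enough parts that each holds roughly chunk_size bytes. */
bool sketch_shuffle_file_chunked_size(const char *filename, size_t chunk_size) {
    if (chunk_size == 0) return false;
    FILE *f = fopen(filename, "r");
    if (f == NULL) return false;
    long size = fseek(f, 0, SEEK_END) == 0 ? ftell(f) : -1;
    fclose(f);
    if (size < 0) return false;
    size_t parts = ((size_t)size + chunk_size - 1) / chunk_size;
    return sketch_shuffle_file_chunked(filename, parts ? parts : 1);
}
```

Because lines are scattered at random, a part can end up larger than `chunk_size`, so the size is a target rather than a hard bound on memory. `ftell` returns a `long`, which on some platforms cannot represent files larger than 2GB; a real implementation would likely use `ftello` or `stat`.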