[geodb] Adding sparkey dependency

Al
2015-07-09 15:26:11 -04:00
parent 4f1b4756d0
commit fbef0a15fe
24 changed files with 4134 additions and 0 deletions

201
src/sparkey/LICENSE Normal file

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2012-2013 Spotify AB
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

349
src/sparkey/MurmurHash3.c Normal file

@@ -0,0 +1,349 @@
//-----------------------------------------------------------------------------
// MurmurHash3 was written by Austin Appleby, and is placed in the public
// domain. The author hereby disclaims copyright to this source code.
// Note - The x86 and x64 versions do _not_ produce the same results, as the
// algorithms are optimized for their respective platforms. You can still
// compile and run any of them on any platform, but your performance with the
// non-native version will be less than optimal.
#include "MurmurHash3.h"
#include "endiantools.h"
//-----------------------------------------------------------------------------
// Platform-specific functions and macros
// Microsoft Visual Studio
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#include <stdlib.h>
#define ROTL32(x,y) _rotl(x,y)
#define ROTL64(x,y) _rotl64(x,y)
#define BIG_CONSTANT(x) (x)
// Other compilers
#else // defined(_MSC_VER)
#define FORCE_INLINE __attribute__((always_inline)) inline
static inline uint32_t rotl32 ( uint32_t x, int8_t r )
{
return (x << r) | (x >> (32 - r));
}
static inline uint64_t rotl64 ( uint64_t x, int8_t r )
{
return (x << r) | (x >> (64 - r));
}
#define ROTL32(x,y) rotl32(x,y)
#define ROTL64(x,y) rotl64(x,y)
#define BIG_CONSTANT(x) (x##LLU)
#endif // !defined(_MSC_VER)
//-----------------------------------------------------------------------------
// Block read - if your platform needs to do endian-swapping or can only
// handle aligned reads, do the conversion here
FORCE_INLINE uint32_t getblock32 ( const uint32_t * p, int i )
{
return read_little_endian32((uint8_t *) p, 4*i);
}
FORCE_INLINE uint64_t getblock64 ( const uint64_t * p, int i )
{
return read_little_endian64((uint8_t *) p, 8*i);
}
//-----------------------------------------------------------------------------
// Finalization mix - force all bits of a hash block to avalanche
FORCE_INLINE uint32_t fmix32 ( uint32_t h )
{
h ^= h >> 16;
h *= 0x85ebca6b;
h ^= h >> 13;
h *= 0xc2b2ae35;
h ^= h >> 16;
return h;
}
//----------
FORCE_INLINE uint64_t fmix64 ( uint64_t k )
{
k ^= k >> 33;
k *= BIG_CONSTANT(0xff51afd7ed558ccd);
k ^= k >> 33;
k *= BIG_CONSTANT(0xc4ceb9fe1a85ec53);
k ^= k >> 33;
return k;
}
//-----------------------------------------------------------------------------
void MurmurHash3_x86_32 ( const void * key, int len,
uint32_t seed, void * out )
{
const uint8_t * data = (const uint8_t*)key;
const int nblocks = len / 4;
uint32_t h1 = seed;
const uint32_t c1 = 0xcc9e2d51;
const uint32_t c2 = 0x1b873593;
//----------
// body
const uint32_t * blocks = (const uint32_t *)(data + nblocks*4);
for(int i = -nblocks; i; i++)
{
uint32_t k1 = getblock32(blocks,i);
k1 *= c1;
k1 = ROTL32(k1,15);
k1 *= c2;
h1 ^= k1;
h1 = ROTL32(h1,13);
h1 = h1*5+0xe6546b64;
}
//----------
// tail
const uint8_t * tail = (const uint8_t*)(data + nblocks*4);
uint32_t k1 = 0;
switch(len & 3)
{
case 3: k1 ^= tail[2] << 16;
case 2: k1 ^= tail[1] << 8;
case 1: k1 ^= tail[0];
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
};
//----------
// finalization
h1 ^= len;
h1 = fmix32(h1);
*(uint32_t*)out = h1;
}
//-----------------------------------------------------------------------------
void MurmurHash3_x86_128 ( const void * key, const int len,
uint32_t seed, void * out )
{
const uint8_t * data = (const uint8_t*)key;
const int nblocks = len / 16;
uint32_t h1 = seed;
uint32_t h2 = seed;
uint32_t h3 = seed;
uint32_t h4 = seed;
const uint32_t c1 = 0x239b961b;
const uint32_t c2 = 0xab0e9789;
const uint32_t c3 = 0x38b34ae5;
const uint32_t c4 = 0xa1e38b93;
//----------
// body
const uint32_t * blocks = (const uint32_t *)(data + nblocks*16);
for(int i = -nblocks; i; i++)
{
uint32_t k1 = getblock32(blocks,i*4+0);
uint32_t k2 = getblock32(blocks,i*4+1);
uint32_t k3 = getblock32(blocks,i*4+2);
uint32_t k4 = getblock32(blocks,i*4+3);
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
h1 = ROTL32(h1,19); h1 += h2; h1 = h1*5+0x561ccd1b;
k2 *= c2; k2 = ROTL32(k2,16); k2 *= c3; h2 ^= k2;
h2 = ROTL32(h2,17); h2 += h3; h2 = h2*5+0x0bcaa747;
k3 *= c3; k3 = ROTL32(k3,17); k3 *= c4; h3 ^= k3;
h3 = ROTL32(h3,15); h3 += h4; h3 = h3*5+0x96cd1c35;
k4 *= c4; k4 = ROTL32(k4,18); k4 *= c1; h4 ^= k4;
h4 = ROTL32(h4,13); h4 += h1; h4 = h4*5+0x32ac3b17;
}
//----------
// tail
const uint8_t * tail = (const uint8_t*)(data + nblocks*16);
uint32_t k1 = 0;
uint32_t k2 = 0;
uint32_t k3 = 0;
uint32_t k4 = 0;
  switch(len & 15)
  {
  case 15: k4 ^= tail[14] << 16;
  case 14: k4 ^= tail[13] << 8;
  case 13: k4 ^= tail[12] << 0;
           k4 *= c4; k4 = ROTL32(k4,18); k4 *= c1; h4 ^= k4;
  // The casts below keep the 24-bit shifts unsigned; shifting a promoted
  // int into the sign bit is undefined behavior in C.
  case 12: k3 ^= (uint32_t)tail[11] << 24;
  case 11: k3 ^= tail[10] << 16;
  case 10: k3 ^= tail[ 9] << 8;
  case 9: k3 ^= tail[ 8] << 0;
           k3 *= c3; k3 = ROTL32(k3,17); k3 *= c4; h3 ^= k3;
  case 8: k2 ^= (uint32_t)tail[ 7] << 24;
  case 7: k2 ^= tail[ 6] << 16;
  case 6: k2 ^= tail[ 5] << 8;
  case 5: k2 ^= tail[ 4] << 0;
           k2 *= c2; k2 = ROTL32(k2,16); k2 *= c3; h2 ^= k2;
  case 4: k1 ^= (uint32_t)tail[ 3] << 24;
  case 3: k1 ^= tail[ 2] << 16;
  case 2: k1 ^= tail[ 1] << 8;
  case 1: k1 ^= tail[ 0] << 0;
           k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
  };
//----------
// finalization
h1 ^= len; h2 ^= len; h3 ^= len; h4 ^= len;
h1 += h2; h1 += h3; h1 += h4;
h2 += h1; h3 += h1; h4 += h1;
h1 = fmix32(h1);
h2 = fmix32(h2);
h3 = fmix32(h3);
h4 = fmix32(h4);
h1 += h2; h1 += h3; h1 += h4;
h2 += h1; h3 += h1; h4 += h1;
((uint32_t*)out)[0] = h1;
((uint32_t*)out)[1] = h2;
((uint32_t*)out)[2] = h3;
((uint32_t*)out)[3] = h4;
}
//-----------------------------------------------------------------------------
void MurmurHash3_x64_128 ( const void * key, const int len,
const uint32_t seed, void * out )
{
const uint8_t * data = (const uint8_t*)key;
const int nblocks = len / 16;
uint64_t h1 = seed;
uint64_t h2 = seed;
const uint64_t c1 = BIG_CONSTANT(0x87c37b91114253d5);
const uint64_t c2 = BIG_CONSTANT(0x4cf5ad432745937f);
//----------
// body
const uint64_t * blocks = (const uint64_t *)(data);
for(int i = 0; i < nblocks; i++)
{
uint64_t k1 = getblock64(blocks,i*2+0);
uint64_t k2 = getblock64(blocks,i*2+1);
k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;
h1 = ROTL64(h1,27); h1 += h2; h1 = h1*5+0x52dce729;
k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;
h2 = ROTL64(h2,31); h2 += h1; h2 = h2*5+0x38495ab5;
}
//----------
// tail
const uint8_t * tail = (const uint8_t*)(data + nblocks*16);
uint64_t k1 = 0;
uint64_t k2 = 0;
switch(len & 15)
{
case 15: k2 ^= ((uint64_t)tail[14]) << 48;
case 14: k2 ^= ((uint64_t)tail[13]) << 40;
case 13: k2 ^= ((uint64_t)tail[12]) << 32;
case 12: k2 ^= ((uint64_t)tail[11]) << 24;
case 11: k2 ^= ((uint64_t)tail[10]) << 16;
case 10: k2 ^= ((uint64_t)tail[ 9]) << 8;
case 9: k2 ^= ((uint64_t)tail[ 8]) << 0;
k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;
case 8: k1 ^= ((uint64_t)tail[ 7]) << 56;
case 7: k1 ^= ((uint64_t)tail[ 6]) << 48;
case 6: k1 ^= ((uint64_t)tail[ 5]) << 40;
case 5: k1 ^= ((uint64_t)tail[ 4]) << 32;
case 4: k1 ^= ((uint64_t)tail[ 3]) << 24;
case 3: k1 ^= ((uint64_t)tail[ 2]) << 16;
case 2: k1 ^= ((uint64_t)tail[ 1]) << 8;
case 1: k1 ^= ((uint64_t)tail[ 0]) << 0;
k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;
};
//----------
// finalization
h1 ^= len; h2 ^= len;
h1 += h2;
h2 += h1;
h1 = fmix64(h1);
h2 = fmix64(h2);
h1 += h2;
h2 += h1;
((uint64_t*)out)[0] = h1;
((uint64_t*)out)[1] = h2;
}
//-----------------------------------------------------------------------------
uint64_t murmurhash32_hash(const uint8_t *buf, uint64_t len, uint32_t seed) {
uint32_t res;
MurmurHash3_x86_32(buf, len, seed, &res);
return res;
}
uint64_t murmurhash64_hash(const uint8_t *buf, uint64_t len, uint32_t seed) {
uint64_t res[2];
MurmurHash3_x64_128(buf, len, seed, res);
return res[0];
}
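As a small illustration of the finalization mix in this file, the standalone sketch below reproduces `fmix32` and applies it the way `MurmurHash3_x86_32` would for an empty key, where there are no blocks and no tail bytes, so only the seed and the length reach the finalizer. The helper name `hash_empty_key` is hypothetical and is not part of sparkey or MurmurHash3.

```c
#include <assert.h>
#include <stdint.h>

/* Same finalization mix as fmix32 in MurmurHash3.c. */
static uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical helper: the x86_32 hash of a zero-length key.
 * With len == 0 the body and tail are skipped, so the result
 * is just the finalizer applied to (seed ^ len). */
static uint32_t hash_empty_key(uint32_t seed) {
    uint32_t h1 = seed;
    h1 ^= 0u; /* h1 ^= len, with len == 0 */
    return fmix32(h1);
}
```

Note that the finalizer maps 0 to 0, so an empty key hashed with seed 0 yields 0; a nonzero seed avoids that.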

35
src/sparkey/MurmurHash3.h Normal file

@@ -0,0 +1,35 @@
//-----------------------------------------------------------------------------
// MurmurHash3 was written by Austin Appleby, and is placed in the public
// domain. The author hereby disclaims copyright to this source code.
#ifndef _MURMURHASH3_H_
#define _MURMURHASH3_H_
//-----------------------------------------------------------------------------
// Platform-specific functions and macros
// Microsoft Visual Studio
#if defined(_MSC_VER)
typedef unsigned char uint8_t;
typedef unsigned long uint32_t;
typedef unsigned __int64 uint64_t;
// Other compilers
#else // defined(_MSC_VER)
#include <stdint.h>
#endif // !defined(_MSC_VER)
//-----------------------------------------------------------------------------
uint64_t murmurhash32_hash(const uint8_t *buf, uint64_t len, uint32_t seed);
uint64_t murmurhash64_hash(const uint8_t *buf, uint64_t len, uint32_t seed);
//-----------------------------------------------------------------------------
#endif // _MURMURHASH3_H_

77
src/sparkey/buf.c Normal file

@@ -0,0 +1,77 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stddef.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include "util.h"
#include "endiantools.h"
#include "buf.h"
sparkey_returncode buf_init(sparkey_buf *buf, ptrdiff_t size) {
  buf->start = malloc(size);
  if (buf->start == NULL) {
    return SPARKEY_INTERNAL_ERROR;
  }
  buf->cur = buf->start;
  buf->end = buf->start + size;
  return SPARKEY_SUCCESS;
}

void buf_close(sparkey_buf *buf) {
  free(buf->start);
  buf->start = NULL;
  buf->cur = NULL;
  buf->end = NULL;
}

uint64_t buf_size(sparkey_buf *buf) {
  return buf->end - buf->start;
}

uint64_t buf_remaining(sparkey_buf *buf) {
  return buf->end - buf->cur;
}

uint64_t buf_used(sparkey_buf *buf) {
  return buf->cur - buf->start;
}

sparkey_returncode buf_flushfile(sparkey_buf *buf, int fd) {
  RETHROW(write_full(fd, buf->start, buf_used(buf)));
  buf->cur = buf->start;
  return SPARKEY_SUCCESS;
}

sparkey_returncode buf_add(sparkey_buf *buf, int fd, const uint8_t *data, ptrdiff_t len) {
  while (1) {
    ptrdiff_t remaining = buf_remaining(buf);
    if (remaining >= len) {
      memcpy(buf->cur, data, len);
      buf->cur += len;
      return SPARKEY_SUCCESS;
    } else {
      memcpy(buf->cur, data, remaining);
      buf->cur += remaining;
      data += remaining;
      len -= remaining;
      RETHROW(buf_flushfile(buf, fd));
    }
  }
  return SPARKEY_SUCCESS;
}
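The fill-then-flush pattern used by `buf_add` can be tried in isolation with the sketch below. It is not sparkey code: the types `sink` and `small_buf` and the functions `small_buf_flush` and `small_buf_add` are hypothetical, and an in-memory array stands in for the file descriptor so the behavior can be checked without touching the file system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory sink standing in for the file descriptor. */
typedef struct {
    uint8_t data[64];
    size_t len;
} sink;

/* Hypothetical fixed-size staging buffer, loosely analogous to sparkey_buf. */
typedef struct {
    uint8_t bytes[8];
    size_t used;
    size_t flushes; /* counts flushes, for illustration only */
} small_buf;

/* Move everything staged so far into the sink and reset the buffer. */
static void small_buf_flush(small_buf *b, sink *out) {
    assert(out->len + b->used <= sizeof(out->data));
    memcpy(out->data + out->len, b->bytes, b->used);
    out->len += b->used;
    b->used = 0;
    b->flushes++;
}

/* Append len bytes; whenever the buffer fills up, flush it and continue. */
static void small_buf_add(small_buf *b, sink *out, const uint8_t *data, size_t len) {
    for (;;) {
        size_t remaining = sizeof(b->bytes) - b->used;
        if (remaining >= len) {
            memcpy(b->bytes + b->used, data, len);
            b->used += len;
            return;
        }
        memcpy(b->bytes + b->used, data, remaining);
        b->used += remaining;
        data += remaining;
        len -= remaining;
        small_buf_flush(b, out);
    }
}
```

As in `buf_add`, data that does not fill the buffer stays staged until an explicit flush, so a caller has to flush once more before closing the output.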

48
src/sparkey/buf.h Normal file

@@ -0,0 +1,48 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef BUF_H_INCLUDED
#define BUF_H_INCLUDED
#include <stddef.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include "util.h"
#include "endiantools.h"
typedef struct {
uint8_t *start;
uint8_t *cur;
uint8_t *end;
} sparkey_buf;
sparkey_returncode buf_init(sparkey_buf *buf, ptrdiff_t size);
void buf_close(sparkey_buf *buf);
uint64_t buf_size(sparkey_buf *buf);
uint64_t buf_remaining(sparkey_buf *buf);
uint64_t buf_used(sparkey_buf *buf);
sparkey_returncode buf_flushfile(sparkey_buf *buf, int fd);
sparkey_returncode buf_add(sparkey_buf *buf, int fd, const uint8_t *data, ptrdiff_t len);
#endif

141
src/sparkey/endiantools.c Normal file

@@ -0,0 +1,141 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#if defined(__linux)
#include <byteswap.h>
#elif defined(__APPLE__)
#include <libkern/OSByteOrder.h>
#define bswap_32 OSSwapInt32
#define bswap_64 OSSwapInt64
#else
#error "no byteswap.h or libkern/OSByteOrder.h"
#endif
#include <stddef.h>
#include <errno.h>
#include <string.h>
#include <inttypes.h>
#include "util.h"
#include "endiantools.h"
#include "sparkey.h"
static sparkey_returncode _write_full(int fd, uint8_t *buf, size_t count) {
  while (count > 0) {
    ssize_t actual = write(fd, buf, count);
    if (actual < 0) {
      switch (errno) {
      case EINTR:
      case EAGAIN: continue;
      case ENOSPC: return SPARKEY_OUT_OF_DISK;
      case EFBIG: return SPARKEY_FILE_SIZE_EXCEEDED;
      case EBADF: return SPARKEY_FILE_CLOSED;
      default:
        /* actual is negative here, so print it as a signed value. */
        fprintf(stderr, "_write_full():%d bug: actual_written = %"PRId64", wanted = %"PRIu64", errno = %d\n", __LINE__, (int64_t)actual, (uint64_t)count, errno);
        return SPARKEY_INTERNAL_ERROR;
      }
    }
    count -= actual;
    buf += actual;
  }
  return SPARKEY_SUCCESS;
}

/* Splits the write into blocks of at most 256 MiB. */
sparkey_returncode write_full(int fd, uint8_t *buf, size_t count) {
  const size_t block_size = 256*1024*1024;
  size_t fullruns = count / block_size;
  while (fullruns > 0) {
    RETHROW(_write_full(fd, buf, block_size));
    buf += block_size;
    fullruns--;
  }
  return _write_full(fd, buf, count % block_size);
}
void write_little_endian32(uint8_t *buf, uint32_t value) {
buf[0] = (value >> 0) & 0xFF;
buf[1] = (value >> 8) & 0xFF;
buf[2] = (value >> 16) & 0xFF;
buf[3] = (value >> 24) & 0xFF;
}
sparkey_returncode fwrite_little_endian32(int fd, uint32_t value) {
uint8_t buf[4];
write_little_endian32(buf, value);
return write_full(fd, buf, 4);
}
void write_little_endian64(uint8_t *buf, uint64_t value) {
buf[0] = (value >> 0) & 0xFF;
buf[1] = (value >> 8) & 0xFF;
buf[2] = (value >> 16) & 0xFF;
buf[3] = (value >> 24) & 0xFF;
buf[4] = (value >> 32) & 0xFF;
buf[5] = (value >> 40) & 0xFF;
buf[6] = (value >> 48) & 0xFF;
buf[7] = (value >> 56) & 0xFF;
}
sparkey_returncode fwrite_little_endian64(int fd, uint64_t value) {
uint8_t buf[8];
write_little_endian64(buf, value);
return write_full(fd, buf, 8);
}
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ || defined(__LITTLE_ENDIAN) || defined(__LITTLE_ENDIAN__)
uint32_t read_little_endian32(const uint8_t * array, uint64_t pos) {
return *((uint32_t*)(array + pos));
}
uint64_t read_little_endian64(const uint8_t * array, uint64_t pos) {
return *((uint64_t*)(array + pos));
}
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ || defined(__BIG_ENDIAN) || defined(__BIG_ENDIAN__)
uint32_t read_little_endian32(const uint8_t * array, uint64_t pos) {
return bswap_32(*((uint32_t*)(array + pos)));
}
uint64_t read_little_endian64(const uint8_t * array, uint64_t pos) {
return bswap_64(*((uint64_t*)(array + pos)));
}
#else
#error "none of __LITTLE_ENDIAN, __LITTLE_ENDIAN__, __BIG_ENDIAN, __BIG_ENDIAN__ is defined"
#endif
sparkey_returncode correct_endian_platform() {
return SPARKEY_SUCCESS;
}
sparkey_returncode fread_little_endian32(FILE *fp, uint32_t *res) {
uint8_t data[4];
int count = fread(data, 4, 1, fp);
if (count < 1) {
return SPARKEY_UNEXPECTED_EOF;
}
*res = read_little_endian32(data, 0);
return SPARKEY_SUCCESS;
}
sparkey_returncode fread_little_endian64(FILE *fp, uint64_t *res) {
uint8_t data[8];
int count = fread(data, 8, 1, fp);
if (count < 1) {
return SPARKEY_UNEXPECTED_EOF;
}
*res = read_little_endian64(data, 0);
return SPARKEY_SUCCESS;
}
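The `read_little_endian32` and `read_little_endian64` functions above dereference a cast pointer, which relies on the platform tolerating unaligned loads. A byte-wise alternative, in the spirit of the "Block read" note in MurmurHash3.c, might look like the sketch below. The names `put_le32`, `get_le32` and `get_le64` are hypothetical and not part of sparkey; the code assembles the value from individual bytes, so it needs neither alignment nor a compile-time endianness check.

```c
#include <assert.h>
#include <stdint.h>

/* Store value at buf[0..3], least significant byte first. */
static void put_le32(uint8_t *buf, uint32_t value) {
    buf[0] = (uint8_t)(value >> 0);
    buf[1] = (uint8_t)(value >> 8);
    buf[2] = (uint8_t)(value >> 16);
    buf[3] = (uint8_t)(value >> 24);
}

/* Read a 32 bit little endian value at byte offset pos, one byte at a time. */
static uint32_t get_le32(const uint8_t *array, uint64_t pos) {
    const uint8_t *p = array + pos;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 64 bit little endian value at byte offset pos, one byte at a time. */
static uint64_t get_le64(const uint8_t *array, uint64_t pos) {
    const uint8_t *p = array + pos;
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | p[i];
    }
    return value;
}
```

Whether this is as fast as the pointer cast depends on the compiler; the point of the sketch is only that it behaves the same on any byte order and at any offset.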

79
src/sparkey/endiantools.h Normal file

@@ -0,0 +1,79 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef ENDIANTOOLS_H_INCLUDED
#define ENDIANTOOLS_H_INCLUDED
#include <stdio.h>
#include <unistd.h>
#include <stdint.h>
#include "sparkey.h"
typedef union {
uint32_t i;
uint8_t c[4];
} endian_union;
/**
* Writes count bytes of buf to a file with file descriptor fd
* @param fd file descriptor of a file to write to.
* @param buf bytes to write to file.
* Must point to a block of memory at least count long.
* @param count number of bytes to write.
* @returns SPARKEY_SUCCESS if all goes well, otherwise a sparkey error code.
*/
sparkey_returncode write_full(int fd, uint8_t *buf, size_t count);
/**
* Write a 32 bit value to buf in little endian.
* @param buf buf to write to. Must be at least 4 bytes long.
* @param value the value to write.
*/
void write_little_endian32(uint8_t *buf, uint32_t value);
/**
* Write a 32 bit value to file in little endian.
* @param fd file descriptor of file open for write.
* @param value the value to write.
* @returns SPARKEY_SUCCESS if all goes well, a sparkey error if the
* writing to file fails.
*/
sparkey_returncode fwrite_little_endian32(int fd, uint32_t value);
/**
* Write a 64 bit value to buf in little endian.
* @param buf buf to write to. Must be at least 8 bytes long.
* @param value the value to write.
*/
void write_little_endian64(uint8_t *buf, uint64_t value);
/**
* Write a 64 bit value to file in little endian.
* @param fd file descriptor of file open for write.
* @param value the value to write.
* @returns SPARKEY_SUCCESS if all goes well, a sparkey error if the
* writing to file fails.
*/
sparkey_returncode fwrite_little_endian64(int fd, uint64_t value);
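/* Read a 32 or 64 bit little endian value from array at byte offset pos. */
uint32_t read_little_endian32(const uint8_t * array, uint64_t pos);
uint64_t read_little_endian64(const uint8_t * array, uint64_t pos);
sparkey_returncode correct_endian_platform();
/* Read a 32 or 64 bit little endian value from fp into res.
 * Returns SPARKEY_UNEXPECTED_EOF if the value could not be read in full. */
sparkey_returncode fread_little_endian32(FILE *fp, uint32_t *res);
sparkey_returncode fread_little_endian64(FILE *fp, uint64_t *res);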
#endif /* ENDIANTOOLS_H_INCLUDED */
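The contract documented for `write_full` (write all `count` bytes or report an error) rests on the fact that POSIX `write` may write fewer bytes than requested or be interrupted. A minimal sketch of that retry loop follows, assuming a POSIX system. The name `write_all` and its 0 / -1 return convention are hypothetical and simpler than the sparkey return codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until count bytes are written.
 * Returns 0 on success, -1 on an error that is not retried (errno stays set). */
static int write_all(int fd, const uint8_t *buf, size_t count) {
    while (count > 0) {
        ssize_t n = write(fd, buf, count);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN) {
                continue; /* transient condition: try the same range again */
            }
            return -1;
        }
        /* A short write is not an error: advance and write the rest. */
        buf += n;
        count -= (size_t)n;
    }
    return 0;
}
```

Retrying immediately on `EAGAIN` mirrors the loop in endiantools.c; on a non-blocking descriptor this spins until the descriptor becomes writable, which may or may not be acceptable for a given caller.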


@@ -0,0 +1,50 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include "hashalgorithms.h"
#include "MurmurHash3.h"
static uint64_t _read_little_endian32(const uint8_t *data, uint64_t pos) {
  return read_little_endian32(data, pos);
}

static void _write_little_endian32(uint8_t *data, uint64_t hash) {
  write_little_endian32(data, hash);
}

static sparkey_hash_algorithm murmurhash32 = {
  &murmurhash32_hash,
  &_read_little_endian32,
  &_write_little_endian32
};

static sparkey_hash_algorithm murmurhash64 = {
  &murmurhash64_hash,
  &read_little_endian64,
  &write_little_endian64
};

static sparkey_hash_algorithm invalid = {
  NULL, NULL, NULL
};

sparkey_hash_algorithm sparkey_get_hash_algorithm(uint32_t hash_size) {
  switch (hash_size) {
  case 4: return murmurhash32;
  case 8: return murmurhash64;
  default: return invalid;
  }
}
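`sparkey_get_hash_algorithm` returns a struct of function pointers selected by hash size, and an all-`NULL` struct for unsupported sizes, so callers need to check a pointer before calling through it. The standalone sketch below shows the same dispatch shape. Everything in it is hypothetical: `toy_algorithm`, `toy_get_algorithm` and the two stand-in hash functions are illustrative only and are not MurmurHash3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of a hash algorithm table entry. */
typedef struct {
    uint64_t (*hash)(const uint8_t *data, uint64_t len, uint32_t seed);
    uint32_t hash_size;
} toy_algorithm;

/* Stand-in 32 bit hash (a simple multiplicative hash, NOT MurmurHash3). */
static uint64_t toy_hash32(const uint8_t *data, uint64_t len, uint32_t seed) {
    uint32_t h = seed;
    for (uint64_t i = 0; i < len; i++) {
        h = h * 31u + data[i];
    }
    return h;
}

/* Stand-in 64 bit hash (same scheme with a wider accumulator). */
static uint64_t toy_hash64(const uint8_t *data, uint64_t len, uint32_t seed) {
    uint64_t h = seed;
    for (uint64_t i = 0; i < len; i++) {
        h = h * 31u + data[i];
    }
    return h;
}

/* Select an algorithm by hash size in bytes; unsupported sizes yield
 * an entry whose hash pointer is NULL. */
static toy_algorithm toy_get_algorithm(uint32_t hash_size) {
    const toy_algorithm a32 = { toy_hash32, 4 };
    const toy_algorithm a64 = { toy_hash64, 8 };
    const toy_algorithm invalid = { NULL, 0 };
    switch (hash_size) {
    case 4: return a32;
    case 8: return a64;
    default: return invalid;
    }
}
```

A caller would typically do the lookup once, reject the file if `hash` is `NULL`, and then reuse the returned struct for every key.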


@@ -0,0 +1,29 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_HASHALGORITHM_H_INCLUDED
#define SPARKEY_HASHALGORITHM_H_INCLUDED
#include "endiantools.h"
typedef struct {
uint64_t (*hash)(const uint8_t *data, uint64_t len, uint32_t seed);
uint64_t (*read_hash)(const uint8_t *data, uint64_t pos);
void (*write_hash)(uint8_t *data, uint64_t hash);
} sparkey_hash_algorithm;
sparkey_hash_algorithm sparkey_get_hash_algorithm(uint32_t hash_size);
#endif

136
src/sparkey/hashheader.c Normal file

@@ -0,0 +1,136 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdio.h>
#include <inttypes.h>
#include <string.h>
#include <errno.h>
#include "hashheader.h"
#include "endiantools.h"
#include "util.h"
#include "sparkey.h"
void print_hashheader(sparkey_hashheader *header) {
printf("Hash file version %"PRIu32".%"PRIu32"\n", header->major_version, header->minor_version);
printf("Identifier: %08"PRIx32"\n", header->file_identifier);
printf("Max key size: %"PRIu64", Max value size: %"PRIu64"\n", header->max_key_len, header->max_value_len);
printf("Hash size: %"PRIu32" bit Murmurhash3\n", 8*header->hash_size);
printf("Num entries: %"PRIu64", Capacity: %"PRIu64"\n", header->num_entries, header->hash_capacity);
// Guard the average against an empty table to avoid dividing by zero.
printf("Num collisions: %"PRIu64", Max displacement: %"PRIu64", Average displacement: %.2f\n", header->hash_collisions, header->max_displacement, header->num_entries == 0 ? 0.0 : (double) header->total_displacement / (double) header->num_entries);
printf("Data size: %"PRIu64", Garbage size: %"PRIu64"\n", header->data_end, header->garbage_size);
}
static sparkey_returncode hashheader_version0(sparkey_hashheader *header, FILE *fp) {
RETHROW(fread_little_endian32(fp, &header->file_identifier));
RETHROW(fread_little_endian32(fp, &header->hash_seed));
RETHROW(fread_little_endian64(fp, &header->data_end));
RETHROW(fread_little_endian64(fp, &header->max_key_len));
RETHROW(fread_little_endian64(fp, &header->max_value_len));
RETHROW(fread_little_endian64(fp, &header->num_puts));
RETHROW(fread_little_endian64(fp, &header->garbage_size));
RETHROW(fread_little_endian64(fp, &header->num_entries));
RETHROW(fread_little_endian32(fp, &header->address_size));
RETHROW(fread_little_endian32(fp, &header->hash_size));
RETHROW(fread_little_endian64(fp, &header->hash_capacity));
RETHROW(fread_little_endian64(fp, &header->max_displacement));
RETHROW(fread_little_endian32(fp, &header->entry_block_bits));
header->entry_block_bitmask = (1 << header->entry_block_bits) - 1;
RETHROW(fread_little_endian64(fp, &header->hash_collisions));
RETHROW(fread_little_endian64(fp, &header->total_displacement));
header->header_size = HASH_HEADER_SIZE;
header->hash_algorithm = sparkey_get_hash_algorithm(header->hash_size);
if (header->hash_algorithm.hash == NULL) {
return SPARKEY_HASH_HEADER_CORRUPT;
}
// Some basic consistency checks
if (header->num_entries > header->num_puts) {
return SPARKEY_HASH_HEADER_CORRUPT;
}
if (header->max_displacement > header->num_entries) {
return SPARKEY_HASH_HEADER_CORRUPT;
}
if (header->hash_collisions > header->num_entries) {
return SPARKEY_HASH_HEADER_CORRUPT;
}
return SPARKEY_SUCCESS;
}
typedef sparkey_returncode (*loader)(sparkey_hashheader *header, FILE *fp);
static loader loaders[2] = { hashheader_version0, hashheader_version0 };
sparkey_returncode sparkey_load_hashheader(sparkey_hashheader *header, const char *filename) {
FILE *fp = fopen(filename, "rb");
if (fp == NULL) {
return sparkey_open_returncode(errno);
}
// Every exit below goes through close_file, so fp is not leaked when a read fails.
sparkey_returncode returncode;
uint32_t tmp;
TRY(fread_little_endian32(fp, &tmp), close_file);
if (tmp != HASH_MAGIC_NUMBER) {
returncode = SPARKEY_WRONG_HASH_MAGIC_NUMBER;
goto close_file;
}
TRY(fread_little_endian32(fp, &header->major_version), close_file);
if (header->major_version != HASH_MAJOR_VERSION) {
returncode = SPARKEY_WRONG_HASH_MAJOR_VERSION;
goto close_file;
}
TRY(fread_little_endian32(fp, &header->minor_version), close_file);
if (header->minor_version > HASH_MINOR_VERSION) {
returncode = SPARKEY_UNSUPPORTED_HASH_MINOR_VERSION;
goto close_file;
}
loader l = loaders[header->minor_version];
if (l == NULL) {
returncode = SPARKEY_INTERNAL_ERROR;
goto close_file;
}
returncode = (*l)(header, fp);
close_file:
fclose(fp);
return returncode;
}
sparkey_returncode write_hashheader(int fd, sparkey_hashheader *header) {
RETHROW(fwrite_little_endian32(fd, HASH_MAGIC_NUMBER));
RETHROW(fwrite_little_endian32(fd, HASH_MAJOR_VERSION));
RETHROW(fwrite_little_endian32(fd, HASH_MINOR_VERSION));
RETHROW(fwrite_little_endian32(fd, header->file_identifier));
RETHROW(fwrite_little_endian32(fd, header->hash_seed));
RETHROW(fwrite_little_endian64(fd, header->data_end));
RETHROW(fwrite_little_endian64(fd, header->max_key_len));
RETHROW(fwrite_little_endian64(fd, header->max_value_len));
RETHROW(fwrite_little_endian64(fd, header->num_puts));
RETHROW(fwrite_little_endian64(fd, header->garbage_size));
RETHROW(fwrite_little_endian64(fd, header->num_entries));
RETHROW(fwrite_little_endian32(fd, header->address_size));
RETHROW(fwrite_little_endian32(fd, header->hash_size));
RETHROW(fwrite_little_endian64(fd, header->hash_capacity));
RETHROW(fwrite_little_endian64(fd, header->max_displacement));
RETHROW(fwrite_little_endian32(fd, header->entry_block_bits));
RETHROW(fwrite_little_endian64(fd, header->hash_collisions));
RETHROW(fwrite_little_endian64(fd, header->total_displacement));
return SPARKEY_SUCCESS;
}
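
The sequence of writes in write_hashheader() can be checked against HASH_HEADER_SIZE (112) from hashheader.h. The sketch below is not part of the commit. It assumes that fwrite_little_endian32 emits 4 bytes and fwrite_little_endian64 emits 8 bytes, and the names header_field_widths and serialized_header_size are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table, not part of sparkey: assumed on-disk width in bytes of
 * each field, in the order write_hashheader() emits them. */
static const uint8_t header_field_widths[] = {
  4, /* magic number */
  4, /* major version */
  4, /* minor version */
  4, /* file_identifier */
  4, /* hash_seed */
  8, /* data_end */
  8, /* max_key_len */
  8, /* max_value_len */
  8, /* num_puts */
  8, /* garbage_size */
  8, /* num_entries */
  4, /* address_size */
  4, /* hash_size */
  8, /* hash_capacity */
  8, /* max_displacement */
  4, /* entry_block_bits */
  8, /* hash_collisions */
  8  /* total_displacement */
};

/* Sums the assumed field widths to get the serialized header length. */
static size_t serialized_header_size(void) {
  size_t total = 0;
  size_t n = sizeof(header_field_widths) / sizeof(header_field_widths[0]);
  for (size_t i = 0; i < n; i++) {
    total += header_field_widths[i];
  }
  return total;
}
```

Under those assumptions the widths add up to 112 bytes, which matches the value the loader stores in header_size.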

99
src/sparkey/hashheader.h Normal file

@@ -0,0 +1,99 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_HASHHEADER_H_INCLUDED
#define SPARKEY_HASHHEADER_H_INCLUDED
#include <stdint.h>
#include "endiantools.h"
#include "sparkey.h"
#include "hashalgorithms.h"
#define HASH_MAGIC_NUMBER (0x9a11318f)
#define HASH_MAJOR_VERSION (1)
#define HASH_MINOR_VERSION (1)
#define HASH_HEADER_SIZE (112)
typedef struct {
uint32_t major_version;
uint32_t minor_version;
uint32_t file_identifier;
uint32_t hash_seed;
uint32_t header_size;
uint64_t data_end;
uint64_t max_key_len;
uint64_t max_value_len;
uint64_t garbage_size;
uint64_t num_entries;
uint32_t address_size;
uint32_t hash_size;
uint64_t hash_capacity;
uint64_t max_displacement;
uint64_t num_puts;
uint32_t entry_block_bits;
uint32_t entry_block_bitmask;
uint64_t hash_collisions;
uint64_t total_displacement;
sparkey_hash_algorithm hash_algorithm;
} sparkey_hashheader;
/**
* Fills a hashheader struct from the header at the beginning of the file.
* @param header header struct to fill
* @param filename a hash file
* @returns an error code if it could not load the file.
*/
sparkey_returncode sparkey_load_hashheader(sparkey_hashheader *header, const char *filename);
/**
* Dumps a human readable representation of the header to stdout
* @param header an initialized header struct
*/
void print_hashheader(sparkey_hashheader *header);
/**
* Writes a header to the current position in the file
* @param fd a file descriptor pointing to a file open for writing
* @param header the header to write
* @returns an error code if it could not write to file.
*/
sparkey_returncode write_hashheader(int fd, sparkey_hashheader *header);
static inline uint64_t get_displacement(uint64_t capacity, uint64_t slot, uint64_t hash) {
uint64_t wanted_slot = hash % capacity;
return (capacity + (slot - wanted_slot)) % capacity;
}
static inline uint64_t read_addr(uint8_t *hashtable, uint64_t pos, int address_size) {
switch (address_size) {
case 4: return read_little_endian32(hashtable, pos);
case 8: return read_little_endian64(hashtable, pos);
}
return -1;
}
static inline void write_addr(uint8_t *buf, uint64_t value, int address_size) {
switch (address_size) {
case 4: write_little_endian32(buf, value); return;
case 8: write_little_endian64(buf, value); return;
}
}
#endif
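
As an illustration of get_displacement() above, here is a standalone copy of the same arithmetic under the hypothetical name displacement_of. The point of interest is the wrap-around case: when the slot index is smaller than the wanted slot, the unsigned subtraction wraps, and adding the capacity before taking the remainder brings the result back into range.

```c
#include <stdint.h>

/* Same arithmetic as get_displacement() in hashheader.h, copied so that it
 * compiles on its own. capacity must be non-zero. */
static uint64_t displacement_of(uint64_t capacity, uint64_t slot, uint64_t hash) {
  uint64_t wanted_slot = hash % capacity;
  /* slot - wanted_slot may wrap around in unsigned arithmetic; adding
   * capacity before the modulo yields the forward probe distance. */
  return (capacity + (slot - wanted_slot)) % capacity;
}
```

With a capacity of 7 and a hash of 12, the wanted slot is 5. An entry stored in slot 5 has displacement 0, slot 6 gives 1, and slot 0 (after wrapping past the end of the table) gives 2.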

46
src/sparkey/hashiter.c Normal file
View File

@@ -0,0 +1,46 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdlib.h>
#include <string.h>
#include "sparkey.h"
#include "sparkey-internal.h"
#include "hashiter.h"
uint64_t sparkey_iter_hash(sparkey_hashheader *hash_header, sparkey_logiter *iter, sparkey_logreader *log) {
uint8_t *buf;
uint64_t len;
sparkey_returncode returncode = sparkey_logiter_keychunk(iter, log, 1ULL << 31, &buf, &len);
if (returncode != SPARKEY_SUCCESS) {
return 0;
}
if (len == iter->keylen) {
return hash_header->hash_algorithm.hash(buf, len, hash_header->hash_seed);
} else {
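uint8_t *keybuf = malloc(iter->keylen);
if (keybuf == NULL) {
return 0;
}
memcpy(keybuf, buf, len);
uint64_t len2;
returncode = sparkey_logiter_fill_key(iter, log, 1ULL << 31, keybuf + len, &len2);
// Check the return code first: len2 is only meaningful when the call succeeded.
if (returncode != SPARKEY_SUCCESS || len + len2 != iter->keylen) {
free(keybuf);
return 0;
}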
uint64_t hash = hash_header->hash_algorithm.hash(keybuf, iter->keylen, hash_header->hash_seed);
free(keybuf);
return hash;
}
}

26
src/sparkey/hashiter.h Normal file
View File

@@ -0,0 +1,26 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_HASHITER_H_INCLUDED
#define SPARKEY_HASHITER_H_INCLUDED
#include "sparkey.h"
#include "hashheader.h"
uint64_t sparkey_iter_hash(sparkey_hashheader *hash_header, sparkey_logiter *iter, sparkey_logreader *log);
#endif

255
src/sparkey/hashreader.c Normal file
View File

@@ -0,0 +1,255 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include "hashheader.h"
#include "hashiter.h"
#include "util.h"
#include "endiantools.h"
#include "sparkey.h"
#include "sparkey-internal.h"
#define MAGIC_VALUE_HASHREADER (0x75103df9)
sparkey_returncode sparkey_hash_open(sparkey_hashreader **reader_ref, const char *hash_filename, const char *log_filename) {
RETHROW(correct_endian_platform());
sparkey_returncode returncode;
sparkey_hashreader *reader = malloc(sizeof(sparkey_hashreader));
if (reader == NULL) {
return SPARKEY_INTERNAL_ERROR;
}
// Put the cleanup-relevant fields in a known state before anything can fail.
reader->fd = -1;
reader->data = NULL;
reader->open_status = 0;
TRY(sparkey_load_hashheader(&reader->header, hash_filename), free_reader);
TRY(sparkey_logreader_open_noalloc(&reader->log, log_filename), free_reader);
// The log is open from here on. sparkey_hash_close() only cleans up when
// open_status holds the magic value, so set it before the first goto close_reader.
reader->open_status = MAGIC_VALUE_HASHREADER;
if (reader->header.file_identifier != reader->log.header.file_identifier) {
returncode = SPARKEY_FILE_IDENTIFIER_MISMATCH;
goto close_reader;
}
if (reader->header.data_end > reader->log.header.data_end) {
returncode = SPARKEY_HASH_HEADER_CORRUPT;
goto close_reader;
}
if (reader->header.max_key_len > reader->log.header.max_key_len) {
returncode = SPARKEY_HASH_HEADER_CORRUPT;
goto close_reader;
}
if (reader->header.max_value_len > reader->log.header.max_value_len) {
returncode = SPARKEY_HASH_HEADER_CORRUPT;
goto close_reader;
}
reader->fd = open(hash_filename, O_RDONLY);
if (reader->fd < 0) {
int e = errno;
returncode = sparkey_open_returncode(e);
goto close_reader;
}
reader->data_len = reader->header.header_size + reader->header.hash_capacity * (reader->header.hash_size + reader->header.address_size);
struct stat s;
// Query the descriptor that will be mapped, and do not trust s if the call fails.
if (fstat(reader->fd, &s) < 0) {
returncode = SPARKEY_INTERNAL_ERROR;
goto close_reader;
}
if (reader->data_len > (uint64_t) s.st_size) {
returncode = SPARKEY_HASH_TOO_SMALL;
goto close_reader;
}
reader->data = mmap(NULL, reader->data_len, PROT_READ, MAP_SHARED, reader->fd, 0);
if (reader->data == MAP_FAILED) {
// MAP_FAILED is not NULL; reset it so sparkey_hash_close() does not munmap it.
reader->data = NULL;
returncode = SPARKEY_MMAP_FAILED;
goto close_reader;
}
*reader_ref = reader;
return SPARKEY_SUCCESS;
close_reader:
sparkey_hash_close(&reader);
return returncode;
free_reader:
free(reader);
return returncode;
}
void sparkey_hash_close(sparkey_hashreader **reader_ref) {
if (reader_ref == NULL) {
return;
}
sparkey_hashreader *reader = *reader_ref;
if (reader == NULL) {
return;
}
if (reader->open_status != MAGIC_VALUE_HASHREADER) {
return;
}
sparkey_logreader_close_nodealloc(&reader->log);
reader->open_status = 0;
if (reader->data != NULL) {
munmap(reader->data, reader->data_len);
reader->data = NULL;
}
close(reader->fd);
reader->fd = -1;
free(reader);
*reader_ref = NULL;
}
static sparkey_returncode assert_reader_open(sparkey_hashreader *reader) {
if (reader->open_status != MAGIC_VALUE_HASHREADER) {
return SPARKEY_HASH_CLOSED;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_hash_get(sparkey_hashreader *reader, const uint8_t *key, uint64_t keylen, sparkey_logiter *iter) {
RETHROW(assert_reader_open(reader));
uint64_t hash = reader->header.hash_algorithm.hash(key, keylen, reader->header.hash_seed);
uint64_t wanted_slot = hash % reader->header.hash_capacity;
int slot_size = reader->header.address_size + reader->header.hash_size;
uint64_t pos = wanted_slot * slot_size;
uint64_t displacement = 0;
uint64_t slot = wanted_slot;
uint8_t *hashtable = reader->data + reader->header.header_size;
while (1) {
uint64_t hash2 = reader->header.hash_algorithm.read_hash(hashtable, pos);
uint64_t position2 = read_addr(hashtable, pos + reader->header.hash_size, reader->header.address_size);
if (position2 == 0) {
iter->state = SPARKEY_ITER_INVALID;
return SPARKEY_SUCCESS;
}
int entry_index2 = (int) (position2) & reader->header.entry_block_bitmask;
position2 >>= reader->header.entry_block_bits;
if (hash == hash2) {
RETHROW(sparkey_logiter_seek(iter, &reader->log, position2));
RETHROW(sparkey_logiter_skip(iter, &reader->log, entry_index2));
RETHROW(sparkey_logiter_next(iter, &reader->log));
uint64_t keylen2 = iter->keylen;
if (iter->type != SPARKEY_ENTRY_PUT) {
iter->state = SPARKEY_ITER_INVALID;
return SPARKEY_INTERNAL_ERROR;
}
if (keylen == keylen2) {
uint64_t pos2 = 0;
int equals = 1;
while (pos2 < keylen) {
uint8_t *buf2;
uint64_t len2;
RETHROW(sparkey_logiter_keychunk(iter, &reader->log, keylen, &buf2, &len2));
if (memcmp(&key[pos2], buf2, len2) != 0) {
equals = 0;
break;
}
pos2 += len2;
}
if (equals) {
return SPARKEY_SUCCESS;
}
}
}
uint64_t other_displacement = get_displacement(reader->header.hash_capacity, slot, hash2);
if (displacement > other_displacement) {
iter->state = SPARKEY_ITER_INVALID;
return SPARKEY_SUCCESS;
}
pos += slot_size;
displacement++;
slot++;
if (slot >= reader->header.hash_capacity) {
pos = 0;
slot = 0;
}
}
iter->state = SPARKEY_ITER_INVALID;
return SPARKEY_INTERNAL_ERROR;
}
sparkey_returncode sparkey_logiter_hashnext(sparkey_logiter *iter, sparkey_hashreader *reader) {
RETHROW(assert_reader_open(reader));
uint8_t *hashtable = reader->data + reader->header.header_size;
int slot_size = reader->header.address_size + reader->header.hash_size;
while (1) {
RETHROW(sparkey_logiter_next(iter, &reader->log));
if (iter->state != SPARKEY_ITER_ACTIVE) {
return SPARKEY_SUCCESS;
}
if (iter->type != SPARKEY_ENTRY_PUT) {
continue;
}
uint64_t position = (iter->entry_block_position << reader->header.entry_block_bits) | iter->entry_count;
uint64_t key_hash = sparkey_iter_hash(&reader->header, iter, &reader->log);
uint64_t wanted_slot = key_hash % reader->header.hash_capacity;
uint64_t pos = wanted_slot * slot_size;
uint64_t displacement = 0;
uint64_t slot = wanted_slot;
while (1) {
uint64_t hash2 = reader->header.hash_algorithm.read_hash(hashtable, pos);
uint64_t position2 = read_addr(hashtable, pos + reader->header.hash_size, reader->header.address_size);
if (position2 == 0) {
break;
}
if (position == position2) {
// Found a match! Just reset the iterator
RETHROW(sparkey_logiter_reset(iter, &reader->log));
return SPARKEY_SUCCESS;
}
uint64_t other_displacement = get_displacement(reader->header.hash_capacity, slot, hash2);
if (displacement > other_displacement) {
break;
}
pos += slot_size;
displacement++;
slot++;
if (slot >= reader->header.hash_capacity) {
pos = 0;
slot = 0;
}
}
}
}
sparkey_logreader * sparkey_hash_getreader(sparkey_hashreader *reader) {
return &reader->log;
}
uint64_t sparkey_hash_numentries(sparkey_hashreader *reader) {
return reader->header.num_entries;
}
uint64_t sparkey_hash_numcollisions(sparkey_hashreader *reader) {
return reader->header.hash_collisions;
}
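
The probe loops in sparkey_hash_get() and sparkey_logiter_hashnext() rely on the Robin Hood invariant: a lookup can stop as soon as it has travelled further than the entry it is looking at. The following is a minimal in-memory sketch of that idea, not the library's code. The names toy_table, toy_put and toy_get and the fixed CAPACITY are hypothetical, the hash value doubles as the key, and the table must never be filled completely.

```c
#include <stdint.h>

#define CAPACITY 7 /* hypothetical table size for this sketch */

typedef struct {
  uint64_t hash[CAPACITY];
  uint64_t addr[CAPACITY]; /* 0 means "empty slot", as in the hash file */
} toy_table;

/* Probe distance of an entry with the given hash that sits in slot. */
static uint64_t toy_displacement(uint64_t slot, uint64_t hash) {
  return (CAPACITY + slot - hash % CAPACITY) % CAPACITY;
}

/* Robin Hood insert: an entry that has travelled further than the resident
 * entry takes the slot, and the evicted entry continues probing.
 * The caller must keep at least one slot free, and addr must be non-zero. */
static void toy_put(toy_table *t, uint64_t hash, uint64_t addr) {
  uint64_t slot = hash % CAPACITY;
  uint64_t displacement = 0;
  for (;;) {
    if (t->addr[slot] == 0) {
      t->hash[slot] = hash;
      t->addr[slot] = addr;
      return;
    }
    uint64_t other = toy_displacement(slot, t->hash[slot]);
    if (displacement > other) {
      uint64_t evicted_hash = t->hash[slot];
      uint64_t evicted_addr = t->addr[slot];
      t->hash[slot] = hash;
      t->addr[slot] = addr;
      hash = evicted_hash;
      addr = evicted_addr;
      displacement = other;
    }
    slot = (slot + 1) % CAPACITY;
    displacement++;
  }
}

/* Returns the stored address, or 0 if the hash is not present. */
static uint64_t toy_get(const toy_table *t, uint64_t hash) {
  uint64_t slot = hash % CAPACITY;
  uint64_t displacement = 0;
  for (;;) {
    if (t->addr[slot] == 0) {
      return 0; /* hit an empty slot: not present */
    }
    if (t->hash[slot] == hash) {
      return t->addr[slot];
    }
    if (displacement > toy_displacement(slot, t->hash[slot])) {
      return 0; /* resident entry is closer to home than we are: not present */
    }
    slot = (slot + 1) % CAPACITY;
    displacement++;
  }
}
```

The real reader additionally compares the key bytes from the log when two hashes are equal, which this sketch leaves out.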

521
src/sparkey/hashwriter.c Normal file
View File

@@ -0,0 +1,521 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <string.h>
#include <errno.h>
#include <inttypes.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include "sparkey.h"
#include "sparkey-internal.h"
#include "logheader.h"
#include "endiantools.h"
#include "util.h"
#include "hashheader.h"
#include "hashiter.h"
static uint32_t int_log2(uint32_t x) {
uint32_t count = 0;
while (x > 0) {
x >>= 1;
count++;
}
return count;
}
static int unsigned_vlq_size(uint64_t value) {
if (value < 1ULL << 7ULL) {
return 1;
}
if (value < 1ULL << 14ULL) {
return 2;
}
if (value < 1ULL << 21ULL) {
return 3;
}
if (value < 1ULL << 28ULL) {
return 4;
}
if (value < 1ULL << 35ULL) {
return 5;
}
if (value < 1ULL << 42ULL) {
return 6;
}
if (value < 1ULL << 49ULL) {
return 7;
}
if (value < 1ULL << 56ULL) {
return 8;
}
if (value < 1ULL << 63ULL) {
return 9;
}
return 10;
}
static void added_entry(sparkey_hashheader *hash_header) {
hash_header->num_entries++;
}
static void replaced_entry(sparkey_hashheader *hash_header, uint64_t keylen, uint64_t valuelen) {
hash_header->garbage_size += keylen + valuelen + unsigned_vlq_size(keylen + 1) + unsigned_vlq_size(valuelen);
}
static void deleted_entry(sparkey_hashheader *hash_header, uint64_t keylen, uint64_t valuelen) {
hash_header->garbage_size += keylen + valuelen + unsigned_vlq_size(keylen + 1) + unsigned_vlq_size(valuelen);
hash_header->num_entries--;
}
static sparkey_returncode hash_delete(uint64_t wanted_slot, uint64_t hash, uint8_t *hashtable, sparkey_hashheader *hash_header, sparkey_logiter *iter, sparkey_logiter *ra_iter, sparkey_logreader *log) {
int slot_size = hash_header->address_size + hash_header->hash_size;
uint64_t pos = wanted_slot * slot_size;
uint64_t displacement = 0;
uint64_t slot = wanted_slot;
while (1) {
uint64_t hash2 = hash_header->hash_algorithm.read_hash(hashtable, pos);
uint64_t position2 = read_addr(hashtable, pos + hash_header->hash_size, hash_header->address_size);
if (position2 == 0) {
return SPARKEY_SUCCESS;
}
int entry_index2 = (int) (position2) & hash_header->entry_block_bitmask;
position2 >>= hash_header->entry_block_bits;
if (position2 < log->header.header_size || position2 >= log->header.data_end ) {
fprintf(stderr, "hash_delete():%d bug: found pointer outside of range %"PRIu64"\n", __LINE__, position2);
return SPARKEY_INTERNAL_ERROR;
}
if (hash == hash2) {
RETHROW(sparkey_logiter_seek(ra_iter, log, position2));
RETHROW(sparkey_logiter_skip(ra_iter, log, entry_index2));
RETHROW(sparkey_logiter_next(ra_iter, log));
uint64_t keylen2 = ra_iter->keylen;
uint64_t valuelen2 = ra_iter->valuelen;
if (ra_iter->type != SPARKEY_ENTRY_PUT) {
fprintf(stderr, "hash_delete():%d bug: expected a put entry but found %d\n", __LINE__, ra_iter->type);
return SPARKEY_INTERNAL_ERROR;
}
if (iter->keylen == keylen2) {
RETHROW(sparkey_logiter_reset(iter, log));
int cmp;
RETHROW(sparkey_logiter_keycmp(iter, ra_iter, log, &cmp));
if (cmp == 0) {
// TODO: possibly optimize this to read and write stuff to move in chunks instead of one by one, to decrease number of seeks.
while (1) {
uint64_t next_slot = (slot + 1) % hash_header->hash_capacity;
uint64_t next_pos = next_slot * slot_size;
uint64_t hash3 = hash_header->hash_algorithm.read_hash(hashtable, next_pos);
uint64_t position3 = read_addr(hashtable, next_pos + hash_header->hash_size, hash_header->address_size);
if (position3 == 0) {
break;
}
if ((hash3 % hash_header->hash_capacity) == next_slot) {
break;
}
uint64_t pos3 = slot * slot_size;
hash_header->hash_algorithm.write_hash(&hashtable[pos3], hash3);
write_addr(&hashtable[pos3 + hash_header->hash_size], position3, hash_header->address_size);
slot = next_slot;
}
uint64_t pos3 = slot * slot_size;
hash_header->hash_algorithm.write_hash(&hashtable[pos3], 0);
write_addr(&hashtable[pos3 + hash_header->hash_size], 0, hash_header->address_size);
deleted_entry(hash_header, keylen2, valuelen2);
return SPARKEY_SUCCESS;
}
}
}
uint64_t other_displacement = get_displacement(hash_header->hash_capacity, slot, hash2);
if (displacement > other_displacement) {
return SPARKEY_SUCCESS;
}
pos += slot_size;
displacement++;
slot++;
if (slot >= hash_header->hash_capacity) {
pos = 0;
slot = 0;
}
}
fprintf(stderr, "hash_delete():%d bug: unreachable statement\n", __LINE__);
return SPARKEY_INTERNAL_ERROR;
}
static sparkey_returncode hash_put(uint64_t wanted_slot, uint64_t hash, uint8_t *hashtable, sparkey_hashheader *hash_header, sparkey_logiter *iter, sparkey_logiter *ra_iter, sparkey_logreader *log, uint64_t position) {
int slot_size = hash_header->address_size + hash_header->hash_size;
uint64_t pos = wanted_slot * slot_size;
uint64_t displacement = 0;
uint64_t slot = wanted_slot;
int might_be_collision = iter != NULL && ra_iter != NULL && log != NULL;
while (1) {
uint64_t hash2 = hash_header->hash_algorithm.read_hash(hashtable, pos);
uint64_t position2 = read_addr(hashtable, pos + hash_header->hash_size, hash_header->address_size);
if (position2 == 0) {
hash_header->hash_algorithm.write_hash(&hashtable[pos], hash);
write_addr(&hashtable[pos + hash_header->hash_size], position, hash_header->address_size);
added_entry(hash_header);
return SPARKEY_SUCCESS;
}
int entry_index2 = (int) (position2) & hash_header->entry_block_bitmask;
uint64_t position3 = position2 >> hash_header->entry_block_bits;
if (might_be_collision && hash == hash2) {
RETHROW(sparkey_logiter_seek(ra_iter, log, position3));
RETHROW(sparkey_logiter_skip(ra_iter, log, entry_index2));
RETHROW(sparkey_logiter_next(ra_iter, log));
uint64_t keylen2 = ra_iter->keylen;
uint64_t valuelen2 = ra_iter->valuelen;
if (ra_iter->type != SPARKEY_ENTRY_PUT) {
fprintf(stderr, "hash_put():%d bug: expected a put entry but found %d\n", __LINE__, ra_iter->type);
return SPARKEY_INTERNAL_ERROR;
}
if (iter->keylen == keylen2) {
RETHROW(sparkey_logiter_reset(iter, log));
int cmp;
RETHROW(sparkey_logiter_keycmp(iter, ra_iter, log, &cmp));
if (cmp == 0) {
hash_header->hash_algorithm.write_hash(&hashtable[pos], hash);
write_addr(&hashtable[pos + hash_header->hash_size], position, hash_header->address_size);
replaced_entry(hash_header, keylen2, valuelen2);
return SPARKEY_SUCCESS;
}
}
}
uint64_t other_displacement = get_displacement(hash_header->hash_capacity, slot, hash2);
if (displacement > other_displacement) {
// Steal the slot, and move the other one
hash_header->hash_algorithm.write_hash(&hashtable[pos], hash);
write_addr(&hashtable[pos + hash_header->hash_size], position, hash_header->address_size);
position = position2;
displacement = other_displacement;
hash = hash2;
might_be_collision = 0;
}
pos += slot_size;
displacement++;
slot++;
if (slot >= hash_header->hash_capacity) {
pos = 0;
slot = 0;
}
}
fprintf(stderr, "hash_put():%d bug: unreachable statement\n", __LINE__);
return SPARKEY_INTERNAL_ERROR;
}
static void calculate_max_displacement(sparkey_hashheader *hash_header, uint8_t *hashtable) {
uint64_t capacity = hash_header->hash_capacity;
int hash_size = hash_header->hash_size;
int slot_size = hash_header->address_size + hash_size;
uint64_t max_displacement = 0;
uint64_t num_hash_collisions = 0;
uint64_t total_displacement = 0;
int has_first = 0;
uint64_t first_hash = 0;
int has_last = 0;
uint64_t last_hash = 0;
int has_prev = 0;
uint64_t prev_hash = -1;
for (uint64_t slot = 0; slot < capacity; slot++) {
uint64_t hash = hash_header->hash_algorithm.read_hash(hashtable, slot * slot_size);
if (has_prev && prev_hash == hash) {
num_hash_collisions++;
}
uint64_t position = read_addr(hashtable, slot * slot_size + hash_size, hash_header->address_size);
if (position != 0) {
prev_hash = hash;
has_prev = 1;
uint64_t displacement = get_displacement(capacity, slot, hash);
total_displacement += displacement;
if (displacement > max_displacement) {
max_displacement = displacement;
}
if (slot == 0) {
first_hash = hash;
has_first = 1;
}
if (slot == capacity - 1) {
last_hash = hash;
has_last = 1;
}
} else {
has_prev = 0;
}
}
if (has_first && has_last && first_hash == last_hash) {
num_hash_collisions++;
}
hash_header->total_displacement = total_displacement;
hash_header->max_displacement = max_displacement;
hash_header->hash_collisions = num_hash_collisions;
}
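static sparkey_returncode read_fully(int fd, uint8_t *buf, size_t count) {
while (count > 0) {
ssize_t actual_read = read(fd, buf, count);
if (actual_read < 0) {
fprintf(stderr, "read_fully():%d bug: actual_read = %zd, errno = %d\n", __LINE__, actual_read, errno);
return SPARKEY_INTERNAL_ERROR;
}
if (actual_read == 0) {
// End of file before count bytes arrived; without this check the loop would never end.
fprintf(stderr, "read_fully():%d bug: unexpected end of file, %zu bytes missing\n", __LINE__, count);
return SPARKEY_INTERNAL_ERROR;
}
// Advance the buffer so a short read does not overwrite the bytes already received.
buf += actual_read;
count -= actual_read;
}
return SPARKEY_SUCCESS;
}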
static sparkey_returncode hash_copy(uint8_t *hashtable, uint8_t *buf, size_t buffer_size, sparkey_hashheader *old_header, sparkey_hashheader *new_header) {
int slot_size = old_header->address_size + old_header->hash_size;
for (unsigned int i = 0; i < buffer_size; i += slot_size) {
uint64_t hash = old_header->hash_algorithm.read_hash(buf, i);
uint64_t position = read_addr(buf, i + old_header->hash_size, old_header->address_size);
int entry_index = (int) (position) & old_header->entry_block_bitmask;
position >>= old_header->entry_block_bits;
uint64_t wanted_slot = hash % new_header->hash_capacity;
if (position != 0) {
RETHROW(hash_put(wanted_slot, hash, hashtable, new_header, NULL, NULL, NULL, (position << new_header->entry_block_bits) | entry_index));
}
}
return SPARKEY_SUCCESS;
}
static sparkey_returncode fill_hash(uint8_t *hashtable, const char *hash_filename, sparkey_hashheader *old_header, sparkey_hashheader *new_header) {
int fd = open(hash_filename, O_RDONLY);
if (fd < 0) {
return sparkey_open_returncode(errno);
}
lseek(fd, old_header->header_size, SEEK_SET);
int slot_size = old_header->address_size + old_header->hash_size;
uint64_t buffer_size = slot_size * 1024;
uint8_t *buf = malloc(buffer_size);
if (buf == NULL) {
fprintf(stderr, "fill_hash():%d bug: could not malloc %"PRIu64" bytes\n", __LINE__, buffer_size);
// Do not leak the descriptor opened above.
close(fd);
return SPARKEY_INTERNAL_ERROR;
}
sparkey_returncode returncode = SPARKEY_SUCCESS;
uint64_t remaining = old_header->hash_capacity * slot_size;
while (buffer_size <= remaining) {
TRY(read_fully(fd, buf, buffer_size), free);
TRY(hash_copy(hashtable, buf, buffer_size, old_header, new_header), free);
remaining -= buffer_size;
}
TRY(read_fully(fd, buf, remaining), free);
TRY(hash_copy(hashtable, buf, remaining, old_header, new_header), free);
free:
free(buf);
if (close(fd) < 0) {
if (returncode == SPARKEY_SUCCESS) {
fprintf(stderr, "fill_hash():%d bug: could not close file. errno = %d\n", __LINE__, errno);
returncode = SPARKEY_INTERNAL_ERROR;
}
}
return returncode;
}
sparkey_returncode sparkey_hash_write(const char *hash_filename, const char *log_filename, int hash_size) {
sparkey_logheader log_header;
sparkey_logreader *log;
sparkey_logiter *iter = NULL;
sparkey_logiter *ra_iter = NULL;
RETHROW(sparkey_load_logheader(&log_header, log_filename));
RETHROW(sparkey_logreader_open(&log, log_filename));
sparkey_returncode returncode = SPARKEY_SUCCESS;
TRY(sparkey_logiter_create(&iter, log), close_reader);
TRY(sparkey_logiter_create(&ra_iter, log), close_iter);
sparkey_hashheader hash_header;
sparkey_hashheader old_header;
double cap;
uint64_t start;
uint32_t hash_seed;
int copy_old;
uint32_t old_hash_size = 0;
returncode = sparkey_load_hashheader(&old_header, hash_filename);
if (returncode == SPARKEY_SUCCESS &&
old_header.file_identifier == log_header.file_identifier &&
old_header.major_version == HASH_MAJOR_VERSION &&
old_header.minor_version == HASH_MINOR_VERSION) {
// Prepare to copy stuff from old header
cap = ((log_header.num_puts - old_header.num_puts) + old_header.num_entries) * 1.3;
start = old_header.data_end;
hash_seed = old_header.hash_seed;
hash_header.garbage_size = old_header.garbage_size;
copy_old = 1;
old_hash_size = old_header.hash_size;
} else {
cap = log_header.num_puts * 1.3;
start = log_header.header_size;
TRY(rand32(&hash_seed), close_iter);
hash_header.garbage_size = 0;
copy_old = 0;
returncode = SPARKEY_SUCCESS;
}
hash_header.hash_capacity = 1 | (uint64_t) cap;
hash_header.hash_seed = hash_seed;
hash_header.max_key_len = log_header.max_key_len;
hash_header.max_value_len = log_header.max_value_len;
hash_header.data_end = log_header.data_end;
hash_header.num_puts = log_header.num_puts;
hash_header.entry_block_bits = int_log2(log_header.max_entries_per_block);
hash_header.entry_block_bitmask = (1 << hash_header.entry_block_bits) - 1;
if (hash_header.data_end < (1ULL << (32 - hash_header.entry_block_bits))) {
hash_header.address_size = 4;
} else {
hash_header.address_size = 8;
}
if (old_hash_size == 8 || hash_header.hash_capacity >= (1 << 23)) {
hash_header.hash_size = 8;
} else {
hash_header.hash_size = 4;
}
if (hash_size != 0) {
if (hash_size == 4 || hash_size == 8) {
hash_header.hash_size = hash_size;
} else {
returncode = SPARKEY_HASH_SIZE_INVALID;
goto close_iter;
}
}
if (hash_header.hash_size != old_hash_size) {
copy_old = 0;
}
hash_header.hash_algorithm = sparkey_get_hash_algorithm(hash_header.hash_size);
int slot_size = hash_header.hash_size + hash_header.address_size;
uint64_t hashsize = slot_size * hash_header.hash_capacity;
uint8_t *hashtable = malloc(hashsize);
if (hashtable == NULL) {
fprintf(stderr, "sparkey_hash_write():%d bug: could not malloc %"PRIu64" bytes\n", __LINE__, hashsize);
returncode = SPARKEY_INTERNAL_ERROR;
goto close_iter;
}
memset(hashtable, 0, hashsize);
hash_header.max_displacement = 0;
hash_header.total_displacement = 0;
hash_header.num_entries = 0;
hash_header.hash_collisions = 0;
if (copy_old) {
if (old_header.data_end == log->header.data_end) {
// Nothing needs to be done - just exit, releasing the table allocated above
goto free_hashtable;
}
TRY(fill_hash(hashtable, hash_filename, &old_header, &hash_header), free_hashtable);
TRY(sparkey_logiter_seek(iter, log, start), free_hashtable);
}
while (1) {
TRY(sparkey_logiter_next(iter, log), free_hashtable);
switch (iter->state) {
case SPARKEY_ITER_CLOSED:
goto normal_exit;
break;
case SPARKEY_ITER_ACTIVE:
break;
default:
fprintf(stderr, "sparkey_hash_write():%d bug: invalid iter state: %d\n", __LINE__, iter->state);
returncode = SPARKEY_INTERNAL_ERROR;
goto free_hashtable;
break;
}
uint64_t iter_block_start = iter->block_position;
uint64_t iter_entry_count = iter->entry_count;
uint64_t key_hash = sparkey_iter_hash(&hash_header, iter, log);
uint64_t wanted_slot = key_hash % hash_header.hash_capacity;
switch (iter->type) {
case SPARKEY_ENTRY_PUT:
TRY(hash_put(wanted_slot, key_hash, hashtable, &hash_header, iter, ra_iter, log, (iter_block_start << hash_header.entry_block_bits) | iter_entry_count), free_hashtable);
break;
case SPARKEY_ENTRY_DELETE:
hash_header.garbage_size += 1 + unsigned_vlq_size(iter->keylen) + iter->keylen;
TRY(hash_delete(wanted_slot, key_hash, hashtable, &hash_header, iter, ra_iter, log), free_hashtable);
break;
}
}
normal_exit:
calculate_max_displacement(&hash_header, hashtable);
// Try removing it first, to avoid overwriting existing files that readers may be using.
if (remove(hash_filename) < 0) {
int e = errno;
if (e != ENOENT) {
returncode = sparkey_remove_returncode(e);
goto free_hashtable;
}
}
int fd = creat(hash_filename, 00644);
if (fd < 0) {
returncode = sparkey_create_returncode(errno);
goto free_hashtable;
}
hash_header.major_version = HASH_MAJOR_VERSION;
hash_header.minor_version = HASH_MINOR_VERSION;
hash_header.file_identifier = log_header.file_identifier;
hash_header.data_end = log_header.data_end;
TRY(write_hashheader(fd, &hash_header), close_hash);
TRY(write_full(fd, hashtable, hashsize), close_hash);
close_hash:
close(fd);
free_hashtable:
free(hashtable);
close_iter:
sparkey_logiter_close(&iter);
sparkey_logiter_close(&ra_iter);
close_reader:
sparkey_logreader_close(&log);
return returncode;
}
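
The `hash_put` call above stores each entry's location as one integer: the block start position shifted left by `entry_block_bits`, OR-ed with the entry's index inside that block. A minimal sketch of that packing and its inverse follows. The helper names (`pack_address`, `unpack_block`, `unpack_entry`, `address_size_for`) are hypothetical and not part of sparkey, and the sketch assumes `entry_block_bits` is below 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; sparkey inlines this logic. */

/* Combine a block start position and an entry index into one address. */
uint64_t pack_address(uint64_t block_start, uint64_t entry_index, unsigned entry_block_bits) {
  assert(entry_block_bits < 32);
  return (block_start << entry_block_bits) | entry_index;
}

/* Recover the block start position from a packed address. */
uint64_t unpack_block(uint64_t address, unsigned entry_block_bits) {
  return address >> entry_block_bits;
}

/* Recover the entry index, using the same mask as entry_block_bitmask. */
uint64_t unpack_entry(uint64_t address, unsigned entry_block_bits) {
  uint64_t mask = (1ULL << entry_block_bits) - 1;
  return address & mask;
}

/* Mirrors the address_size choice: 4 bytes when every packed address fits in 32 bits. */
int address_size_for(uint64_t data_end, unsigned entry_block_bits) {
  assert(entry_block_bits < 32);
  return data_end < (1ULL << (32 - entry_block_bits)) ? 4 : 8;
}
```

With `entry_block_bits` set to 4, block position 84 and entry index 3 pack to `(84 << 4) | 3`, and both parts can be recovered from that value.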

124
src/sparkey/logheader.c Normal file

@@ -0,0 +1,124 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdio.h>
#include <inttypes.h>
#include <string.h>
#include <errno.h>
#include "logheader.h"
#include "endiantools.h"
#include "util.h"
static char * compression_types[] = { "Uncompressed", "Snappy", NULL };
void print_logheader(sparkey_logheader *header) {
printf("Log file version %d.%d\n", header->major_version,
header->minor_version);
printf("Identifier: %08x\n", header->file_identifier);
printf("Puts: %"PRIu64", Deletes: %"PRIu64"\n", header->num_puts, header->num_deletes);
printf("Max key size: %"PRIu64", Max value size: %"PRIu64"\n", header->max_key_len, header->max_value_len);
printf("Compression: %s, block size: %d\n",
compression_types[header->compression_type],
header->compression_block_size);
}
static sparkey_returncode logheader_version0(sparkey_logheader *header, FILE *fp) {
RETHROW(fread_little_endian32(fp, &header->file_identifier));
RETHROW(fread_little_endian64(fp, &header->num_puts));
RETHROW(fread_little_endian64(fp, &header->num_deletes));
RETHROW(fread_little_endian64(fp, &header->data_end));
RETHROW(fread_little_endian64(fp, &header->max_key_len));
RETHROW(fread_little_endian64(fp, &header->max_value_len));
RETHROW(fread_little_endian64(fp, &header->delete_size));
RETHROW(fread_little_endian32(fp, &header->compression_type));
RETHROW(fread_little_endian32(fp, &header->compression_block_size));
RETHROW(fread_little_endian64(fp, &header->put_size));
RETHROW(fread_little_endian32(fp, &header->max_entries_per_block));
header->header_size = LOG_HEADER_SIZE;
// Some basic consistency checks
if (header->data_end < header->header_size) {
return SPARKEY_LOG_HEADER_CORRUPT;
}
if (header->num_puts > header->data_end) {
return SPARKEY_LOG_HEADER_CORRUPT;
}
if (header->num_deletes > header->data_end) {
return SPARKEY_LOG_HEADER_CORRUPT;
}
if (header->compression_type > SPARKEY_COMPRESSION_SNAPPY) {
return SPARKEY_LOG_HEADER_CORRUPT;
}
return SPARKEY_SUCCESS;
}
typedef sparkey_returncode (*loader)(sparkey_logheader *header, FILE *fp);
static loader loaders[1] = { logheader_version0 };
sparkey_returncode sparkey_load_logheader(sparkey_logheader *header, const char *filename) {
sparkey_returncode returncode;
FILE *fp = fopen(filename, "r");
if (fp == NULL) {
return sparkey_open_returncode(errno);
}
uint32_t tmp;
// TRY (instead of RETHROW) makes sure fp is closed on every error path.
TRY(fread_little_endian32(fp, &tmp), cleanup);
if (tmp != LOG_MAGIC_NUMBER) {
returncode = SPARKEY_WRONG_LOG_MAGIC_NUMBER;
goto cleanup;
}
TRY(fread_little_endian32(fp, &header->major_version), cleanup);
if (header->major_version != LOG_MAJOR_VERSION) {
returncode = SPARKEY_WRONG_LOG_MAJOR_VERSION;
goto cleanup;
}
TRY(fread_little_endian32(fp, &header->minor_version), cleanup);
if (header->minor_version > LOG_MINOR_VERSION) {
returncode = SPARKEY_UNSUPPORTED_LOG_MINOR_VERSION;
goto cleanup;
}
loader l = loaders[header->minor_version];
if (l == NULL) {
returncode = SPARKEY_INTERNAL_ERROR;
goto cleanup;
}
returncode = (*l)(header, fp);
cleanup:
fclose(fp);
return returncode;
}
sparkey_returncode write_logheader(int fd, sparkey_logheader *header) {
RETHROW(fwrite_little_endian32(fd, LOG_MAGIC_NUMBER));
RETHROW(fwrite_little_endian32(fd, LOG_MAJOR_VERSION));
RETHROW(fwrite_little_endian32(fd, LOG_MINOR_VERSION));
RETHROW(fwrite_little_endian32(fd, header->file_identifier));
RETHROW(fwrite_little_endian64(fd, header->num_puts));
RETHROW(fwrite_little_endian64(fd, header->num_deletes));
RETHROW(fwrite_little_endian64(fd, header->data_end));
RETHROW(fwrite_little_endian64(fd, header->max_key_len));
RETHROW(fwrite_little_endian64(fd, header->max_value_len));
RETHROW(fwrite_little_endian64(fd, header->delete_size));
RETHROW(fwrite_little_endian32(fd, header->compression_type));
RETHROW(fwrite_little_endian32(fd, header->compression_block_size));
RETHROW(fwrite_little_endian64(fd, header->put_size));
RETHROW(fwrite_little_endian32(fd, header->max_entries_per_block));
return SPARKEY_SUCCESS;
}
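
The sequence of writes in `write_logheader` determines the on-disk header size. As a rough cross-check, the sketch below sums the field widths in write order and arrives at the value of `LOG_HEADER_SIZE`. The table and function names are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte widths of the serialized fields, in the order write_logheader emits them:
 * magic, major, minor, file_identifier, num_puts, num_deletes, data_end,
 * max_key_len, max_value_len, delete_size, compression_type,
 * compression_block_size, put_size, max_entries_per_block. */
static const uint8_t log_header_field_widths[] = {
  4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 4, 4, 8, 4
};

/* Hypothetical helper: total number of bytes the serialized header occupies. */
size_t log_header_serialized_size(void) {
  size_t total = 0;
  size_t count = sizeof(log_header_field_widths) / sizeof(log_header_field_widths[0]);
  for (size_t i = 0; i < count; i++) {
    total += log_header_field_widths[i];
  }
  return total;
}

/* Sanity check against the constant defined in logheader.h (84). */
void log_header_size_selfcheck(void) {
  assert(log_header_serialized_size() == 84);
}
```

Note that the in-memory `sparkey_logheader` struct is laid out differently (it has no magic number field and carries `header_size`), so `sizeof` on the struct is not expected to match the file format.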

68
src/sparkey/logheader.h Normal file

@@ -0,0 +1,68 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_LOGHEADER_H_INCLUDED
#define SPARKEY_LOGHEADER_H_INCLUDED
#include <stdint.h>
#include "sparkey.h"
#define LOG_MAGIC_NUMBER (0x49b39c95)
#define LOG_MAJOR_VERSION (1)
#define LOG_MINOR_VERSION (0)
#define LOG_HEADER_SIZE (84)
typedef struct {
uint32_t major_version;
uint32_t minor_version;
uint32_t file_identifier;
uint64_t num_puts;
uint64_t num_deletes;
uint64_t data_end;
uint64_t max_key_len;
uint64_t max_value_len;
uint64_t delete_size;
sparkey_compression_type compression_type;
uint32_t compression_block_size;
uint64_t put_size;
uint32_t header_size;
uint32_t max_entries_per_block;
} sparkey_logheader;
/**
* Fills in a logheader struct from the header at the beginning of the file.
* @param header header struct to fill
* @param filename a log file
* @returns an error code if it could not load the file.
*/
sparkey_returncode sparkey_load_logheader(sparkey_logheader *header, const char *filename);
/**
* Dumps a human readable representation of the header to stdout
* @param header an initialized header struct
*/
void print_logheader(sparkey_logheader *header);
/**
* Writes a header to the current position in the file
* @param fd a file descriptor pointing to a file open for writing
* @param header the header to write
* @returns an error code if it could not write to file.
*/
sparkey_returncode write_logheader(int fd, sparkey_logheader *header);
#endif

511
src/sparkey/logreader.c Normal file

@@ -0,0 +1,511 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <snappy-c.h>
#include "sparkey.h"
#include "sparkey-internal.h"
#include "logheader.h"
#include "endiantools.h"
#include "util.h"
#define MAGIC_VALUE_LOGITER (0xd765c8cc)
#define MAGIC_VALUE_LOGREADER (0xe93356c4)
static inline uint64_t min64(uint64_t a, uint64_t b) {
if (a < b) {
return a;
}
return b;
}
static inline uint64_t read_vlq(uint8_t * array, uint64_t *position) {
uint64_t res = 0;
uint64_t shift = 0;
uint64_t tmp, tmp2;
while (1) {
tmp = array[(*position)++];
tmp2 = tmp & 0x7f;
if (tmp == tmp2) {
return res | tmp << shift;
}
res |= tmp2 << shift;
shift += 7;
}
return res;
}
sparkey_returncode sparkey_logreader_open_noalloc(sparkey_logreader *log, const char *filename) {
int fd = 0;
sparkey_returncode returncode;
TRY(sparkey_load_logheader(&log->header, filename), cleanup);
log->data_len = log->header.data_end;
struct stat s;
if (stat(filename, &s) != 0) {
returncode = sparkey_open_returncode(errno);
goto cleanup;
}
if (log->data_len > (uint64_t) s.st_size) {
returncode = SPARKEY_LOG_TOO_SMALL;
goto cleanup;
}
fd = open(filename, O_RDONLY);
if (fd < 0) {
returncode = sparkey_open_returncode(errno);
goto cleanup;
}
log->fd = fd;
log->data = mmap(NULL, log->data_len, PROT_READ, MAP_SHARED, fd, 0);
if (log->data == MAP_FAILED) {
returncode = SPARKEY_MMAP_FAILED;
goto cleanup;
}
log->open_status = MAGIC_VALUE_LOGREADER;
return SPARKEY_SUCCESS;
cleanup:
if (fd > 0) close(fd);
return returncode;
}
sparkey_returncode sparkey_logreader_open(sparkey_logreader **log_ref, const char *filename) {
RETHROW(correct_endian_platform());
sparkey_logreader *log = malloc(sizeof(sparkey_logreader));
if (log == NULL) {
return SPARKEY_INTERNAL_ERROR;
}
sparkey_returncode returncode;
TRY(sparkey_logreader_open_noalloc(log, filename), cleanup);
*log_ref = log;
return SPARKEY_SUCCESS;
cleanup:
free(log);
return returncode;
}
void sparkey_logreader_close_nodealloc(sparkey_logreader *log) {
if (log == NULL) {
return;
}
if (log->open_status != MAGIC_VALUE_LOGREADER) {
return;
}
log->open_status = 0;
if (log->data != NULL) {
munmap(log->data, log->data_len);
log->data = NULL;
}
close(log->fd);
log->fd = -1;
}
void sparkey_logreader_close(sparkey_logreader **log_ref) {
if (log_ref == NULL) {
return;
}
sparkey_logreader *log = *log_ref;
sparkey_logreader_close_nodealloc(log);
free(log);
*log_ref = NULL;
}
static sparkey_returncode assert_log_open(sparkey_logreader *log) {
if (log->open_status != MAGIC_VALUE_LOGREADER) {
return SPARKEY_LOG_CLOSED;
}
return SPARKEY_SUCCESS;
}
static sparkey_returncode assert_iter_open(sparkey_logiter *iter, sparkey_logreader *log) {
RETHROW(assert_log_open(log));
if (iter->open_status != MAGIC_VALUE_LOGITER) {
return SPARKEY_LOG_ITERATOR_CLOSED;
}
if (iter->file_identifier != log->header.file_identifier) {
return SPARKEY_LOG_ITERATOR_MISMATCH;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_create(sparkey_logiter **iter_ref, sparkey_logreader *log) {
RETHROW(assert_log_open(log));
sparkey_logiter *iter = malloc(sizeof(sparkey_logiter));
if (iter == NULL) {
return SPARKEY_INTERNAL_ERROR;
}
iter->open_status = MAGIC_VALUE_LOGITER;
iter->file_identifier = log->header.file_identifier;
iter->block_position = 0;
iter->next_block_position = log->header.header_size;
iter->block_offset = 0;
iter->block_len = 0;
iter->state = SPARKEY_ITER_NEW;
switch (log->header.compression_type) {
case SPARKEY_COMPRESSION_NONE:
iter->compression_buf_allocated = 0;
break;
case SPARKEY_COMPRESSION_SNAPPY:
iter->compression_buf_allocated = 1;
iter->compression_buf = malloc(log->header.compression_block_size);
if (iter->compression_buf == NULL) {
free(iter);
return SPARKEY_INTERNAL_ERROR;
}
break;
default:
free(iter);
return SPARKEY_INTERNAL_ERROR;
}
*iter_ref = iter;
return SPARKEY_SUCCESS;
}
void sparkey_logiter_close(sparkey_logiter **iter_ref) {
if (iter_ref == NULL) {
return;
}
sparkey_logiter *iter = *iter_ref;
if (iter == NULL) {
return;
}
if (iter->open_status != MAGIC_VALUE_LOGITER) {
return;
}
iter->open_status = 0;
if (iter->compression_buf_allocated) {
free(iter->compression_buf);
}
free(iter);
*iter_ref = NULL;
}
static sparkey_returncode seekblock(sparkey_logiter *iter, sparkey_logreader *log, uint64_t position) {
iter->block_offset = 0;
if (iter->block_position == position) {
return SPARKEY_SUCCESS;
}
if (log->header.compression_type == SPARKEY_COMPRESSION_NONE) {
iter->compression_buf = &log->data[position];
iter->block_position = position;
iter->next_block_position = log->header.data_end;
iter->block_len = log->data_len - position;
return SPARKEY_SUCCESS;
}
if (log->header.compression_type == SPARKEY_COMPRESSION_SNAPPY) {
uint64_t pos = position;
// TODO: assert that size_t >= uint64_t
size_t compressed_size = read_vlq(log->data, &pos);
uint64_t next_pos = pos + compressed_size;
const char *input = (char *) &log->data[pos];
size_t uncompressed_size = log->header.compression_block_size;
snappy_status status = snappy_uncompress(input, compressed_size, (char *) iter->compression_buf, &uncompressed_size);
switch (status) {
case SNAPPY_OK: break;
case SNAPPY_INVALID_INPUT:
return SPARKEY_INTERNAL_ERROR;
case SNAPPY_BUFFER_TOO_SMALL:
return SPARKEY_INTERNAL_ERROR;
default:
return SPARKEY_INTERNAL_ERROR;
}
iter->block_position = position;
iter->next_block_position = next_pos;
iter->block_len = uncompressed_size;
return SPARKEY_SUCCESS;
}
return SPARKEY_INTERNAL_ERROR;
}
sparkey_returncode sparkey_logiter_seek(sparkey_logiter *iter, sparkey_logreader *log, uint64_t position) {
RETHROW(assert_iter_open(iter, log));
if (position == log->header.data_end) {
iter->state = SPARKEY_ITER_CLOSED;
return SPARKEY_SUCCESS;
}
RETHROW(seekblock(iter, log, position));
iter->entry_count = -1;
iter->state = SPARKEY_ITER_NEW;
return SPARKEY_SUCCESS;
}
static sparkey_returncode ensure_available(sparkey_logiter *iter, sparkey_logreader *log) {
if (iter->block_offset < iter->block_len) {
return SPARKEY_SUCCESS;
}
if (iter->next_block_position >= log->header.data_end) {
iter->block_position = 0;
iter->block_offset = 0;
iter->block_len = 0;
return SPARKEY_SUCCESS;
}
RETHROW(seekblock(iter, log, iter->next_block_position));
iter->entry_count = -1;
return SPARKEY_SUCCESS;
}
static sparkey_returncode skip(sparkey_logiter *iter, sparkey_logreader *log, uint64_t len) {
while (len > 0) {
RETHROW(ensure_available(iter, log));
uint64_t m = min64(len, iter->block_len - iter->block_offset);
len -= m;
iter->block_offset += m;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_next(sparkey_logiter *iter, sparkey_logreader *log) {
if (iter->state == SPARKEY_ITER_CLOSED) {
return SPARKEY_SUCCESS;
}
uint64_t key_remaining = 0;
uint64_t value_remaining = 0;
if (iter->state == SPARKEY_ITER_ACTIVE) {
key_remaining = iter->key_remaining;
value_remaining = iter->value_remaining;
}
iter->state = SPARKEY_ITER_INVALID;
iter->key_remaining = 0;
iter->value_remaining = 0;
iter->keylen = 0;
iter->valuelen = 0;
RETHROW(assert_iter_open(iter, log));
RETHROW(skip(iter, log, key_remaining));
RETHROW(skip(iter, log, value_remaining));
RETHROW(ensure_available(iter, log));
if (iter->block_len - iter->block_offset == 0) {
// Reached end of data
iter->state = SPARKEY_ITER_CLOSED;
return SPARKEY_SUCCESS;
}
if (log->header.compression_type == SPARKEY_COMPRESSION_NONE) {
iter->block_position += iter->block_offset;
iter->block_len -= iter->block_offset;
iter->block_offset = 0;
iter->compression_buf = &log->data[iter->block_position];
iter->entry_count = -1;
}
iter->entry_count++;
uint64_t a = read_vlq(iter->compression_buf, &iter->block_offset);
uint64_t b = read_vlq(iter->compression_buf, &iter->block_offset);
if (a == 0) {
iter->keylen = iter->key_remaining = b;
iter->valuelen = iter->value_remaining = 0;
iter->type = SPARKEY_ENTRY_DELETE;
} else {
iter->keylen = iter->key_remaining = a - 1;
iter->valuelen = iter->value_remaining = b;
iter->type = SPARKEY_ENTRY_PUT;
}
iter->entry_block_position = iter->block_position;
iter->entry_block_offset = iter->block_offset;
iter->state = SPARKEY_ITER_ACTIVE;
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_reset(sparkey_logiter *iter, sparkey_logreader *log) {
if (iter->state != SPARKEY_ITER_ACTIVE) {
return SPARKEY_LOG_ITERATOR_INACTIVE;
}
RETHROW(seekblock(iter, log, iter->entry_block_position));
iter->key_remaining = iter->keylen;
iter->value_remaining = iter->valuelen;
iter->block_offset = iter->entry_block_offset;
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_skip(sparkey_logiter *iter, sparkey_logreader *log, int count) {
while (count > 0) {
count--;
RETHROW(sparkey_logiter_next(iter, log));
}
return SPARKEY_SUCCESS;
}
static sparkey_returncode sparkey_logiter_chunk(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint64_t *len, uint8_t ** res, uint64_t *var) {
RETHROW(assert_iter_open(iter, log));
if (iter->state != SPARKEY_ITER_ACTIVE) {
return SPARKEY_LOG_ITERATOR_INACTIVE;
}
if (*var > 0) {
RETHROW(ensure_available(iter, log));
uint64_t m = min64(*var, iter->block_len - iter->block_offset);
m = min64(maxlen, m);
*len = m;
*res = &iter->compression_buf[iter->block_offset];
iter->block_offset += m;
*var -= m;
return SPARKEY_SUCCESS;
}
*len = 0;
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_keychunk(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t ** res, uint64_t *len) {
return sparkey_logiter_chunk(iter, log, maxlen, len, res, &iter->key_remaining);
}
sparkey_returncode sparkey_logiter_valuechunk(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t ** res, uint64_t *len) {
RETHROW(skip(iter, log, iter->key_remaining));
iter->key_remaining = 0;
return sparkey_logiter_chunk(iter, log, maxlen, len, res, &iter->value_remaining);
}
sparkey_returncode sparkey_logiter_fill_key(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t *buf, uint64_t *len) {
*len = 0;
while (maxlen > 0) {
uint8_t *buf2;
uint64_t len2;
RETHROW(sparkey_logiter_keychunk(iter, log, maxlen, &buf2, &len2));
if (len2 == 0) {
return SPARKEY_SUCCESS;
}
memcpy(buf, buf2, len2);
buf += len2;
*len += len2;
maxlen -= len2;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_fill_value(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t *buf, uint64_t *len) {
*len = 0;
while (maxlen > 0) {
uint8_t *buf2;
uint64_t len2;
RETHROW(sparkey_logiter_valuechunk(iter, log, maxlen, &buf2, &len2));
if (len2 == 0) {
return SPARKEY_SUCCESS;
}
memcpy(buf, buf2, len2);
buf += len2;
*len += len2;
maxlen -= len2;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logiter_keycmp(sparkey_logiter *iter1, sparkey_logiter *iter2, sparkey_logreader *log, int *res) {
uint8_t *first;
uint64_t first_len;
uint8_t *second;
uint64_t second_len;
RETHROW(sparkey_logiter_keychunk(iter1, log, 1 << 30, &first, &first_len));
RETHROW(sparkey_logiter_keychunk(iter2, log, 1 << 30, &second, &second_len));
while (1) {
if (first_len == 0 && second_len == 0) {
break;
}
if (first_len == 0) {
*res = -1;
return SPARKEY_SUCCESS;
}
if (second_len == 0) {
*res = 1;
return SPARKEY_SUCCESS;
}
uint64_t cmp_len = min64(first_len, second_len);
int v = memcmp(first, second, cmp_len);
if (v) {
*res = v;
return SPARKEY_SUCCESS;
}
first += cmp_len;
first_len -= cmp_len;
second += cmp_len;
second_len -= cmp_len;
if (first_len == 0) {
RETHROW(sparkey_logiter_keychunk(iter1, log, 1 << 30, &first, &first_len));
}
if (second_len == 0) {
RETHROW(sparkey_logiter_keychunk(iter2, log, 1 << 30, &second, &second_len));
}
}
*res = 0;
return SPARKEY_SUCCESS;
}
uint64_t sparkey_logreader_maxkeylen(sparkey_logreader *log) {
return log->header.max_key_len;
}
uint64_t sparkey_logreader_maxvaluelen(sparkey_logreader *log) {
return log->header.max_value_len;
}
int sparkey_logreader_get_compression_blocksize(sparkey_logreader *log) {
return log->header.compression_block_size;
}
sparkey_compression_type sparkey_logreader_get_compression_type(sparkey_logreader *log) {
return log->header.compression_type;
}
sparkey_iter_state sparkey_logiter_state(sparkey_logiter *iter) {
return iter->state;
}
sparkey_entry_type sparkey_logiter_type(sparkey_logiter *iter) {
return iter->type;
}
uint64_t sparkey_logiter_keylen(sparkey_logiter *iter) {
return iter->keylen;
}
uint64_t sparkey_logiter_valuelen(sparkey_logiter *iter) {
return iter->valuelen;
}
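
`sparkey_logiter_next` distinguishes puts from deletes by the first of the two VLQ values that start every entry: zero marks a delete, and any other value is the key length plus one. A minimal sketch of that rule, using hypothetical type and function names that do not exist in sparkey:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration; sparkey keeps these fields on the iterator. */
typedef enum { ENTRY_PUT, ENTRY_DELETE } entry_kind;

typedef struct {
  entry_kind kind;
  uint64_t keylen;
  uint64_t valuelen;
} entry_header;

/* Interpret the two leading integers (a, b) of a log entry. */
entry_header decode_entry_header(uint64_t a, uint64_t b) {
  entry_header h;
  if (a == 0) {
    /* Delete: the second integer is the key length, and there is no value. */
    h.kind = ENTRY_DELETE;
    h.keylen = b;
    h.valuelen = 0;
  } else {
    /* Put: the first integer is keylen + 1, the second is the value length. */
    h.kind = ENTRY_PUT;
    h.keylen = a - 1;
    h.valuelen = b;
  }
  return h;
}

/* First integer a writer would emit for a put with the given key length. */
uint64_t put_marker_for(uint64_t keylen) {
  assert(keylen < UINT64_MAX); /* keylen + 1 must not wrap around to 0 */
  return keylen + 1;
}
```

Offsetting the key length by one is what lets an empty key in a put (marker 1) stay distinguishable from a delete (marker 0).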

339
src/sparkey/logwriter.c Normal file

@@ -0,0 +1,339 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <snappy-c.h>
#include "util.h"
#include "sparkey.h"
#include "logheader.h"
#include "endiantools.h"
#include "buf.h"
#include "sparkey-internal.h"
#define MAGIC_VALUE_LOGWRITER (0x2866211b)
static inline int write_vlq(uint8_t *buf, uint64_t value) {
int count = 1;
while (value >= 1 << 7) {
*buf = (value & 0x7f) | 0x80;
value >>= 7;
count++;
buf++;
}
*buf = value;
return count;
}
static sparkey_returncode assert_writer_open(sparkey_logwriter *log) {
if (log->open_status != MAGIC_VALUE_LOGWRITER) {
return SPARKEY_LOG_CLOSED;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logwriter_create(sparkey_logwriter **log_ref, const char *filename, sparkey_compression_type compression_type, int compression_block_size) {
sparkey_returncode returncode;
int fd = 0;
sparkey_logwriter *l = malloc(sizeof(sparkey_logwriter));
if (l == NULL) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
switch (compression_type) {
case SPARKEY_COMPRESSION_NONE:
compression_block_size = 0;
l->compressed = NULL;
break;
case SPARKEY_COMPRESSION_SNAPPY:
if (compression_block_size < 10) {
TRY(SPARKEY_INVALID_COMPRESSION_BLOCK_SIZE, error);
}
l->max_compressed_size = snappy_max_compressed_length(compression_block_size);
l->compressed = malloc(l->max_compressed_size);
if (l->compressed == NULL) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
break;
default:
TRY(SPARKEY_INVALID_COMPRESSION_TYPE, error);
}
// Try removing it first, to avoid overwriting existing files that readers may be using.
if (remove(filename) < 0) {
int e = errno;
if (e != ENOENT) {
TRY(sparkey_remove_returncode(e), error);
}
}
fd = open(filename, O_WRONLY | O_TRUNC | O_CREAT, 00644);
if (fd == -1) {
TRY(sparkey_create_returncode(errno), error);
}
l->fd = fd;
l->header.compression_block_size = compression_block_size;
l->header.compression_type = compression_type;
TRY(rand32(&(l->header.file_identifier)), error);
l->header.data_end = LOG_HEADER_SIZE;
l->header.major_version = LOG_MAJOR_VERSION;
l->header.minor_version = LOG_MINOR_VERSION;
l->header.put_size = 0;
l->header.delete_size = 0;
l->header.num_puts = 0;
l->header.num_deletes = 0;
l->header.max_entries_per_block = 0;
l->header.max_key_len = 0;
l->header.max_value_len = 0;
TRY(write_logheader(fd, &l->header), error);
off_t pos = lseek(fd, 0, SEEK_CUR);
if (pos != LOG_HEADER_SIZE) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
TRY(buf_init(&l->file_buf, 1024*1024), error);
TRY(buf_init(&l->block_buf, compression_block_size), error);
l->entry_count = 0;
l->open_status = MAGIC_VALUE_LOGWRITER;
*log_ref = l;
return SPARKEY_SUCCESS;
error:
free(l);
if (fd > 0) close(fd);
return returncode;
}
sparkey_returncode sparkey_logwriter_append(sparkey_logwriter **log_ref, const char *filename) {
sparkey_returncode returncode;
int fd = 0;
sparkey_logwriter *log = malloc(sizeof(sparkey_logwriter));
if (log == NULL) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
TRY(sparkey_load_logheader(&log->header, filename), error);
if (log->header.major_version != LOG_MAJOR_VERSION) {
TRY(SPARKEY_WRONG_LOG_MAJOR_VERSION, error);
}
if (log->header.minor_version != LOG_MINOR_VERSION) {
TRY(SPARKEY_UNSUPPORTED_LOG_MINOR_VERSION, error);
}
switch (log->header.compression_type) {
case SPARKEY_COMPRESSION_NONE:
log->header.compression_block_size = 0;
log->compressed = NULL;
break;
case SPARKEY_COMPRESSION_SNAPPY:
if (log->header.compression_block_size < 10) {
TRY(SPARKEY_INVALID_COMPRESSION_BLOCK_SIZE, error);
}
log->max_compressed_size = snappy_max_compressed_length(log->header.compression_block_size);
log->compressed = malloc(log->max_compressed_size);
if (log->compressed == NULL) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
break;
default:
TRY(SPARKEY_INVALID_COMPRESSION_TYPE, error);
}
fd = open(filename, O_WRONLY, 00644);
if (fd == -1) {
int e = errno;
TRY(sparkey_create_returncode(e), error);
}
log->fd = fd;
if (lseek(fd, log->header.data_end, SEEK_SET) == (off_t) -1) {
TRY(SPARKEY_INTERNAL_ERROR, error);
}
TRY(buf_init(&log->file_buf, 1024*1024), error);
TRY(buf_init(&log->block_buf, log->header.compression_block_size), error);
log->entry_count = 0;
log->open_status = MAGIC_VALUE_LOGWRITER;
*log_ref = log;
return SPARKEY_SUCCESS;
error:
free(log);
if (fd > 0) close(fd);
return returncode;
}
static sparkey_returncode flush_snappy(sparkey_logwriter *log) {
log->flushed = 1;
if (log->entry_count > (int) log->header.max_entries_per_block) {
log->header.max_entries_per_block = log->entry_count;
}
log->entry_count = 0;
sparkey_buf *block_buf = &log->block_buf;
uint8_t *compressed = log->compressed;
uint32_t max_compressed_size = log->max_compressed_size;
sparkey_buf *file_buf = &log->file_buf;
int fd = log->fd;
size_t compressed_size = max_compressed_size;
snappy_status status = snappy_compress((char *) block_buf->start, buf_used(block_buf), (char *) compressed, &compressed_size);
switch (status) {
case SNAPPY_OK: break;
case SNAPPY_INVALID_INPUT:
case SNAPPY_BUFFER_TOO_SMALL:
default:
return SPARKEY_INTERNAL_ERROR;
}
uint8_t buf1[10];
ptrdiff_t written1 = write_vlq(buf1, compressed_size);
RETHROW(buf_add(file_buf, fd, buf1, written1));
RETHROW(buf_add(file_buf, fd, compressed, compressed_size));
block_buf->cur = block_buf->start;
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logwriter_flush(sparkey_logwriter *log) {
RETHROW(assert_writer_open(log));
if (buf_used(&log->block_buf) > 0) {
RETHROW(flush_snappy(log));
}
if (buf_used(&log->file_buf) > 0) {
RETHROW(buf_flushfile(&log->file_buf, log->fd));
}
off_t pos = lseek(log->fd, 0, SEEK_CUR);
log->header.data_end = pos;
lseek(log->fd, 0, SEEK_SET);
RETHROW(write_logheader(log->fd, &log->header));
lseek(log->fd, pos, SEEK_SET);
/* Can't build fsync support on lenny */
/* fsync(log->fd); */
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logwriter_close(sparkey_logwriter **log) {
sparkey_logwriter *l = *log;
if (l->open_status != MAGIC_VALUE_LOGWRITER) {
return SPARKEY_SUCCESS;
}
RETHROW(sparkey_logwriter_flush(l));
close(l->fd);
buf_close(&l->file_buf);
buf_close(&l->block_buf);
if (l->compressed != NULL) {
free(l->compressed);
}
l->open_status = 0;
free(l);
*log = NULL;
return SPARKEY_SUCCESS;
}
static sparkey_returncode snappy_add(sparkey_logwriter *log, const uint8_t *data, ptrdiff_t len) {
sparkey_buf *block_buf = &log->block_buf;
while (1) {
ptrdiff_t remaining = buf_remaining(block_buf);
if (remaining >= len) {
memcpy(block_buf->cur, data, len);
block_buf->cur += len;
return SPARKEY_SUCCESS;
} else {
memcpy(block_buf->cur, data, remaining);
block_buf->cur += remaining;
data += remaining;
len -= remaining;
RETHROW(flush_snappy(log));
}
}
return SPARKEY_SUCCESS;
}
static sparkey_returncode log_add(sparkey_logwriter *log, uint64_t num1, uint64_t num2, uint64_t len1, const uint8_t *data1, uint64_t len2, const uint8_t *data2, ptrdiff_t *datasize) {
uint8_t buf1[10];
uint8_t buf2[10];
uint64_t written1 = write_vlq(buf1, num1);
uint64_t written2 = write_vlq(buf2, num2);
*datasize = written1 + written2 + len1 + len2;
uint64_t remaining;
switch (log->header.compression_type) {
case SPARKEY_COMPRESSION_NONE:
RETHROW(buf_add(&log->file_buf, log->fd, buf1, written1));
RETHROW(buf_add(&log->file_buf, log->fd, buf2, written2));
RETHROW(buf_add(&log->file_buf, log->fd, data1, len1));
RETHROW(buf_add(&log->file_buf, log->fd, data2, len2));
break;
case SPARKEY_COMPRESSION_SNAPPY:
remaining = buf_remaining(&log->block_buf);
// todo: make it smarter by checking if it's better to flush directly
uint64_t fits_in_one = written1 + written2 + len1 + len2 <= buf_size(&log->block_buf);
uint64_t doesnt_fit_this = written1 + written2 + len1 + len2 > buf_remaining(&log->block_buf);
if ((remaining < written1 + written2) || (fits_in_one && doesnt_fit_this)) {
RETHROW(flush_snappy(log));
}
log->entry_count++;
log->flushed = 0;
RETHROW(snappy_add(log, buf1, written1));
RETHROW(snappy_add(log, buf2, written2));
RETHROW(snappy_add(log, data1, len1));
RETHROW(snappy_add(log, data2, len2));
if (log->flushed && buf_used(&log->block_buf) > 0) {
RETHROW(flush_snappy(log));
}
break;
default:
return SPARKEY_INTERNAL_ERROR;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logwriter_put(sparkey_logwriter *log, uint64_t keylen, const uint8_t *key, uint64_t valuelen, const uint8_t *value) {
RETHROW(assert_writer_open(log));
ptrdiff_t datasize;
RETHROW(log_add(log, keylen + 1, valuelen, keylen, key, valuelen, value, &datasize));
log->header.num_puts++;
log->header.put_size += datasize;
if (keylen > log->header.max_key_len) {
log->header.max_key_len = keylen;
}
if (valuelen > log->header.max_value_len) {
log->header.max_value_len = valuelen;
}
return SPARKEY_SUCCESS;
}
sparkey_returncode sparkey_logwriter_delete(sparkey_logwriter *log, uint64_t keylen, const uint8_t *key) {
RETHROW(assert_writer_open(log));
ptrdiff_t datasize;
RETHROW(log_add(log, 0, keylen, 0, NULL, keylen, key, &datasize));
log->header.num_deletes++;
log->header.delete_size += datasize;
return SPARKEY_SUCCESS;
}
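
`write_vlq` here and `read_vlq` in logreader.c form a matched pair: values are stored seven bits at a time, least significant group first, with the high bit of each byte signalling that more bytes follow. The standalone sketch below shows the round trip. The names `vlq_encode` and `vlq_decode` are hypothetical stand-ins for the static functions in the sources, and the caller is assumed to supply a buffer of at least 10 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value into buf; returns the number of bytes written (1 to 10). */
int vlq_encode(uint8_t *buf, uint64_t value) {
  assert(buf != NULL);
  int count = 1;
  while (value >= 0x80) {
    /* Emit the low 7 bits and set the continuation bit. */
    *buf++ = (uint8_t) ((value & 0x7f) | 0x80);
    value >>= 7;
    count++;
  }
  *buf = (uint8_t) value; /* final byte has the continuation bit clear */
  return count;
}

/* Decode one value starting at buf[*position] and advance *position past it. */
uint64_t vlq_decode(const uint8_t *buf, uint64_t *position) {
  assert(buf != NULL && position != NULL);
  uint64_t result = 0;
  unsigned shift = 0;
  for (;;) {
    uint64_t byte = buf[(*position)++];
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      return result;
    }
    shift += 7;
  }
}
```

For example, 300 encodes to the two bytes `0xac 0x02`, and any value below 128 takes a single byte. Like the original `read_vlq`, this decoder does no bounds checking, so it should only be run on trusted or length-validated input.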

60
src/sparkey/returncodes.c Normal file

@@ -0,0 +1,60 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include "sparkey.h"
const char * sparkey_errstring(sparkey_returncode code) {
switch (code) {
case SPARKEY_SUCCESS: return "Success";
case SPARKEY_INTERNAL_ERROR: return "Internal error";
case SPARKEY_FILE_NOT_FOUND: return "File not found";
case SPARKEY_PERMISSION_DENIED: return "Permission denied";
case SPARKEY_TOO_MANY_OPEN_FILES: return "Too many open files";
case SPARKEY_FILE_TOO_LARGE: return "File is too large";
case SPARKEY_FILE_ALREADY_EXISTS: return "File already exists";
case SPARKEY_FILE_BUSY: return "File is busy";
case SPARKEY_FILE_IS_DIRECTORY: return "File is a directory";
case SPARKEY_FILE_SIZE_EXCEEDED: return "Maximum file size exceeded";
case SPARKEY_FILE_CLOSED: return "File is closed";
case SPARKEY_OUT_OF_DISK: return "Out of free disk space";
case SPARKEY_UNEXPECTED_EOF: return "Encountered unexpected end of file";
case SPARKEY_MMAP_FAILED: return "mmap failed - running on 32 bit system?";
case SPARKEY_WRONG_LOG_MAGIC_NUMBER: return "Wrong magic number of log file";
case SPARKEY_WRONG_LOG_MAJOR_VERSION: return "Wrong major version of log file";
case SPARKEY_UNSUPPORTED_LOG_MINOR_VERSION: return "Unsupported minor version of log file";
case SPARKEY_LOG_TOO_SMALL: return "Corrupt log file - smaller than the header indicates";
case SPARKEY_LOG_CLOSED: return "Log file is closed";
case SPARKEY_LOG_ITERATOR_INACTIVE: return "Log iterator is inactive";
case SPARKEY_LOG_ITERATOR_MISMATCH: return "The iterator is not associated with the log";
case SPARKEY_LOG_ITERATOR_CLOSED: return "Log iterator is closed";
case SPARKEY_LOG_HEADER_CORRUPT: return "Log header is corrupt";
case SPARKEY_INVALID_COMPRESSION_BLOCK_SIZE: return "Invalid compression block size";
case SPARKEY_INVALID_COMPRESSION_TYPE: return "Invalid compression type";
case SPARKEY_WRONG_HASH_MAGIC_NUMBER: return "Wrong magic number of hash file";
case SPARKEY_WRONG_HASH_MAJOR_VERSION: return "Wrong major version of hash file";
case SPARKEY_UNSUPPORTED_HASH_MINOR_VERSION: return "Unsupported minor version of hash file";
case SPARKEY_HASH_TOO_SMALL: return "Corrupt hash file - smaller than the header indicates";
case SPARKEY_HASH_CLOSED: return "Hash file is closed";
case SPARKEY_FILE_IDENTIFIER_MISMATCH: return "File identifier differs between hash file and log file";
case SPARKEY_HASH_HEADER_CORRUPT: return "Hash header is corrupt";
case SPARKEY_HASH_SIZE_INVALID: return "Hash size is invalid";
default: return "Unknown error";
}
}
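
The switch above pairs naturally with the early-return style visible in sparkey_logwriter_delete, where each call is wrapped in RETHROW. The definition of RETHROW is not shown in this part of the diff, so the following is only a minimal sketch of that pattern under assumptions: demo_returncode, demo_errstring, DEMO_RETHROW, demo_open and demo_open_then_count are hypothetical names invented for illustration and are not part of Sparkey.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for sparkey_returncode; only three values are mirrored. */
typedef enum {
  DEMO_SUCCESS = 0,
  DEMO_INTERNAL_ERROR = -1,
  DEMO_FILE_NOT_FOUND = -100
} demo_returncode;

/* Same shape as sparkey_errstring: one case per code plus a catch-all default. */
static const char *demo_errstring(demo_returncode code) {
  switch (code) {
  case DEMO_SUCCESS: return "Success";
  case DEMO_INTERNAL_ERROR: return "Internal error";
  case DEMO_FILE_NOT_FOUND: return "File not found";
  default: return "Unknown error";
  }
}

/* Assumed early-return macro: propagate any non-success code to the caller. */
#define DEMO_RETHROW(expr) \
  do { \
    demo_returncode rc_ = (expr); \
    if (rc_ != DEMO_SUCCESS) return rc_; \
  } while (0)

/* Pretend "open" that fails when the file does not exist. */
static demo_returncode demo_open(int exists) {
  return exists ? DEMO_SUCCESS : DEMO_FILE_NOT_FOUND;
}

/* The counter is only incremented if the open step succeeded. */
static demo_returncode demo_open_then_count(int exists, int *steps) {
  DEMO_RETHROW(demo_open(exists));
  (*steps)++;
  return DEMO_SUCCESS;
}
```

A caller would typically compare the result against the success value and pass any other code to the errstring function when logging.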


@@ -0,0 +1,90 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_INTERNAL_H
#define SPARKEY_INTERNAL_H
#include <stdint.h>
#include "sparkey.h"
#include "logheader.h"
#include "hashheader.h"
#include "buf.h"
struct sparkey_logreader {
uint32_t open_status;
sparkey_logheader header;
int fd;
uint64_t data_len;
uint8_t *data;
};
struct sparkey_logiter {
uint32_t open_status;
uint32_t file_identifier;
// position in reader
uint64_t block_position;
uint64_t next_block_position;
uint64_t block_offset;
uint64_t block_len;
int entry_count;
// compression buffer
int compression_buf_allocated;
uint8_t *compression_buf;
// current entry
uint64_t entry_block_position;
uint64_t entry_block_offset;
sparkey_entry_type type;
sparkey_iter_state state;
uint64_t keylen;
uint64_t valuelen;
uint64_t key_remaining;
uint64_t value_remaining;
};
struct sparkey_logwriter {
uint32_t open_status;
sparkey_logheader header;
int fd;
sparkey_buf block_buf;
uint32_t max_compressed_size;
uint8_t *compressed;
sparkey_buf file_buf;
int flushed;
int entry_count;
};
struct sparkey_hashreader {
uint32_t open_status;
sparkey_hashheader header;
sparkey_logreader log;
int fd;
uint64_t data_len;
uint8_t *data;
};
sparkey_returncode sparkey_logreader_open_noalloc(sparkey_logreader *log, const char *filename);
void sparkey_logreader_close_nodealloc(sparkey_logreader *log);
#endif

673
src/sparkey/sparkey.h Normal file

@@ -0,0 +1,673 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPOTIFY_SPARKEY_H_INCLUDED
#define SPOTIFY_SPARKEY_H_INCLUDED
/**
* \mainpage Sparkey C API
* \section intro_sec Getting started
*
* For a complete listing of available functions, see sparkey.h .
*
* \section logwriter Writing to a log file
*
* This section contains all functions relevant for writing entries to a log.
* Writing to the same log file is not thread safe. Only use the writer objects
* from one thread at a time, and make sure to only write to a file from one process.
* The library will not do any form of locking or checking for other writers, so be
* careful.
*
* Basic workflow:
* - Create and initialize the logwriter:
* \code
* sparkey_logwriter *mywriter;
* sparkey_returncode returncode = sparkey_logwriter_create(&mywriter, "mylog.spl", SPARKEY_COMPRESSION_NONE, 0);
* // TODO: check the returncode
* \endcode
* - Write to the log:
* \code
* const char *mykey = "mykey";
* const char *myvalue = "this is my value";
* sparkey_returncode returncode = sparkey_logwriter_put(mywriter, strlen(mykey), (uint8_t*)mykey, strlen(myvalue), (uint8_t*)myvalue);
* // TODO: check the returncode
* \endcode
* - Close it when you're done:
* \code
* sparkey_returncode returncode = sparkey_logwriter_close(&mywriter);
* // TODO: check the returncode
* \endcode
*
* \section logreader Reading from a log file
*
* This section contains all functions relevant for reading entries from a log.
* A sparkey_logreader may be shared between multiple threads, but \ref sparkey_logreader_open
* and \ref sparkey_logreader_close are not thread safe.
*
* The logreader is not useful by itself. You also need a sparkey_logiter to iterate through the entries.
* This is a highly mutable struct and should not be shared between threads. It is not threadsafe.
*
* Here is a basic workflow for iterating through all entries in a logfile:
* - Create a logreader
* \code
* sparkey_logreader *myreader;
* sparkey_returncode returncode = sparkey_logreader_open(&myreader, "mylog.spl");
* \endcode
* - Create a logiter
* \code
* sparkey_logiter *myiter;
* sparkey_returncode returncode = sparkey_logiter_create(&myiter, myreader);
* \endcode
* - Perform the iteration:
* \code
* while (1) {
* sparkey_returncode returncode = sparkey_logiter_next(myiter, myreader);
* // TODO: check the returncode
* if (sparkey_logiter_state(myiter) != SPARKEY_ITER_ACTIVE) {
* break;
* }
* uint64_t wanted_keylen = sparkey_logiter_keylen(myiter);
* uint8_t *keybuf = malloc(wanted_keylen);
* uint64_t actual_keylen;
* returncode = sparkey_logiter_fill_key(myiter, myreader, wanted_keylen, keybuf, &actual_keylen);
* // TODO: check the returncode
* // TODO: assert actual_keylen == wanted_keylen
* uint64_t wanted_valuelen = sparkey_logiter_valuelen(myiter);
* uint8_t *valuebuf = malloc(wanted_valuelen);
* uint64_t actual_valuelen;
* returncode = sparkey_logiter_fill_value(myiter, myreader, wanted_valuelen, valuebuf, &actual_valuelen);
* // TODO: check the returncode
* // TODO: assert actual_valuelen == wanted_valuelen
* // Do stuff with key and value
* free(keybuf);
* free(valuebuf);
* }
* \endcode
* Note that you have to allocate memory for the key and value manually - Sparkey does not allocate memory except for when
* creating readers, writers and iterators.
*
* - Alternatively, you can preallocate the buffers by using
* max_key_len and max_value_len provided by the log header:
* \code
* uint8_t *keybuf = malloc(sparkey_logreader_maxkeylen(myreader));
* uint8_t *valuebuf = malloc(sparkey_logreader_maxvaluelen(myreader));
* while (1) {
* sparkey_returncode returncode = sparkey_logiter_next(myiter, myreader);
* // TODO: check the returncode
* if (sparkey_logiter_state(myiter) != SPARKEY_ITER_ACTIVE) {
* break;
* }
* uint64_t wanted_keylen = sparkey_logiter_keylen(myiter);
* uint64_t actual_keylen;
* returncode = sparkey_logiter_fill_key(myiter, myreader, wanted_keylen, keybuf, &actual_keylen);
* // TODO: check the returncode
* // TODO: assert actual_keylen == wanted_keylen
* uint64_t wanted_valuelen = sparkey_logiter_valuelen(myiter);
* uint64_t actual_valuelen;
* returncode = sparkey_logiter_fill_value(myiter, myreader, wanted_valuelen, valuebuf, &actual_valuelen);
* // TODO: check the returncode
* // TODO: assert actual_valuelen == wanted_valuelen
* // Do stuff with key and value
* }
* free(keybuf);
* free(valuebuf);
* \endcode
* - You can also skip allocating at all, if you can process the key and/or value in chunks. Here's an example for processing a key in chunks,
* but the same can be applied for values:
* \code
* uint64_t total_len = sparkey_logiter_keylen(myiter);
* while (total_len > 0) {
* uint8_t *buf;
* uint64_t len;
* sparkey_returncode returncode = sparkey_logiter_keychunk(myiter, myreader, total_len, &buf, &len);
* // TODO: check the returncode
* // Example: use the chunks to write to standard out
* fwrite(buf, 1, len, stdout);
* total_len -= len;
* }
* \endcode
* - Close everything when you're done:
* \code
* sparkey_logreader_close(&myreader);
* sparkey_logiter_close(&myiter);
* \endcode
*
* \section hashwriter Creating hash files from log files
*
* This section covers only the function sparkey_hash_write, which creates a hash file.
*
* This is all you need to do to create a hash file based on an existing log file:
* \code
* sparkey_returncode returncode = sparkey_hash_write("mylog.spi", "mylog.spl", 0);
* // TODO: check the returncode
* \endcode
*
* \section hashreader Reading from a hash-file/log-file pair
*
* This section contains all functions relevant for reading live key/value pairs from a log and hash file.
* The documentation is very similar to the one for reading from a log file, because this API is an extension of it.
* Random lookup is the only added feature, and iteration simply skips dead entries.
*
* A sparkey_hashreader may be shared between multiple threads, but \ref sparkey_hash_open
* and \ref sparkey_hash_close are not thread safe.
*
* The hashreader is not useful by itself. You also need a sparkey_logiter to do random lookups and
* iterate through the entries.
* This is a highly mutable struct and should not be shared between threads. It is not threadsafe.
*
* Here is a basic workflow for iterating through all live entries in a log and hash file:
* - Create a hashreader
* \code
* sparkey_hashreader *myreader;
* sparkey_returncode returncode = sparkey_hash_open(&myreader, "mylog.spi", "mylog.spl");
* // TODO: check the returncode
* \endcode
* - Create a logiter
* \code
* sparkey_logiter *myiter;
* sparkey_returncode returncode = sparkey_logiter_create(&myiter, sparkey_hash_getreader(myreader));
* // TODO: check the returncode
* \endcode
* - Iteration is exactly as described previously, but uses \ref sparkey_logiter_hashnext instead of
* \ref sparkey_logiter_next.
* - Random lookup
* \code
* sparkey_returncode returncode = sparkey_hash_get(myreader, (uint8_t*)"mykey", 5, myiter);
* if (sparkey_logiter_state(myiter) != SPARKEY_ITER_ACTIVE) {
* // Entry not found;
* } else {
* // Extracting value is done the same as when iterating.
* uint64_t wanted_valuelen = sparkey_logiter_valuelen(myiter);
* uint8_t *valuebuf = malloc(wanted_valuelen);
* uint64_t actual_valuelen;
* returncode = sparkey_logiter_fill_value(myiter, sparkey_hash_getreader(myreader), wanted_valuelen, valuebuf, &actual_valuelen);
* }
* \endcode
* Note that this API allows you to do a random seek and then iterate through the following entries. This may be
* useful when you insert groups of entries in order and quickly want to access all of them.
* - Close everything when you're done:
* \code
* sparkey_hash_close(&myreader);
* sparkey_logiter_close(&myiter);
* \endcode
*/
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif
typedef enum {
SPARKEY_SUCCESS = 0,
SPARKEY_INTERNAL_ERROR = -1,
SPARKEY_FILE_NOT_FOUND = -100,
SPARKEY_PERMISSION_DENIED = -101,
SPARKEY_TOO_MANY_OPEN_FILES = -102,
SPARKEY_FILE_TOO_LARGE = -103,
SPARKEY_FILE_ALREADY_EXISTS = -104,
SPARKEY_FILE_BUSY = -105,
SPARKEY_FILE_IS_DIRECTORY = -106,
SPARKEY_FILE_SIZE_EXCEEDED = -107,
SPARKEY_FILE_CLOSED = -108,
SPARKEY_OUT_OF_DISK = -109,
SPARKEY_UNEXPECTED_EOF = -110,
SPARKEY_MMAP_FAILED = -111,
SPARKEY_WRONG_LOG_MAGIC_NUMBER = -200,
SPARKEY_WRONG_LOG_MAJOR_VERSION = -201,
SPARKEY_UNSUPPORTED_LOG_MINOR_VERSION = -202,
SPARKEY_LOG_TOO_SMALL = -203,
SPARKEY_LOG_CLOSED = -204,
SPARKEY_LOG_ITERATOR_INACTIVE = -205,
SPARKEY_LOG_ITERATOR_MISMATCH = -206,
SPARKEY_LOG_ITERATOR_CLOSED = -207,
SPARKEY_LOG_HEADER_CORRUPT = -208,
SPARKEY_INVALID_COMPRESSION_BLOCK_SIZE = -209,
SPARKEY_INVALID_COMPRESSION_TYPE = -210,
SPARKEY_WRONG_HASH_MAGIC_NUMBER = -300,
SPARKEY_WRONG_HASH_MAJOR_VERSION = -301,
SPARKEY_UNSUPPORTED_HASH_MINOR_VERSION = -302,
SPARKEY_HASH_TOO_SMALL = -303,
SPARKEY_HASH_CLOSED = -304,
SPARKEY_FILE_IDENTIFIER_MISMATCH = -305,
SPARKEY_HASH_HEADER_CORRUPT = -306,
SPARKEY_HASH_SIZE_INVALID = -307,
} sparkey_returncode;
/**
* Get a human readable string from a return code.
* @param code a return code
* @returns a string representing the return code.
*/
const char * sparkey_errstring(sparkey_returncode code);
/* logwriter */
/**
* A structure holding all the data necessary to add entries to a log file.
*/
struct sparkey_logwriter;
typedef struct sparkey_logwriter sparkey_logwriter;
typedef enum {
SPARKEY_COMPRESSION_NONE,
SPARKEY_COMPRESSION_SNAPPY
} sparkey_compression_type;
typedef enum {
SPARKEY_ENTRY_PUT,
SPARKEY_ENTRY_DELETE
} sparkey_entry_type;
typedef enum {
SPARKEY_ITER_NEW,
SPARKEY_ITER_ACTIVE,
SPARKEY_ITER_CLOSED,
SPARKEY_ITER_INVALID
} sparkey_iter_state;
struct sparkey_logreader;
typedef struct sparkey_logreader sparkey_logreader;
struct sparkey_logiter;
typedef struct sparkey_logiter sparkey_logiter;
struct sparkey_hashreader;
typedef struct sparkey_hashreader sparkey_hashreader;
/**
* Creates a new Sparkey log file, possibly overwriting an already existing file.
* @param log a double reference to a sparkey_logwriter structure that gets allocated and initialized by this call.
* @param filename the file to create.
* @param compression_type NONE or SNAPPY, specifies if block compression should be used or not.
* @param compression_block_size is only relevant if compression type is not NONE.
* It represents the maximum number of bytes of an uncompressed block.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_create(sparkey_logwriter **log, const char *filename, sparkey_compression_type compression_type, int compression_block_size);
/**
* Append to an existing Sparkey log file.
* @param log a double reference to a sparkey_logwriter structure that gets allocated and initialized by this call.
* @param filename the file to open for appending.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_append(sparkey_logwriter **log, const char *filename);
/**
* Append a key/value pair to the log file
* @param log a reference to an open log writer.
* @param keylen the number of bytes of the key data block
* @param key a pointer to a block of contiguous data where the key can be found.
* Does not need to be NUL-terminated.
* @param valuelen the number of bytes of the value data block
* @param value a pointer to a block of contiguous data where the value can be found.
* Does not need to be NUL-terminated.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_put(sparkey_logwriter *log, uint64_t keylen, const uint8_t *key, uint64_t valuelen, const uint8_t *value);
/**
* Append a delete operation for a key to the log file
* @param log a reference to an open log writer.
* @param keylen the number of bytes of the key data block
* @param key a pointer to a block of contiguous data where the key can be found.
* Does not need to be NUL-terminated.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_delete(sparkey_logwriter *log, uint64_t keylen, const uint8_t *key);
/**
* Flush any open compression block to file buffer.
* Flush any open file buffer to disk.
* Rewrite the header on disk.
* This enables readers to read from the log.
* @param log a reference to an open log writer.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_flush(sparkey_logwriter *log);
/**
* Flushes the log, then closes the file and marks the log as closed.
* The log will be closed after this, the sparkey_logwriter struct
* referenced will be freed and *log will be set to NULL.
* @param log a double reference to an open log writer.
* @return SPARKEY_SUCCESS if all goes well.
*/
sparkey_returncode sparkey_logwriter_close(sparkey_logwriter **log);
/* logreader */
/**
* Opens a log file for reading. The logreader is threadsafe, except during opening or closing.
* @param log a double reference to a logreader.
* @param filename a filename of a file containing a sparkey log.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a return code indicating the error.
*/
sparkey_returncode sparkey_logreader_open(sparkey_logreader **log, const char *filename);
/**
* Closes a logreader.
* It's allowed to close a logreader while there are open logiterators.
* Further operations on such logiterators will fail.
* This is a failsafe operation.
* @param log a double reference to a logreader
* This will be set to NULL after close.
*/
void sparkey_logreader_close(sparkey_logreader **log);
/**
* Get the size of the largest key in the log.
* @param log a reference to a logreader.
* @returns the size in bytes of the largest key in the log.
*/
uint64_t sparkey_logreader_maxkeylen(sparkey_logreader *log);
/**
* Get the size of the largest value in the log.
* @param log a reference to a logreader.
* @returns the size in bytes of the largest value in the log.
*/
uint64_t sparkey_logreader_maxvaluelen(sparkey_logreader *log);
/**
* Get the blocksize for a reader
* @param log a reference to a logreader.
* @returns the blocksize
*/
int sparkey_logreader_get_compression_blocksize(sparkey_logreader *log);
/**
* Get the compression type for a reader
* @param log a reference to a logreader.
* @returns the compression type
*/
sparkey_compression_type sparkey_logreader_get_compression_type(sparkey_logreader *log);
/**
* Initializes a logiter and associates it with a logreader.
* The logreader must be open. The logiter is not threadsafe.
* @param iter a double reference to an uninitialized logiter. Will be set on success.
* @param log an open logreader
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_create(sparkey_logiter **iter, sparkey_logreader *log);
/**
* Closes a log iterator.
* This is a failsafe operation.
* @param iter a double reference to a log iterator.
* This will be set to NULL after close.
*/
void sparkey_logiter_close(sparkey_logiter **iter);
/**
* Skips to a specific block in the logfile.
* The position must be a valid block start, but that will not be verified by the function.
* If an illegal position is used, all other operations on this logiterator are undefined,
* and may even segfault.
* @param iter an open log iterator.
* @param log an open logreader associated with iter.
* @param position an offset into the logfile where a block begins.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_seek(sparkey_logiter *iter, sparkey_logreader *log, uint64_t position);
/**
* Skip a number of entries.
* This is equivalent to calling sparkey_logiter_next count number of times.
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @param count the number of entries to skip.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_skip(sparkey_logiter *iter, sparkey_logreader *log, int count);
/**
* Prepares the logiter to start reading from the next entry.
* iter->state will be SPARKEY_ITER_CLOSED if the last entry has been passed.
* iter->state will be SPARKEY_ITER_INVALID if anything goes wrong.
* iter->state will be SPARKEY_ITER_ACTIVE if it successfully reached the next entry.
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_next(sparkey_logiter *iter, sparkey_logreader *log);
/**
* Resets the iterator to the start of the current entry. This is only valid if
* iter->state is SPARKEY_ITER_ACTIVE.
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_reset(sparkey_logiter *iter, sparkey_logreader *log);
/**
* Consumes and returns part of or all of the key of the current entry.
* Usage example:
* uint8_t *res;
* uint64_t len;
* sparkey_returncode code = sparkey_logiter_keychunk(iter, log, 1 << 30, &res, &len);
*
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @param maxlen a limit for how much data you want to handle.
* @param res (output parameter) reference to a read only array of data. The array is of size *len, and is not NUL-terminated.
* You cannot use this as a string, and you may not modify it. The data in the array is valid until the next operation on the
* logiter or until the log is closed.
* @param len (output parameter) reference to a variable holding the size of res.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_keychunk(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t ** res, uint64_t *len);
/**
* First consumes and discards any remaining key parts.
* Then consumes and returns part of or all of the value of the current entry.
* Usage example:
* uint8_t *res;
* uint64_t len;
* sparkey_returncode code = sparkey_logiter_valuechunk(iter, log, 1 << 30, &res, &len);
*
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @param maxlen a limit for how much data you want to handle.
* @param res (output parameter) reference to a read only array of data. The array is of size *len, and is not NUL-terminated.
* You cannot use this as a string, and you may not modify it. The data in the array is valid until the next operation on the
* logiter or until the log is closed.
* @param len (output parameter) reference to a variable holding the size of res.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_valuechunk(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t ** res, uint64_t *len);
/**
* Convenience function around sparkey_logiter_keychunk.
* Takes a user allocated buffer and fills it as much as possible by consuming parts of the key of the current entry.
* No NUL will be appended after the data, so you may not use it as a string unless you add the NUL manually.
* Usage example:
* uint8_t *buf = malloc(sparkey_logiter_keylen(iter));
* uint64_t len;
* sparkey_returncode code = sparkey_logiter_fill_key(iter, log, sparkey_logiter_keylen(iter), buf, &len);
*
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @param maxlen a limit for how much data you want to handle.
* @param buf a writable array of data. The array must at least be of size maxlen.
* @param len (output parameter) reference to a variable holding the amount of data written to buf.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_fill_key(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t *buf, uint64_t *len);
/**
* Convenience function around sparkey_logiter_valuechunk.
* Takes a user allocated buffer and fills it as much as possible by consuming parts of the value of the current entry.
* No NUL will be appended after the data, so you may not use it as a string unless you add the NUL manually.
* Usage example:
* uint8_t *buf = malloc(sparkey_logiter_valuelen(iter));
* uint64_t len;
* sparkey_returncode code = sparkey_logiter_fill_value(iter, log, sparkey_logiter_valuelen(iter), buf, &len);
*
* @param iter an open logiter
* @param log an open logreader associated with iter.
* @param maxlen a limit for how much data you want to handle.
* @param buf a writable array of data. The array must at least be of size maxlen.
* @param len (output parameter) reference to a variable holding the amount of data written to buf.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_fill_value(sparkey_logiter *iter, sparkey_logreader *log, uint64_t maxlen, uint8_t *buf, uint64_t *len);
/**
* Compares the keys of two iterators pointing to the same log.
* It assumes that the iterators are both clean, i.e. nothing has been consumed from the current entry.
*
* @param iter1 an open logiter
* @param iter2 an open logiter
* @param log an open logreader associated with iter1 and iter2.
* @param res (output parameter) reference to a variable holding the result of the comparison.
* It will be zero if the keys are equal, negative if key1 is smaller than key2 and positive if key1 is larger than key2.
* The behaviour is thus like memcmp.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_keycmp(sparkey_logiter *iter1, sparkey_logiter *iter2, sparkey_logreader *log, int *res);
/**
* Get the state for an iterator.
* @returns iter->state
*/
sparkey_iter_state sparkey_logiter_state(sparkey_logiter *iter);
/**
* Get the type of an iterator.
* @returns iter->type
*/
sparkey_entry_type sparkey_logiter_type(sparkey_logiter *iter);
/**
* Get the keylen of an iterator.
* @returns iter->keylen
*/
uint64_t sparkey_logiter_keylen(sparkey_logiter *iter);
/**
* Get the valuelen of an iterator.
* @returns iter->valuelen
*/
uint64_t sparkey_logiter_valuelen(sparkey_logiter *iter);
/* hashwriter */
/**
* Creates a hash table for a specific log file.
* It's safe and efficient to run this multiple times.
* If the hash file already exists, it will be used to speed up the creation of the new file
* by reusing the existing entries, and only update the new hash table based on
* the entries in the log that are new since the last hash was built.
* Note that the hash file is never overwritten, instead the old file is unlinked from
* the filesystem and the new one is created. Thus, it's safe to rewrite the hash table while
* other processes are reading from it.
* @param hash_filename the file to create and put the sparkey hash table in.
* @param log_filename a file that must exist and be a sparkey log file.
* @param hash_size size of the hashes for keys.
* Valid values are 4 (32 bit murmurhash3_x86_32) and
* 8 (lower 64-bit part of murmurhash3_x64_128).
* A value of zero will make it autoselect hash size, depending on number of entries.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_hash_write(const char *hash_filename, const char *log_filename, int hash_size);
/* hashreader */
/**
* Opens a hash file and a log file for reading. The hashreader is threadsafe, except during opening or closing.
* @param reader a double reference to an uninitialized hashreader. Will be set on success.
* @param hash_filename a filename of a file containing a sparkey hash table.
* @param log_filename a filename of a file containing a sparkey log.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a return code indicating the error.
*/
sparkey_returncode sparkey_hash_open(sparkey_hashreader **reader, const char *hash_filename, const char *log_filename);
/**
* Gets the logreader that is referenced by the hashreader
* @param reader an open reader.
* @returns the associated logreader
*/
sparkey_logreader * sparkey_hash_getreader(sparkey_hashreader *reader);
/**
* Closes a hashreader.
* It's allowed to close a hashreader while there are open logiterators associated with it.
* Further operations on such logiterators will fail.
* This is a failsafe operation.
* @param reader a double reference to a hashreader
*/
void sparkey_hash_close(sparkey_hashreader **reader);
/**
* Performs a hash table lookup of a key. If the key is found,
* the iterator will have state SPARKEY_ITER_ACTIVE and the key chunk will be consumed.
* Otherwise, the iterator will have state SPARKEY_ITER_INVALID.
* @param reader an open reader.
* @param key a buffer containing the key. It does not have to be NUL-terminated.
* @param keylen the length of the key.
* @param iter an iterator associated with the reader. Will be mutated.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a return code indicating the error.
*/
sparkey_returncode sparkey_hash_get(sparkey_hashreader *reader, const uint8_t *key, uint64_t keylen, sparkey_logiter *iter);
/**
* Works the same as sparkey_logiter_next, except it skips entries that are not of type SPARKEY_ENTRY_PUT
* and entries that have been overwritten or deleted. Thus it only stops at live entries.
* iter->state will be SPARKEY_ITER_CLOSED if the last entry has been passed.
* iter->state will be SPARKEY_ITER_INVALID if anything goes wrong.
* iter->state will be SPARKEY_ITER_ACTIVE if it successfully reached the next entry.
* @see sparkey_logiter_next
* @param iter an open logiter
* @param reader an open reader associated with iter.
* @returns SPARKEY_SUCCESS if all goes well. Otherwise a returncode indicating the error.
*/
sparkey_returncode sparkey_logiter_hashnext(sparkey_logiter *iter, sparkey_hashreader *reader);
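/**
* Get the number of entries in the hash file.
* @param reader an open reader.
*/
uint64_t sparkey_hash_numentries(sparkey_hashreader *reader);
/**
* Get the number of hash collisions recorded for the hash file.
* @param reader an open reader.
*/
uint64_t sparkey_hash_numcollisions(sparkey_hashreader *reader);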
/* util */
/**
* Allocates and creates a string denoting a log file from an index file.
* This is simply a string replacement of .spi$ to .spl$
* @param index_filename the filename representing the index file
* @returns NULL if the index_filename does not end with ".spi"
*/
char * sparkey_create_log_filename(const char *index_filename);
/**
* Allocates and creates a string denoting an index file from a log file.
* This is simply a string replacement of .spl$ to .spi$
* @param log_filename the filename representing the log file
* @returns NULL if the log_filename does not end with ".spl"
*/
char * sparkey_create_index_filename(const char *log_filename);
#ifdef __cplusplus
}
#endif
#endif
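
The header notes that sparkey_logiter_fill_key and sparkey_logiter_fill_value do not append a NUL after the data, so a filled buffer cannot be used as a C string directly. One way to handle this might look like the sketch below. The helper bytes_to_cstring is a hypothetical name and not part of Sparkey; it assumes the caller already holds the raw bytes and their length, and that the data contains no embedded NUL bytes (otherwise the resulting string would appear truncated).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Sparkey: copies len raw bytes into a
 * freshly allocated buffer and appends a terminating NUL.
 * Returns NULL on allocation failure or if len + 1 would overflow size_t.
 * The caller owns the result and must free() it.
 */
static char *bytes_to_cstring(const uint8_t *data, uint64_t len) {
  if (len >= SIZE_MAX) return NULL; /* len + 1 would not fit in size_t */
  char *out = malloc((size_t)len + 1);
  if (out == NULL) return NULL;
  if (len > 0) memcpy(out, data, (size_t)len);
  out[len] = '\0';
  return out;
}
```

In the iteration loops shown in the header, the buffer and length reported by a fill call could be handed to such a helper before printing or comparing with strcmp.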

92
src/sparkey/util.c Normal file

@@ -0,0 +1,92 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#include <stdio.h>
#include <string.h>
#include "util.h"
#include "sparkey.h"
#include <stdlib.h>
#include <errno.h>
sparkey_returncode sparkey_open_returncode(int e) {
switch (e) {
case EPERM:
case EACCES: return SPARKEY_PERMISSION_DENIED;
case ENFILE: return SPARKEY_TOO_MANY_OPEN_FILES;
case ENOENT: return SPARKEY_FILE_NOT_FOUND;
case EOVERFLOW: return SPARKEY_FILE_TOO_LARGE;
default:
fprintf(stderr, "_sparkey_open_returncode():%d error: errno = %d\n", __LINE__, e);
return SPARKEY_INTERNAL_ERROR;
}
}
sparkey_returncode sparkey_create_returncode(int e) {
switch (e) {
case EPERM:
case EROFS:
case EACCES: return SPARKEY_PERMISSION_DENIED;
case EEXIST: return SPARKEY_FILE_ALREADY_EXISTS;
case EISDIR: return SPARKEY_FILE_IS_DIRECTORY;
case ENFILE:
case EMFILE: return SPARKEY_TOO_MANY_OPEN_FILES;
default:
fprintf(stderr, "sparkey_create_returncode():%d error: errno = %d\n", __LINE__, e);
return SPARKEY_INTERNAL_ERROR;
}
}
sparkey_returncode sparkey_remove_returncode(int e) {
switch (e) {
case EPERM:
case EROFS:
case EACCES: return SPARKEY_PERMISSION_DENIED;
case EBUSY: return SPARKEY_FILE_BUSY; // Rare on Linux, e.g. when the path is a mount point
case EISDIR: return SPARKEY_FILE_IS_DIRECTORY;
case EOVERFLOW: return SPARKEY_FILE_TOO_LARGE;
default:
fprintf(stderr, "sparkey_remove_returncode():%d error: errno = %d\n", __LINE__, e);
return SPARKEY_INTERNAL_ERROR;
}
}
static inline char * _create_filename(const char *input, const char *from, char to) {
if (input == NULL) return NULL;
size_t l = strlen(input);
size_t from_len = strlen(from);
// Paranoia - avoid ridiculously long filenames.
if (l > 10000) return NULL;
// Too short to contain the suffix in from
if (l < from_len) return NULL;
// input must end with from
if (memcmp(&input[l - from_len], from, from_len)) return NULL;
char *output = strdup(input);
if (output == NULL) return NULL;
// The two suffixes differ only in their last character, so swap just that one.
output[l - 1] = to;
return output;
}
char * sparkey_create_log_filename(const char *index_filename) {
return _create_filename(index_filename, ".spi", 'l');
}
char * sparkey_create_index_filename(const char *log_filename) {
return _create_filename(log_filename, ".spl", 'i');
}
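
Note on the two filename helpers above: both check that the input ends with the expected suffix and then overwrite only the final character of a heap copy. A minimal standalone sketch of that idea follows. The names `swap_last_suffix_char` and `maps_to` and the sample file names are hypothetical and not part of sparkey, and the copy is made with `malloc`/`memcpy` rather than `strdup` so the sketch also builds in strict C99 mode.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the helper above: returns a heap copy of `input`
 * with its last character replaced by `to`, provided `input` ends with
 * `suffix`; returns NULL otherwise. The caller frees the result. */
char *swap_last_suffix_char(const char *input, const char *suffix, char to) {
  /* The suffix is a programmer-supplied constant, so a bad one is a bug. */
  assert(suffix != NULL && suffix[0] != '\0');
  if (input == NULL) return NULL;
  size_t len = strlen(input);
  size_t suffix_len = strlen(suffix);
  if (len < suffix_len) return NULL;
  if (memcmp(input + len - suffix_len, suffix, suffix_len) != 0) return NULL;
  char *output = malloc(len + 1);
  if (output == NULL) return NULL;
  memcpy(output, input, len + 1); /* copies the terminating NUL as well */
  output[len - 1] = to;
  return output;
}

/* Returns 1 when `input` maps to `expected` (NULL meaning "rejected"),
 * 0 otherwise. Frees the intermediate result. */
int maps_to(const char *input, const char *suffix, char to, const char *expected) {
  char *result = swap_last_suffix_char(input, suffix, to);
  int ok = (result == NULL) ? (expected == NULL)
                            : (expected != NULL && strcmp(result, expected) == 0);
  free(result);
  return ok;
}
```

With these definitions, `maps_to("geodb.spi", ".spi", 'l', "geodb.spl")` holds, while an input such as `"geodb.txt"` is rejected with NULL. Callers of the real helpers have the same obligation to `free()` a non-NULL result.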

85
src/sparkey/util.h Normal file

@@ -0,0 +1,85 @@
/*
* Copyright (c) 2012-2013 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
#ifndef SPARKEY_UTIL_H_INCLUDED
#define SPARKEY_UTIL_H_INCLUDED
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include "sparkey.h"
/**
* This macro sort of behaves like a new keyword.
* Input must be an expression that returns sparkey_returncode.
* The macro must be executed in a function that returns sparkey_returncode.
* If the expression evaluates to anything other than SPARKEY_SUCCESS,
* then the containing function returns that value directly.
*/
#define RETHROW(f) do { sparkey_returncode returncode = (f); if (returncode != SPARKEY_SUCCESS) return returncode; } while (0);
/**
* This macro requires that a variable "sparkey_returncode returncode;" is already declared in the enclosing function.
* It evaluates the first argument, which must return sparkey_returncode.
* If that evaluates to anything other than SPARKEY_SUCCESS,
* it sets returncode to that value and jumps to the specified label.
*/
#define TRY(f, label) do { returncode = (f); if (returncode != SPARKEY_SUCCESS) goto label; } while (0);
/**
* Convert error codes generated by open and fopen into sparkey return codes.
* @param e an error code
* @returns a sparkey_returncode corresponding to the error, or SPARKEY_INTERNAL_ERROR
*/
sparkey_returncode sparkey_open_returncode(int e);
/**
* Convert error codes generated by creat into sparkey return codes.
* @param e an error code
* @returns a sparkey_returncode corresponding to the error, or SPARKEY_INTERNAL_ERROR
*/
sparkey_returncode sparkey_create_returncode(int e);
/**
* Convert error codes generated by remove or unlink into sparkey return codes.
* @param e an error code
* @returns a sparkey_returncode corresponding to the error, or SPARKEY_INTERNAL_ERROR
*/
sparkey_returncode sparkey_remove_returncode(int e);
/**
* Fetches a 32-bit unsigned value from a pseudorandom source (/dev/urandom).
*
* @param output a pointer to a uint32_t where the random value is written
* @returns SPARKEY_SUCCESS, or SPARKEY_INTERNAL_ERROR if the source could not be opened or read.
*/
static inline sparkey_returncode rand32(uint32_t *output) {
int fd = open("/dev/urandom", O_RDONLY);
if (fd < 0) {
return SPARKEY_INTERNAL_ERROR;
}
ssize_t actual = read(fd, output, sizeof(*output));
close(fd);
if (actual < (ssize_t) sizeof(*output)) {
return SPARKEY_INTERNAL_ERROR;
}
return SPARKEY_SUCCESS;
}
#endif
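
Note on the RETHROW and TRY macros in this header: the first suits functions that hold no resources, the second suits functions that must run cleanup before returning. The standalone sketch below illustrates both patterns under stated assumptions. `demo_returncode`, `DEMO_RETHROW`, `DEMO_TRY`, and the `demo_*` functions are hypothetical stand-ins, not sparkey API. The stand-in macros also drop the trailing semicolon after `while (0)`. As written in the header, `if (x) RETHROW(f); else ...` would expand to two statements and fail to compile.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for sparkey_returncode and two of its values. */
typedef enum { DEMO_SUCCESS = 0, DEMO_INTERNAL_ERROR = -1 } demo_returncode;

/* Same shape as RETHROW/TRY, but ending in `while (0)` with no semicolon,
 * so each use behaves as exactly one statement. */
#define DEMO_RETHROW(f) do { demo_returncode rc_ = (f); if (rc_ != DEMO_SUCCESS) return rc_; } while (0)
#define DEMO_TRY(f, label) do { returncode = (f); if (returncode != DEMO_SUCCESS) goto label; } while (0)

/* Hypothetical step that fails when `should_fail` is non-zero. */
demo_returncode demo_step(int should_fail) {
  return should_fail ? DEMO_INTERNAL_ERROR : DEMO_SUCCESS;
}

/* No resources held: propagate the first failure immediately. */
demo_returncode demo_rethrow(int fail_first, int fail_second) {
  DEMO_RETHROW(demo_step(fail_first));
  DEMO_RETHROW(demo_step(fail_second));
  return DEMO_SUCCESS;
}

/* A buffer is held: jump to the cleanup label instead of returning directly.
 * *cleaned_up is set to 1 once the buffer has been released. */
demo_returncode demo_try(int fail_first, int fail_second, int *cleaned_up) {
  assert(cleaned_up != NULL);
  demo_returncode returncode = DEMO_SUCCESS;
  *cleaned_up = 0;
  char *buffer = malloc(16);
  if (buffer == NULL) return DEMO_INTERNAL_ERROR;
  DEMO_TRY(demo_step(fail_first), cleanup);
  DEMO_TRY(demo_step(fail_second), cleanup);
cleanup:
  free(buffer);
  *cleaned_up = 1;
  return returncode;
}
```

In `demo_try` the success path falls through into the cleanup label, so the buffer is released exactly once whether or not a step fails.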