11 Commits

Author SHA1 Message Date
Motiejus Jakštys
5d77298e88 avoid UB in bit shifts
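An `unsigned char` operand gets promoted to `int`, and shifting it left by 24 bits is undefined when the result does not fit in `int` (with a 32-bit `int`, any byte value of 0x80 or above).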

Justine Tunney blogs about it here: https://justine.lol/endian.html

Example:

```deserialize.c
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

uint32_t file_deserialize_uint32_ok(unsigned char *buf) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

uint32_t file_deserialize_uint32(unsigned char *buf) {
    return (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3];
}

int main() {
    unsigned char arr[4] = {0xaa, 0xaa, 0xaa, 0xaa};

    printf("%d\n", file_deserialize_uint32_ok((unsigned char*)arr));
    printf("%d\n", file_deserialize_uint32((unsigned char*)arr));
}
```

Output:
```
$ clang-16 -fsanitize=undefined ./deserialize.c -o deserialize && ./deserialize
-1431655766
deserialize.c:10:20: runtime error: left shift of 170 by 24 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior deserialize.c:10:20 in 
-1431655766
```
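
The same cast-before-shift idea can be sketched as a pair of small helpers. This is only an illustration: the names `read_be32`, `read_be64` and `print_be32` are hypothetical and are not taken from the repository. It assumes the buffer holds the value in big-endian order, as in the example above, and it prints with `PRIu32` so that the format specifier matches the unsigned return type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the repository): read a big-endian
 * 32-bit value. Each byte is converted to uint32_t before the shift,
 * so no byte is shifted as a promoted signed int. */
uint32_t read_be32(const unsigned char *buf) {
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Hypothetical 64-bit counterpart: accumulate into a uint64_t so every
 * shift happens on an unsigned 64-bit operand. */
uint64_t read_be64(const unsigned char *buf) {
    assert(buf != NULL);
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value = (value << 8) | buf[i];
    }
    return value;
}

/* Print the decoded value with a format specifier that matches uint32_t. */
void print_be32(const unsigned char *buf) {
    printf("%" PRIu32 "\n", read_be32(buf));
}
```

For the buffer `{0xaa, 0xaa, 0xaa, 0xaa}` used above, `read_be32` returns 0xAAAAAAAA, which `print_be32` shows as 2863311530. The `-1431655766` in the output above is the same bit pattern printed through `%d`.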
2023-06-23 12:16:35 +03:00
Al
b85ed70674 [utils] adding a function for checking if a file exists (yay C), or at least the closest agreed-upon method for it (may return false if the user doesn't have permissions, but that's ok for our purposes here) 2017-03-10 13:39:52 -05:00
Al
b320aed9ac [merge] merging master 2017-01-13 19:58:49 -05:00
Al
a3506131fe [build] adding libpostal_setup_datadir, libpostal_setup_parser_datadir, libpostal_setup_language_classifier_datadir functions for configuring the datadir at runtime 2017-01-09 16:11:26 -05:00
Al
46cd725c13 [math] Generic dense matrix implementation using BLAS calls for matrix-matrix multiplication if available 2016-08-06 00:40:01 -04:00
Al
46b35c5202 [utils] Adding functions to read numeric arrays from files 2016-01-17 20:36:57 -05:00
Al
17f88c3adc [utils] using unsigned ints in file_utils, adding doubles 2015-05-27 16:03:36 -04:00
Al
2d49369e78 [utils] Adding read/write for 64-bit ints to file_utils 2015-05-13 17:51:03 -04:00
Al
a5f7c73374 [utils] is_relative_path 2015-03-11 17:31:08 -04:00
Al
3ed5795cff [fix] fixing some formatting 2015-03-03 12:54:27 -05:00
Al
5216aba1b6 [utils] string utils, file utils, contiguous arrays of strings used for storing tokenized strings, klib for generic hashtables and vectors, antirez's sds for certain types of string building, utf8proc for iterating over utf-8 strings and unicode normalization 2015-03-03 12:33:13 -05:00