Commit Graph

5444 Commits

Author SHA1 Message Date
Al
57eaa414ce [revert] reverting the commits from #578, leaving header file in repo for the moment 2023-07-06 01:54:46 -04:00
Al
c76d020c18 [fix] same result running test as a separate step 2023-07-06 01:36:32 -04:00
Al
d979fbb779 [test] trying make check in the same step, to see if that makes a difference 2023-07-06 01:28:49 -04:00
Al
59325c3b13 [test] testing with sse2 disabled to see if the build is working generally 2023-07-06 01:16:22 -04:00
Al
7a448b718d [crf] using 32 bytes for posix_memalign to align blocks of 4 doubles for remez algorithm to fix test which uses an odd-sized context 2023-07-05 21:02:41 -04:00
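A minimal sketch of the kind of allocation this commit message describes, assuming a 32-byte boundary is wanted because four 8-byte doubles occupy 32 bytes. The helper name `alloc_aligned_doubles` is hypothetical and is not taken from the libpostal source; only `posix_memalign` itself is a real POSIX call.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from libpostal): allocate room for n doubles
 * on a 32-byte boundary, so each block of 4 doubles starts aligned.
 * Assumes sizeof(double) == 8. Returns NULL on failure. */
static double *alloc_aligned_doubles(size_t n) {
    void *p = NULL;
    /* The alignment must be a power of two and a multiple of sizeof(void *). */
    if (posix_memalign(&p, 32, n * sizeof(double)) != 0) {
        return NULL;
    }
    return (double *)p;
}
```

The returned pointer is released with `free`, as with any `posix_memalign` allocation. The alignment applies to the start of the buffer, so it holds for odd element counts such as 7 as well.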
Al
b65e7d5bce [fix] no sudo on brew on Mac in github actions, just like on a regular machine/in the docs 2023-07-05 20:47:14 -04:00
Al
2b93af09d9 [build] removing travis build 2023-07-05 20:43:18 -04:00
Al
5669372a90 [fix] sudo in github actions for build tool installs 2023-07-05 20:42:50 -04:00
Al
2f20c9359e [github] adding Github action to run tests on mac and ubuntu initially 2023-07-05 20:38:48 -04:00
Al B
dc794b1b64 Merge pull request #631 from madrisan/libpostal_data_syntax
Fix dash syntax error in libpostal_data
2023-07-03 10:36:13 -07:00
Davide Madrisan
dcb63d8768 Fix dash syntax error in libpostal_data
Fix the syntax error reported by dash:

    ./src/libpostal_data: 39: [: ==: unexpected operator

when the variable DATAMODEL is empty.

Signed-off-by: Davide Madrisan <davide.madrisan@gmail.com>
2023-06-29 14:36:10 +02:00
Al B
32d636f378 Merge pull request #630 from motiejus/patch-1
avoid UB in bit shifts
2023-06-24 19:35:45 -07:00
Motiejus Jakštys
5d77298e88 avoid UB in bit shifts
Each `unsigned char` operand gets promoted to `int`, which cannot always be shifted left by 24 bits without overflowing.

Justine Tunney blogs about it here: https://justine.lol/endian.html

Example:

```deserialize.c
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

uint32_t file_deserialize_uint32_ok(unsigned char *buf) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

uint32_t file_deserialize_uint32(unsigned char *buf) {
    return (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3];
}

int main() {
    unsigned char arr[4] = {0xaa, 0xaa, 0xaa, 0xaa};

    printf("%d\n", file_deserialize_uint32_ok((unsigned char*)arr));
    printf("%d\n", file_deserialize_uint32((unsigned char*)arr));
}
```

Output:
```
$ clang-16 -fsanitize=undefined ./deserialize.c -o deserialize && ./deserialize
-1431655766
deserialize.c:10:20: runtime error: left shift of 170 by 24 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior deserialize.c:10:20 in 
-1431655766
```
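The example prints both results with `%d`, so even the correctly computed value shows up as a negative number. A minimal sketch of the same cast-first approach with unsigned formatting follows, assuming a 32-bit `int`. The names `read_u32_be` and `format_u32` are hypothetical and are not part of the patch.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 0xaa << 24 is 2852126720, which exceeds INT32_MAX (2147483647).
 * That is why the shift overflows once the byte has been promoted to int. */
_Static_assert((UINT32_C(0xaa) << 24) > INT32_MAX,
               "0xaa << 24 does not fit in a 32-bit signed int");

/* Hypothetical helper: read 4 bytes in big-endian order. Each byte is
 * cast to uint32_t before shifting, so no signed overflow can occur. */
static uint32_t read_u32_be(const unsigned char *buf) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Hypothetical helper: format the value as unsigned decimal.
 * Returns the number of characters written, as snprintf does. */
static int format_u32(char *out, size_t n, uint32_t value) {
    return snprintf(out, n, "%" PRIu32, value);
}
```

For the four `0xaa` bytes used above, `read_u32_be` returns 2863311530, which is the same bit pattern that `%d` displays as -1431655766.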
2023-06-23 12:16:35 +03:00
ddelange
ef215786f1 Unify and clean the unofficial project references 2023-04-29 22:21:18 +02:00
Al B
4c98eaa7dc Merge pull request #625 from ddelange/patch-1
Add support for OpenBLAS
2023-04-18 14:31:49 -04:00
ddelange
6f95677427 Explicit -lopenblas 2023-04-18 12:00:10 +02:00
ddelange
8eb721f6a1 Fix typo 2023-04-18 11:19:31 +02:00
ddelange
0ad268f991 Add support for OpenBLAS 2023-04-18 10:57:53 +02:00
PIT-Development
e2590bca97 docs: fix typos in contributing.md (#622)
* Respect typo

Repeect should be respect

* Update CONTRIBUTING.md

Also include guildelines to guidelines
2023-04-13 08:38:52 +02:00
Al B
9546eacb26 Merge pull request #616 from oskar700/ot-senzing-datamodel
Adding senzing model from @oskar700 and @brianmacy, along with a new MODEL switch in configure
2023-02-19 17:31:01 -05:00
Oskar Thorbjornsson
00568da290 Modifying README and config parameter, based on code review. 2023-02-14 21:02:51 -08:00
Oskar Thorbjornsson
0c0818c683 Update Senzing link. 2023-02-13 17:03:42 -08:00
Oskar Thorbjornsson
a11f33fb3d Add a link to info about Senzing data model. 2023-02-13 13:32:38 -08:00
Oskar Thorbjornsson
c4c636febd Adding directions to the readme on how to download Senzing datamodel. 2023-02-12 18:04:10 -08:00
Oskar Thorbjornsson
ec9e0e341f Enable downloading of Senzing data model. 2023-02-12 17:58:36 -08:00
Zijian
1ff4cafed5 move states abbreviation to toponyms.txt 2023-02-09 17:21:17 +08:00
Zijian
60e626882f add malaysia state abbreviation 2023-02-09 16:41:52 +08:00
Zijian
eafc753260 add malaysia toponyms 2023-02-09 14:56:35 +08:00
Zijian
a157aeee77 add malaysia federal territory dict 2023-02-09 14:28:30 +08:00
Peter Johnson
cb80555b24 Merge pull request #606 from openvenues/docs-dist-clean
docs: remove make distclean
2022-10-30 09:09:22 +13:00
Peter Johnson
a6c17a0510 docs: remove make distclean
The `make distclean` command is not required in this example, which is a fresh clone
2022-10-29 22:00:07 +02:00
Peter Johnson
668b94473a Merge pull request #605 from openvenues/install-docs
Makefile is not available until after configure
2022-10-29 03:16:12 +13:00
Peter Johnson
5c4ef44426 Makefile is not available until after configure
The `make` command fails when run *before* configure, which confuses users who are not familiar with `automake`.

```bash
make distclean
make: *** No rule to make target `distclean'.  Stop.

make
make: *** No targets specified and no makefile found.  Stop.
```
2022-10-28 12:59:12 +02:00
Peter Johnson
92f504c8c9 Merge pull request #596 from lukeoz/patch-1
Update toponyms.txt
2022-08-29 15:58:03 +02:00
lukeoz
99c5ffa233 Update toponyms.txt
"South Australia" rather than "Southern Australia" for abbreviation "SA"
2022-08-16 11:30:40 +09:30
Al B
544d510db0 Merge pull request #578 from reisub/sse2neon
Use NEON on ARM hardware via sse2neon.h
2022-07-10 20:00:17 -04:00
dependabot[bot]
b74decadd8 Bump lxml from 4.6.3 to 4.9.1 in /scripts
Bumps [lxml](https://github.com/lxml/lxml) from 4.6.3 to 4.9.1.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-4.6.3...lxml-4.9.1)

---
updated-dependencies:
- dependency-name: lxml
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-06 19:45:49 +00:00
dependabot[bot]
2cf12869ea Bump ujson from 1.33 to 5.4.0 in /scripts
Bumps [ujson](https://github.com/ultrajson/ultrajson) from 1.33 to 5.4.0.
- [Release notes](https://github.com/ultrajson/ultrajson/releases)
- [Commits](https://github.com/ultrajson/ultrajson/commits/5.4.0)

---
updated-dependencies:
- dependency-name: ujson
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-05 21:09:24 +00:00
Dino Kovač
7984270453 Split hardware optimization flags 2022-07-05 19:43:11 +02:00
dependabot[bot]
3e8cfc2f80 Bump numpy from 1.10.4 to 1.22.0 in /scripts
Bumps [numpy](https://github.com/numpy/numpy) from 1.10.4 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.10.4...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-21 21:16:56 +00:00
Dino Kovač
6064bc6c06 Use NEON on ARM hardware via sse2neon.h
The autoconf changes were adapted from:
https://github.com/glennrp/libpng/blob/libpng16/configure.ac
2022-06-16 15:49:01 +02:00
Al B
a97717f2b9 Merge pull request #567 from kmicklas/config-check
Check HAVE_CONFIG_H in matrix.h
2022-04-18 16:03:29 -04:00
Al
893745f09b [near_dupes] using quadgrams in Latin scripts as well for near dupe hashes 2022-03-25 14:05:03 -04:00
Al
26124ee72f [near_dupes] exposing name_word_hashes directly in the API 2022-03-25 14:04:26 -04:00
Al
0d8e4ec56d [fix] possible acronym for single token phrases if it's a directional 2022-03-25 12:24:42 -04:00
Al B
6b52d426d4 Merge pull request #574 from mattwigway/m1-build
build instructions for M1 macs (see #551)
2022-03-02 13:27:31 -05:00
Matthew Wigginton Bhagat-Conway
41e76a7c32 build instructions for M1 macs (see #551) 2022-03-02 10:37:50 -05:00
Ken Micklas
b0c1c75209 Check HAVE_CONFIG_H in matrix.h 2021-12-08 09:33:32 -05:00
James Gates
339252c3a1 Modified install steps with notes that worked for me.
Thanks for the project, everyone. I just went through the install process
and thought maybe the mac directions could use a tiny bit of clarification.
I'm by no means familiar enough with the project to know if this is the best way
to convey my experience but I figured I'd give it a shot and maybe it will help
someone in the future.
2021-09-08 13:47:23 -05:00
dixstonz3
221e830e63 Update entrances.txt 2021-07-05 15:24:43 +02:00