Author | Commit | Message | Date
Al | b7eda37e44 | [utils] adding utf8_is_digit to string_utils.h | 2017-10-20 02:46:00 -04:00
Al | 1fbc238b60 | [numex] adding functions to parse and validate a Roman numeral | 2017-10-20 02:45:32 -04:00
Al | 1c5afcafd2 | [phrases] when skipping/ignoring hyphens in trie search, make sure that the new longer phrase ends at a word boundary (space, hyphen, end of string, etc.) | 2017-10-20 02:43:39 -04:00
Al | 9d2a111286 | [numex] when parsing numex, bail on rules in whole_tokens_only languages if there are contiguous rules with no right context rules (example: something that wouldn't make sense like VL in Latin) | 2017-10-20 02:34:30 -04:00
Al | bd477976d1 | [similarity] string similarity measures for Damerau-Levenshtein and Jaro-Winkler distances. Both operate on unicode points internally for lengths, etc. instead of byte strings and the Levenshtein distance uses only one array instead of needing to store the full matrix of transitions. | 2017-10-19 04:51:33 -04:00
Al | 245aa226e0 | [utils] function to create an array of uint32_t codepoints from a UTF-8 string, a few bug fixes to string_utils | 2017-10-19 04:48:50 -04:00
Al | c61007388b | [similarity] bug fixes and additional French, Spanish, Italian, and Slavic phonetics | 2017-10-18 13:31:35 -04:00
Al | 3a3aca8490 | [similarity] adding basic double metaphone implementation | 2017-10-18 03:59:05 -04:00
Al | 2f2d3da722 | [test] test for utf8_equal_ignore_separators | 2017-10-14 01:42:08 -04:00
Al | 09fbb02042 | [utils] adding utf8_equal_ignore_separators to string utils | 2017-10-14 01:36:56 -04:00
Al | f8a808e254 | [utils] adding utf8_len function for strings, and utf8_is_digit | 2017-10-12 11:16:53 -04:00
Al | 448ca6a61a | [merge] merging commit from v1.1 | 2017-10-12 01:41:04 -04:00
Travis | bb277fb326 | [auto][ci skip] Adding data files from Travis build #268 | 2017-10-10 18:58:10 +00:00
Al Barrentine | e60139757f | Merge pull request #257 from mkaranta/patch-1: Add 'bld' as an abbreviation for 'building' | 2017-10-10 14:42:29 -04:00
mkaranta | c96a042e86 | Add 'bld' as an abbreviation for 'building'. I noticed this was missing while testing a batch of addresses. Hopefully it doesn't introduce much noise. | 2017-10-10 14:19:09 -04:00
Al | c984dca459 | [fix] removing log error for sequences of length 0 | 2017-09-19 23:20:03 -04:00
Al Barrentine | 94a0e842e7 | [fix] typo | 2017-08-16 15:04:15 -04:00
Al Barrentine | 34e2c4772e | [code of conduct] adding stronger, more specific language about hate speech in code of conduct | 2017-08-16 15:03:38 -04:00
Al Barrentine | 2bfa8efefb | [docs] updating README examples of normalization now that canonical forms are no longer transliterated | 2017-08-16 12:15:22 -04:00
Al | 0c6af2b74c | [fix] normalize canonical strings (after expanding abbreviations, concatenated suffixes, etc.) with Latin-ASCII, Latin-ASCII-Simple or simple UTF-8 normalization depending on the options | 2017-08-03 14:08:05 -06:00
Al | ed011e50d5 | [docs][ci skip] update contributing section in README | 2017-08-01 00:27:50 -04:00
Al | caf2415938 | [fix][ci skip] updates to contributions guide | 2017-08-01 00:25:36 -04:00
Al | da2affbacb | [fix][ci skip] removing repetition in contributing guide | 2017-08-01 00:13:55 -04:00
Al | 2c06f26f3d | [docs][ci skip] adding contributing guide for how to submit issues | 2017-08-01 00:10:40 -04:00
Al Barrentine | 6ca6493d0b | Merge pull request #231 from michaelkrog/patch-1: Changes front matter of iis.yaml to correct description | 2017-07-27 11:21:34 -04:00
Michael Krog | a36dcc8b9c | Update is.yaml | 2017-07-27 13:24:54 +02:00
Al Barrentine | 7352dc74c6 | Moving language around in code of conduct | 2017-07-21 12:58:35 -04:00
Al Barrentine | 4cde250463 | Adding a custom libpostal Code of Conduct | 2017-07-21 02:35:07 -04:00
Al Barrentine | dab3b95ae1 | Merge pull request #229 from openvenues/32bit_numex_fix: 32-bit safety in numex table loading | 2017-07-20 18:11:02 -04:00
Al | 97044f5a8b | [fix] 32-bit safety in numex table loading | 2017-07-20 17:55:43 -04:00
Al Barrentine | 0cb8c61fb0 | Merge pull request #215 from xiamx/patch-2: Add Elixir language binding to README.md | 2017-06-05 16:26:11 -04:00
Mengxuan Xia | abcf72be2e | Add Elixir language binding to Readme | 2017-06-05 16:05:19 -04:00
Al Barrentine | 50cf14846c | Merge pull request #214 from iestynpryce/master: Fix remaining log_* compile format warnings | 2017-05-30 08:45:28 -04:00
Iestyn Pryce | b96a687182 | Merge https://github.com/openvenues/libpostal | 2017-05-29 18:23:03 +01:00
Travis | 8dd84b71ba | [auto][ci skip] Adding data files from Travis build #250 | 2017-05-24 05:05:06 +00:00
Al Barrentine | e9696e9166 | Merge pull request #212 from openvenues/bbraunay-master: modified Indonesian dictionary updates | 2017-05-24 00:54:05 -04:00
Al | 1948634bf3 | [dictionaries] adding a separable prefix for Jl. and Jln. so things like Jl.Utara get separated and expanded | 2017-05-24 00:26:32 -04:00
Al | 3b5b5d8baa | [dictionaries] adding ambiguous expansions for all Indonesian abbreviations 1-2 characters as they could also be initials, etc. | 2017-05-23 18:04:09 -04:00
Al | f507102457 | [dictionaries] removing English words from Indonesian unit types | 2017-05-23 18:01:47 -04:00
Al | 4b24699e1f | [fix] changing national to nasional in Indonesian | 2017-05-23 18:00:20 -04:00
Al | 4df48fb412 | [dictionaries] moving Kampong to normalize to Kampung in Indonesian, better if there's one canonical form | 2017-05-23 17:57:38 -04:00
Al | ec79c610eb | [dictionaries] removing a few English words and dupes from Indonesian place names | 2017-05-23 17:55:59 -04:00
Al | 77365a56a5 | [dictionaries] removing no fixed address from Indonesian dictionaries | 2017-05-23 17:51:15 -04:00
Al | 8a35cfcd80 | [dictionaries] removing level/platform/podium from Indonesian level types | 2017-05-23 17:50:50 -04:00
Al | 364b00da01 | [dictionaries] separating Mas and Abang | 2017-05-23 17:46:45 -04:00
Al | 83378049ee | [dictionaries] remove Doktor from academic degrees in Indonesian dictionaries | 2017-05-23 17:35:53 -04:00
Al | 52593c6374 | [dictionaries] remove nonprofit from Indonesian company types | 2017-05-23 17:27:11 -04:00
Al | 08524f4b07 | [dictionaries] moving some of the existing chain stores for Indonesia to the all/chains.txt dictionary | 2017-05-23 17:25:59 -04:00
Al | 18b2fb0ec8 | Merge branch 'master' of https://github.com/bbraunay/libpostal into bbraunay-master | 2017-05-23 17:18:37 -04:00
Iestyn Pryce | 87cf7b5bca | Add portable way of formatting khint_t type (from klib) | 2017-05-21 11:58:37 +01:00