Al
|
d0364ab6fb
|
[expand] adding a method for checking whether a phrase is in multiple dictionaries, and a helper method for determining whether an address phrase has a canonical interpretation
|
2017-12-17 03:14:00 -05:00 |
|
Al
|
cfa5b1ce42
|
[similarity] adding a stopword-aware acronym alignment method for matching U.N. with United Nations, Museum of Modern Art with MoMA, as well as things like University of California - Los Angeles with UCLA. All of these should work across languages, including non-Latin character sets like Cyrillic (but not ideograms, as the concept doesn't make as much sense there). Skipping tokens like "of" or "the" depends only on the stopwords dictionary being defined for a given language.
|
2017-12-04 15:21:44 -05:00 |
|
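The stopword-aware alignment described in cfa5b1ce42 can be illustrated with a small sketch. This is not the code from that commit: the function names, the three-word English stopword list, and the ASCII-only tokenization are all assumptions made for illustration, whereas the commit describes a method that works across languages and scripts using per-language stopword dictionaries. The idea shown is that each phrase token either contributes its initial to the acronym or, if it is a stopword, may be skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Toy English stopword list (assumption; a real system would use a
   per-language dictionary). Comparison is case-insensitive. */
static bool example_is_stopword(const char *tok, size_t len) {
    static const char *stopwords[] = {"of", "the", "and"};
    for (size_t i = 0; i < sizeof(stopwords) / sizeof(stopwords[0]); i++) {
        if (strlen(stopwords[i]) != len) continue;
        size_t j = 0;
        while (j < len && tolower((unsigned char)tok[j]) == stopwords[i][j]) j++;
        if (j == len) return true;
    }
    return false;
}

/* Recursive alignment over ASCII tokens (runs of alphanumeric characters). */
static bool example_align(const char *acronym, const char *phrase) {
    /* Ignore punctuation in the acronym, e.g. the periods in "U.N." */
    while (*acronym && !isalnum((unsigned char)*acronym)) acronym++;
    /* Ignore separators in the phrase, e.g. spaces and hyphens. */
    while (*phrase && !isalnum((unsigned char)*phrase)) phrase++;

    /* Phrase exhausted: succeed only if the acronym is exhausted too. */
    if (*phrase == '\0') return *acronym == '\0';

    /* Measure the current token. */
    size_t len = 0;
    while (phrase[len] && isalnum((unsigned char)phrase[len])) len++;

    /* Option 1: the token's initial matches the next acronym letter. */
    if (*acronym &&
        tolower((unsigned char)*acronym) == tolower((unsigned char)*phrase) &&
        example_align(acronym + 1, phrase + len)) {
        return true;
    }

    /* Option 2: the token is a stopword and contributes no letter. */
    return example_is_stopword(phrase, len) &&
           example_align(acronym, phrase + len);
}

/* Hypothetical entry point: true if `acronym` aligns with `phrase`. */
bool example_acronym_matches(const char *acronym, const char *phrase) {
    assert(acronym != NULL && phrase != NULL);
    /* Reject acronyms that contain no letters or digits at all. */
    const char *p = acronym;
    while (*p && !isalnum((unsigned char)*p)) p++;
    if (*p == '\0') return false;
    return example_align(acronym, phrase);
}
```

Trying the matching option before the skipping option lets "MoMA" consume the "o" from "of", while the fallback to skipping lets "UCLA" pass over the same word. Backtracking is needed because a stopword's initial can coincide with the next acronym letter without being the intended match.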
Al
|
2d6079b06f
|
[expand] added search_address_dictionaries_substring to support the new use case (i.e. it answers "is this substring in the trie?" regardless of whether it's stored under the special prefixes/suffixes namespaces)
|
2017-10-28 02:40:14 -04:00 |
|
Iestyn Pryce
|
ecd07b18c1
|
Fix log_* formats which expect size_t but receive uint32_t.
|
2017-05-19 22:31:56 +01:00 |
|
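The class of mismatch fixed in ecd07b18c1 can be shown with standard C formatting alone. The sketch below does not use the project's log_* macros, and the function name is hypothetical. It shows one common way to keep format specifiers and argument types in agreement: `PRIu32` from `<inttypes.h>` for `uint32_t` and `%zu` for `size_t`. Passing a `uint32_t` where `%zu` is expected is undefined behavior and can print garbage on platforms where the two types differ in width.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a uint32_t and a size_t with matching
   specifiers. Returns the snprintf result (the number of characters the
   full message requires, excluding the terminating NUL). */
int example_format_counts(char *buf, size_t buf_size,
                          uint32_t index, size_t total) {
    assert(buf != NULL);
    /* PRIu32 expands to the correct conversion for uint32_t;
       %zu is the conversion for size_t. */
    return snprintf(buf, buf_size, "index=%" PRIu32 " total=%zu", index, total);
}
```

An alternative is to keep `%zu` in the format string and cast the argument to `size_t` at the call site. Which of the two the commit chose is not stated in the log entry.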
Al
|
b320aed9ac
|
[merge] merging master
|
2017-01-13 19:58:49 -05:00 |
|
Al
|
df89387b5c
|
[fix] calloc instead of malloc when performing initialization on structs that may fail halfway and need to clean up while partially initialized (calloc will set all the bytes to zero so the member pointers are NULL instead of garbage memory)
|
2017-01-13 18:30:04 -05:00 |
|
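The reasoning in df89387b5c can be sketched as follows. The struct and function names are hypothetical and are not taken from the project. The sketch shows a constructor that can fail halfway, with a destructor that is safe to call on the partially initialized struct because `calloc` zeroed it first. Like the commit message, it assumes that all-bits-zero is a NULL pointer, which holds on mainstream platforms.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical struct with two owned allocations. */
typedef struct {
    char *names;
    int *values;
} example_dict_t;

/* Safe on a partially initialized struct: free(NULL) is a no-op, and
   calloc guarantees that members not yet assigned are zeroed. */
void example_dict_destroy(example_dict_t *self) {
    if (self == NULL) return;
    free(self->names);
    free(self->values);
    free(self);
}

/* `fail_second` simulates an allocation failure halfway through
   initialization so the cleanup path can be exercised. */
example_dict_t *example_dict_new(size_t n, int fail_second) {
    assert(n > 0);

    /* With malloc here, self->values would hold garbage when the error
       path runs after the first allocation, and free() on it would be
       undefined behavior. */
    example_dict_t *self = calloc(1, sizeof(example_dict_t));
    if (self == NULL) return NULL;

    self->names = malloc(n);
    if (self->names == NULL) goto error;

    self->values = fail_second ? NULL : malloc(n * sizeof(int));
    if (self->values == NULL) goto error;

    return self;

error:
    example_dict_destroy(self);
    return NULL;
}
```

The single `error` label works only because every member starts out as NULL. Without zeroing, each failure point would need its own cleanup that frees exactly the members assigned so far.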
Al
|
4cdd245dc2
|
[logging] log error in address_dictionary_get_expansions
|
2016-12-26 16:16:26 -05:00 |
|
Al
|
eea11beb6a
|
[expansion] using easier-to-access data structure for address dictionaries
|
2016-11-27 00:56:48 -08:00 |
|
Al
|
0bc3550c11
|
[expansion] Adding address_expansion_in_dictionary
|
2016-04-29 13:23:48 -04:00 |
|
Al
|
943cd4443a
|
[fix] Log errors if address dictionaries not loaded
|
2016-03-21 18:13:14 -04:00 |
|
Al
|
d35f97f6f1
|
[fix] All file_read_uint64 calls that use stack variables read into a uint64_t not a size_t so as not to smash the stack under a 32-bit arch (issue #18)
|
2016-02-29 22:36:00 -05:00 |
|
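The hazard addressed in d35f97f6f1 is that a reader writing 8 bytes through a `uint64_t *` will overrun a 4-byte `size_t` stack variable on a 32-bit architecture. The sketch below illustrates the safe pattern. `example_read_uint64` is a hypothetical stand-in, not the project's `file_read_uint64`, and its big-endian byte order is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 64-bit file reader: reads 8 bytes in
   big-endian order (byte order is an assumption) into *value. */
bool example_read_uint64(FILE *f, uint64_t *value) {
    unsigned char buf[8];
    if (fread(buf, 1, sizeof(buf), f) != sizeof(buf)) return false;
    uint64_t v = 0;
    for (size_t i = 0; i < sizeof(buf); i++) {
        v = (v << 8) | buf[i];
    }
    *value = v;
    return true;
}

/* Reads a length into a uint64_t first, then narrows to size_t.
   Passing (uint64_t *)&len directly would write 8 bytes into a
   4-byte object where size_t is 32 bits wide. */
bool example_read_length(FILE *f, size_t *len) {
    assert(f != NULL && len != NULL);
    uint64_t tmp;
    if (!example_read_uint64(f, &tmp)) return false;
    if (tmp > SIZE_MAX) return false; /* does not fit on this platform */
    *len = (size_t)tmp;
    return true;
}
```

The range check is trivially true where `size_t` is 64 bits wide. On a 32-bit build it rejects lengths that cannot be represented, rather than silently truncating them.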
Al
|
83c6a87ab1
|
[build] substitution for use of LIBPOSTAL_DATA_DIR in Makefile.am
|
2015-10-26 18:47:07 -04:00 |
|
Al
|
12816d0e95
|
[api] Setting global objects to NULL on teardown
|
2015-09-28 17:27:57 -04:00 |
|
Al
|
e62c75b9c6
|
[phrases] Adding _with_phrases versions of address dictionary methods for pre-allocated phrases
|
2015-09-16 21:24:28 -04:00 |
|
Al
|
e122824448
|
[expansion] Adding the ability to search address dictionary phrases with a NULL language, will return phrases in any language
|
2015-09-15 14:00:26 -04:00 |
|
Al
|
de5d6945b5
|
[expansion] Adding search_address_dictionaries_prefix/suffix for concatenated prefixes/suffixes e.g. in Germanic languages. Adding a flag to the address_expansion struct and trie value to denote separability, adding prefix/suffix keys during dictionary creation
|
2015-08-10 16:15:01 -04:00 |
|
Al
|
fe4789a665
|
[fix] compiler warnings
|
2015-07-28 19:14:00 -04:00 |
|
Al
|
243f327928
|
[fix] NULL check
|
2015-07-27 16:32:01 -04:00 |
|
Al
|
71ffdf9cbc
|
[expansion] tokenized version of search_address_dictionaries
|
2015-07-25 13:50:53 -04:00 |
|
Al
|
351c7c8c2e
|
[expansion] Add concatenated suffixes to the suffix keyspace of the address dictionary trie and concatenated prefixes and elisions to the prefix keyspace
|
2015-07-24 16:02:47 -04:00 |
|
Al
|
27af28eacf
|
[expansion] Changes to address_expansion struct to allow for multiple dictionaries per record. Only adding unique canonical strings to the string array
|
2015-07-22 20:35:29 -04:00 |
|
Al
|
f61d993157
|
[expansion] removing the self param from address_dictionary methods, adding search_address_dictionaries method which searches a string for phrases in a particular language
|
2015-07-22 03:51:28 -04:00 |
|
Al
|
157727d249
|
[fix] method name, strlen and fclose
|
2015-07-22 02:15:45 -04:00 |
|
Al
|
c798876b3d
|
[expansion] Address dictionary allocation, I/O, get/set
|
2015-07-21 16:46:15 -04:00 |
|