40 Commits

Author SHA1 Message Date
AeroXuk
26ac9ab5c2 Removing EXPORT statements from all source files and most header files, leaving only the exports for the main API in libpostal.h. Modified Makefiles so that all the test apps build without having extra functions exported from libpostal. 2017-11-25 04:35:28 +00:00
AeroXuk
f07ab765cb Adding the export marker to all functions used in tests. 2017-11-20 20:58:37 +00:00
Al
97044f5a8b [fix] 32-bit safety in numex table loading 2017-07-20 17:55:43 -04:00
Iestyn Pryce
73d27caeb9 Fix log_* formats which expect long long uint but receive uint64_t. 2017-05-21 10:57:20 +01:00
Al
f3adde746e [numex] adding ability to handle the degree symbol in numex parsing since it's technically a separate token 2017-04-19 20:18:21 -04:00
Al
92051863ba [numex] adding ordinal suffixes themselves to the numex trie so they can be removed from strings 2017-04-18 17:20:02 -04:00
Al
413c584f08 [fix] need to set prev_state to the NULL state in numex parsing after a non-space/non-hyphen is encountered and the previous match, if any, is added to the result array 2017-04-13 16:01:46 -04:00
Al
b464eb6c07 [numex] fix numex parsing when the spelled-out number is followed by a comma or other punctuation 2017-04-11 16:28:33 -04:00
Al
df89387b5c [fix] calloc instead of malloc when performing initialization on structs that may fail halfway and need to clean up while partially initialized (calloc will set all the bytes to zero so the member pointers are NULL instead of garbage memory) 2017-01-13 18:30:04 -05:00
Al
0356b45069 [fix] Log errors in numex module if not loaded 2016-03-21 18:15:53 -04:00
Al
b5807926bc [fix] Using PRId64 in all cases for int64_t printf formatting 2016-03-02 16:47:49 -05:00
Al
d35f97f6f1 [fix] All file_read_uint64 calls that use stack variables read into a uint64_t not a size_t so as not to smash the stack under a 32-bit arch (issue #18) 2016-02-29 22:36:00 -05:00
Federico Mena Quintero
2ae2450db7 [fix] Check the return of malloc() in numex.c 2016-02-25 14:53:27 -06:00
Al
98c395d34c [numex] Concatenating a string of numeric expressions with no intervening tokens like Seventeen Eighty or Ten Oh Four 2016-02-10 09:21:31 -05:00
Al
59cf5bfc62 [numex] Fixing cases with stopwords not attached to a numeric expression 2016-02-10 08:30:01 -05:00
Al
1e65fafaaf [fix] char * 2016-01-30 13:39:36 -05:00
Al
f8de9d8e5a [fix] static methods in numex table loading, mallocs instead of stack variables 2016-01-30 13:25:48 -05:00
Al
deeb8f007e [fix] Check for result.len > 0 in false start continuation numex parsing, plus additional safety check during replacement 2015-12-24 02:26:53 -05:00
Al
2eea999692 [fix] Fixing false start continuations in numex parsing 2015-12-23 19:19:14 -05:00
Al
39e83961ef [fix] Bug in suffix expansion affecting inseparable suffixes like burg as well as ordinal suffixes like first=>1st 2015-12-19 01:30:08 -05:00
Al
e0c0ed2d04 [numex] Return true if numex table already loaded 2015-12-15 14:28:40 -05:00
Al
1a1d74785c [fix] Compiler warnings for casts/printf 2015-10-26 18:52:18 -04:00
Al
b11362ab98 [numex] using module init method for building, otherwise passing NULL path uses the default path 2015-09-16 21:13:05 -04:00
Al
e122824448 [expansion] Adding the ability to search address dictionary phrases with a NULL language, will return phrases in any language 2015-09-15 14:00:26 -04:00
Al
2eb67ad850 [phrases] trie_search_prefixes/trie_search_suffixes now take a length param 2015-08-09 02:01:37 -04:00
Al
5df9e123af [numex] Fix to whole_tokens_only numeric expression parsing where numex was pushing a number onto the stack even on encountering a new rule context even though the token was not completely parsed 2015-08-08 20:49:54 -04:00
Al
df1410da8c [numex] Fixing numex parsing for lone stopwords and certain prefix matches that were getting mistakenly converted e.g. settembre => 7mbre 2015-07-28 18:11:23 -04:00
Al
a16f0dabcb [numex] Fixing hyphen-initial numeric phrases that end the string 2015-07-28 03:28:44 -04:00
Al
d2539f5b57 [numex] Fixing case of hyphen/space-initial phrases in numex, as well as whole token only languages with ordinals 2015-07-27 01:44:33 -04:00
Al
359cd62e20 [numex] Adding a replace_numeric_expressions method (returns NULL if no replacements were made), fixing lengths in situations where two unrelated numbers are joined by a stopword e.g. in the phrase "one and one" the "and" acts as a delimiter vs a phrase where the stopword acts as a joiner like "one hundred and twenty" 2015-07-24 15:31:05 -04:00
Al
4fd4fa7dca [fix] moving int string size constants to string_utils.h 2015-07-02 17:50:09 -04:00
Al
6a8ab48662 [numex] Adding method to get ordinal suffixes, using single representation 2015-06-25 17:28:06 -04:00
Al
5f5efad6ac [numex] Working numex implementation. Tested on most languages, Germanic, Latin/whole_tokens_only, English concatenated or with separators, French numerals like quatre-vingt-douze, Spanish multiple-token ordinals, Japanese numerals, etc. All looking good 2015-06-12 16:21:36 -04:00
Al
fd1ebba720 [numex] Initial implementation of multilingual numeric expression parser 2015-06-08 21:29:04 -04:00
Al
b244aa30f2 [numex] Setting numex_table to NULL during teardown, adding some logging 2015-06-04 23:57:52 -04:00
Al
3bd5172afd [numex] Adding NUMEX_NULL_RULE at the first index 2015-06-04 17:21:44 -04:00
Al
7d3ef39463 [numex] struct/method changes for new ordinal indicators 2015-06-04 03:15:51 -04:00
Al
2d5d854754 [fix] compilation/warnings 2015-06-02 13:43:55 -04:00
Al
080f382065 [numex] Removing concatenated property from language struct as all numeric spellouts might be concatenated 2015-06-01 17:12:07 -04:00
Al
920e15bd4d [numex] Adding numex setup/IO methods 2015-06-01 15:43:23 -04:00
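
Commit df89387b5c above describes using calloc instead of malloc for structs whose initialization may fail halfway, so that cleanup sees NULL member pointers rather than garbage. The following is a minimal sketch of that pattern, not the actual libpostal code: the example_table_t type, its members, and the fail_midway flag are hypothetical names introduced only for illustration. It also assumes, as the commit message does, a platform where all-zero bytes represent a NULL pointer.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for illustration; not a real libpostal type. */
typedef struct example_table {
    char *name;
    int *values;
    size_t num_values;
} example_table_t;

/* Safe to call on a partially initialized table, provided every
 * member that was never set is NULL (free(NULL) is a no-op). */
void example_table_destroy(example_table_t *self) {
    if (self == NULL) return;
    free(self->name);
    free(self->values);
    free(self);
}

/* fail_midway is a hypothetical switch that simulates an error
 * after the first member has been allocated. */
example_table_t *example_table_new(const char *name, size_t num_values,
                                   bool fail_midway) {
    /* calloc zero-fills the struct, so name and values start as NULL
     * (assuming all-zero bytes represent NULL on this platform). */
    example_table_t *self = calloc(1, sizeof(example_table_t));
    if (self == NULL) return NULL;

    size_t len = strlen(name) + 1;
    self->name = malloc(len);
    if (self->name == NULL) goto error;
    memcpy(self->name, name, len);

    if (fail_midway) goto error; /* values is still NULL here */

    /* num_values is expected to be greater than zero in this sketch. */
    self->values = calloc(num_values, sizeof(int));
    if (self->values == NULL) goto error;
    self->num_values = num_values;
    return self;

error:
    /* With malloc instead of calloc above, values could hold garbage
     * at this point and free() would have undefined behavior. */
    example_table_destroy(self);
    return NULL;
}
```

Calling example_table_new("numex", 4, true) takes the error path with values still unset. Because the struct was zero-filled, the single destroy function can serve both the normal teardown and the partial-failure cleanup.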