596 Commits

Author SHA1 Message Date
Al
bbaa302e2e [fix] NUMEX_STOPWORD_RULE define 2015-08-09 01:03:23 -04:00
Al
5383640c14 [fix] cast 2015-08-09 01:01:11 -04:00
Al
dd391eabe5 [numex] Separating rules from keys for Linux gcc compilation 2015-08-09 01:00:57 -04:00
Al
e346b831cb [build] public-read permissions when uploading to S3 2015-08-09 00:17:04 -04:00
Al
ad584671c4 [build] Not compiling with -Werror for now 2015-08-09 00:02:41 -04:00
Al
f170f70727 [build] Link to math library 2015-08-09 00:01:44 -04:00
Al
423e2c86c7 [build] builder programs are now in noinst_PROGRAMS, Makefile target to upload data tarball to S3 (with proper credentials) 2015-08-08 23:29:34 -04:00
Al
a5ce1f12dd [fix] stdint header in address expansion rule generation script 2015-08-08 23:28:11 -04:00
Al
ee982cd872 [dictionaries] Removing dictionaries/all/personal_suffixes, can add to languages as needed 2015-08-08 23:13:09 -04:00
Al
5acf7a4f3e [phrases] resetting node position when continuation falls off the trie 2015-08-08 22:18:05 -04:00
Al
a77c8e1321 [build] Adding bootstrap.sh script and removing configure from version control 2015-08-08 21:22:11 -04:00
Al
cd0f95f9e2 [fix] making transliteration path relative to data dir 2015-08-08 21:06:02 -04:00
Al
2ba0e814ad [build] better autoconf checks for time and dirent headers 2015-08-08 21:02:03 -04:00
Al
d0679450e3 [config] Including Autoconf config.h in internal config 2015-08-08 20:50:23 -04:00
Al
5df9e123af [numex] Fix to whole_tokens_only numeric expression parsing where numex was pushing a number onto the stack even on encountering a new rule context even though the token was not completely parsed 2015-08-08 20:49:54 -04:00
Al
53f54d6454 [fix] removing comment 2015-08-08 20:23:49 -04:00
Al
2106a6cfe4 [build] Adding command-line test and bench programs 2015-08-08 19:44:50 -04:00
Al
5aa2e99b92 [fix] data dir for tar extraction 2015-08-08 19:42:37 -04:00
Al
54aa6fe7df [build] Fixing runtime check/save of last updated file for package data tarball 2015-08-08 17:16:03 -04:00
Al
f38a53601b [rm] Better not to keep that file in the repo 2015-08-08 02:41:54 -04:00
Al
770f44198c [build] Adding default file to track last updated date 2015-08-08 02:30:42 -04:00
Al
c0c21b81f2 [build] Adding generated configure script 2015-08-07 17:35:44 -04:00
Al
a197d04b1a [fix] float comparison 2015-08-07 17:28:21 -04:00
Al
f161f68d53 [build] Changes to Makefile.am to build on Debian/Ubuntu, fixing downloading of the data tarball for Mac and Linux 2015-08-07 17:27:34 -04:00
Al
9b69d1f67a [fix] Removing C++ checks from all but the main API functions 2015-08-07 17:15:39 -04:00
Al
359a1efb03 [fix] Adding stdint.h include to most of the header files for portability 2015-08-07 02:43:44 -04:00
Al
0738a57caa [fix] restoring ctype.h include 2015-08-07 01:52:08 -04:00
Al
06d2e916a1 [fix] includes, matters on GCC/Linux 2015-08-07 01:51:34 -04:00
Al
ae9825b9f9 [build] Fixing data dir download in Automake file 2015-08-07 01:51:06 -04:00
Al
d7ebcd046e [fix] includes 2015-08-07 01:00:26 -04:00
Al
f246c2ee95 [api] Adding address component constants to libpostal.h, returning char ** instead of a cstring_array to simplify API/dependencies 2015-08-06 17:52:54 -04:00
Al
61d586fa1d [config] config.h=>libpostal_config.h so as not to conflict with autoconf 2015-08-06 17:50:55 -04:00
Al
2bedb695a2 [build] adding Automake file in src, including rule to download data dir tarball 2015-08-06 17:48:37 -04:00
Al
4b9f11eca5 [build] Main Automake file and modified version of Sparkey's Automake file 2015-08-06 02:14:33 -04:00
Al
fe078cff66 [build] Adding Autoconf file 2015-08-06 02:13:43 -04:00
Al
1d39916aaa [fix] Fixing warnings in unicode script data 2015-08-02 21:30:54 -06:00
Al
770ce4256f [expansion] Re-generating address expansion data file 2015-08-02 21:30:19 -06:00
Al
90cde298dd [dictionaries] condensed forms of sin numero in various languages 2015-08-02 21:19:55 -06:00
Al
753c6efb1d [api] Initial libpostal API, combining string normalization, transliteration, numex and address dictionaries 2015-08-02 21:16:18 -06:00
Al
b27030e39f [fix] tokenized trie search was skipping tokens in some cases 2015-08-02 14:36:21 -06:00
Al
3178eda501 [utils] string_contains_hyphen method 2015-08-02 14:35:18 -06:00
Al
46141a6c36 [normalize] Adding an option when normalizing tokens to split tokens of the form [\w]+[\.\-]?[\d]+ for cases like I35, CR123, R-66, RN.7, etc. where the alpha component is an expansion 2015-08-02 14:34:36 -06:00
Al
f10dd49c58 [expansion] NULL_CANONICAL_INDEX constant 2015-08-01 23:59:16 -06:00
Al
6bf563ca89 [dictionaries] Italian abbreviations for strada 2015-07-28 19:15:30 -04:00
Al
fe4789a665 [fix] compiler warnings 2015-07-28 19:14:00 -04:00
Al
551904d202 [normalize] cstring_array instead of string_tree for token-based normalization 2015-07-28 19:09:50 -04:00
Al
90d4da9e72 [geodb] Adding an is_canonical bit field to geodb trie values 2015-07-28 19:08:24 -04:00
Al
9bc902f575 [numex] LATIN_LANGUAGE_CODE constant for Roman numeral normalization 2015-07-28 18:12:12 -04:00
Al
df1410da8c [numex] Fixing numex parsing for lone stopwords and certain prefix matches that were getting mistakenly converted e.g. settembre => 7mbre 2015-07-28 18:11:23 -04:00
Al
a16f0dabcb [numex] Fixing hyphen-initial numeric phrases that end the string 2015-07-28 03:28:44 -04:00