Commit Graph

21 Commits

Author | SHA1 | Message | Date
Al | b320aed9ac | [merge] merging master | 2017-01-13 19:58:49 -05:00
Al | a3506131fe | [build] adding libpostal_setup_datadir, libpostal_setup_parser_datadir, libpostal_setup_language_classifier_datadir functions for configuring the datadir at runtime | 2017-01-09 16:11:26 -05:00
Al | d8d3840700 | [transliteration] constant for the html-escape transliterator | 2017-01-02 00:40:12 -05:00
Al | 2644fed18f | [transliteration] Adding LATIN_ASCII_SIMPLE constant to transliterate.h | 2016-08-21 19:42:10 -04:00
Al | 1a1d74785c | [fix] Compiler warnings for casts/printf | 2015-10-26 18:52:18 -04:00
Al | d5ec005787 | [transliteration] Similar init method for transliteration | 2015-09-16 21:14:02 -04:00
Al | 51572d6575 | [phrases] Changing prefix/suffix chars so both are control characters and neither is the NUL-byte. Modifying transliteration special characters accordingly | 2015-08-10 16:01:22 -04:00
Al | cd0f95f9e2 | [fix] making transliteration path relative to data dir | 2015-08-08 21:06:02 -04:00
Al | 359a1efb03 | [fix] Adding stdint.h include to most of the header files for portability | 2015-08-07 02:43:44 -04:00
Al | fa643f7a3a | [utf8] Moving language length constant | 2015-06-30 19:17:20 -04:00
Al | 246237c1f1 | [transliteration] Adding a get_transliteration_table() to foreach_transliterator macro since it lives in the header | 2015-06-28 15:14:49 -04:00
Al | bcee9832b3 | [utils] cstring_array_get_token=>cstring_array_get_string | 2015-06-25 10:05:35 -04:00
Al | c3143e5291 | [transliteration] Adding structs/header stuff for transliterator lookup by script/language | 2015-06-23 15:34:38 -05:00
Al | 2e54ca3575 | [transliteration] including script data file, adding len to transliterate API for tokenized transliteration | 2015-06-21 05:42:20 -05:00
Al | 4ad978f22c | [numex] Using the new representation for generated data | 2015-06-02 12:28:07 -04:00
Al | 505456d9d2 | [fix] removing unnecessary header | 2015-06-01 17:12:33 -04:00
Al | a278cfd12c | [transliteration] Using revisit strings instead of keeping a backtrack count so we don't have to later map logical characters to the actual string, removing any duplicate keys in the table builder so that if any rules happen to overlap within a step, the first will take precedence | 2015-05-29 16:54:05 -04:00
Al | 897c29ccb8 | [fix] transliterate.h | 2015-05-27 16:04:18 -04:00
Al | 31cc2bb5d1 | [fix] merging repeat codepoints in trie builder | 2015-05-22 22:45:23 -04:00
Al | 1348cc8906 | [transliteration] Switching the begin/end set chars | 2015-05-17 12:02:46 -04:00
Al | b983a83a89 | [transliteration] transliteration struct definitions, memory allocation, builder methods and I/O, stubbing transliterate method for the moment | 2015-05-16 23:23:25 -04:00