Al | 505456d9d2 | [fix] removing unnecessary header | 2015-06-01 17:12:33 -04:00
Al | 080f382065 | [numex] Removing concatenated property from language struct as all numeric spellouts might be concatenated | 2015-06-01 17:12:07 -04:00
Al | a20b768237 | [numex] Russian numex rules (a start at least, might need a native speaker to review the RBNF transform in CLDR) | 2015-06-01 17:08:57 -04:00
Al | 05ffbffb23 | [numex] Latin numex rules i.e. Roman numerals, used for most languages | 2015-06-01 17:08:04 -04:00
Al | 028bb5a1aa | [numex] German numex rules | 2015-06-01 17:07:35 -04:00
Al | 9bd75cee23 | [numex] Romance language numex rules (Spanish, French, Italian, Portuguese) | 2015-06-01 17:07:23 -04:00
Al | 99aed992da | [numex] English numex rules | 2015-06-01 17:06:53 -04:00
Al | 920e15bd4d | [numex] Adding numex setup/IO methods | 2015-06-01 15:43:23 -04:00
Al | c0347a3431 | [numex] numex header and structs | 2015-06-01 15:41:34 -04:00
Al | b74fa0da99 | [config] Adding config header | 2015-06-01 15:40:59 -04:00
Al | 93172bd16d | [transliteration] New transliterator_scripts file | 2015-05-31 02:09:28 -04:00
Al | 0575984144 | [transliteration] New data file | 2015-05-31 02:08:26 -04:00
Al | 6ac4ff6021 | [transliteration] Adding reverse/bidirectional transforms e.g. for Katakana-Latin | 2015-05-31 02:07:36 -04:00
Al | 664d5e90db | [fix] Removing the stub comment and a few more random comments | 2015-05-29 20:10:44 -04:00
Al | 06318a6fab | [fix] logging code | 2015-05-29 20:08:49 -04:00
Al | 55568e9ffa | [fix] Removing commented out section | 2015-05-29 20:01:17 -04:00
Al | 583cadd44f | [transliteration] transliterate implementation from trie (need to build/save the tables first) | 2015-05-29 19:59:45 -04:00
Al | 6239c2fcfc | [transliteration] regenerated data file including InterIndic-Latin dependency | 2015-05-29 19:48:19 -04:00
Al | 9547c93a38 | [fix] InterIndic-Latin is an internal transliterator, but needed for most of the Indic languages. Also fixing the string lengths for HTML entity replacements | 2015-05-29 19:47:49 -04:00
Al | 8b56d63fde | [fix] only count non-set chars in parse_groups | 2015-05-29 19:42:05 -04:00
Al | a278cfd12c | [transliteration] Using revisit strings instead of keeping a backtrack count so we don't have to later map logical characters to the actual string, removing any duplicate keys in the table builder so that if any rules happen to overlap within a step, the first will take precedence | 2015-05-29 16:54:05 -04:00
Al | a9d5b91ac0 | [transliteration] Not counting repeat character in group capture | 2015-05-28 19:36:25 -04:00
Al | 0177fd4b13 | [fix] trie_search using proper length in utf8proc_iterate | 2015-05-27 16:08:19 -04:00
Al | ad8e92182c | [phrases] trie I/O using the uint APIs, fixes to trie_get_prefix_result_from_index | 2015-05-27 16:06:35 -04:00
Al | 897c29ccb8 | [fix] transliterate.h | 2015-05-27 16:04:18 -04:00
Al | 17f88c3adc | [utils] using unsigned ints in file_utils, adding doubles | 2015-05-27 16:03:36 -04:00
Al | 8ac8f83b7f | [utils] changing signature of utf8proc_iterate_reversed so it takes the same arguments as utf8proc_iterate for function pointer purposes | 2015-05-25 15:35:28 -04:00
Al | 26ff3292d2 | [fix] new script name, prefix result | 2015-05-23 21:41:11 -04:00
Al | 31cc2bb5d1 | [fix] merging repeat codepoints in trie builder | 2015-05-22 22:45:23 -04:00
Al | c00ecf6ea8 | [fix] minimizing c* into (c|'')+, using empty transition instead of zero-length string | 2015-05-22 18:11:54 -04:00
Al | b2d15b29cf | [fix] greek_latin_ungegn => greek-latin-ungegn | 2015-05-22 09:52:08 -04:00
Al | 27171e068d | [phrases] constant for NULL prefix results | 2015-05-22 09:08:07 -04:00
Al | cb14e5eef1 | [phrases] trie_get_prefix_from_index takes an optional tail position | 2015-05-21 06:16:14 -04:00
Al | 91ccdf6f7b | [phrases] trie_get_prefix_* methods return a struct including tail position | 2015-05-21 05:38:21 -04:00
Al | 395fbcb8b5 | [fix] get_prefix on tries searches tail as well | 2015-05-21 05:22:44 -04:00
Al | e84f3d93d2 | [fix] get_prefix on tries searches tail as well | 2015-05-20 20:57:14 -04:00
Al | c9ff3f278f | [transliteration] new transform data file | 2015-05-20 14:45:16 -04:00
Al | d65f7747f0 | [transliteration] Adding html escapes as the first step in the Latin-ASCII transformation | 2015-05-20 14:44:55 -04:00
Al | 1fee0a3e35 | [phrases] separating get_data_node from tail_match for tries | 2015-05-20 13:51:04 -04:00
Al | bfb9aa21a1 | [fix] unused var | 2015-05-19 18:04:06 -04:00
Al | 3d25378456 | [transliteration] fixing a few warnings | 2015-05-19 18:03:36 -04:00
Al | fdf988cb27 | [phrases] adding a public get_data_node method for tries | 2015-05-19 18:02:29 -04:00
Al | 9d309ca9d3 | [fix] moving constant | 2015-05-18 14:25:21 -04:00
Al | eecee39904 | [fix] giving constant trie node names more specificity | 2015-05-18 14:24:39 -04:00
Al | c66f6f0fbe | [transliteration] adding begin set token for regex character sets and fixing off-by-one in concatenated trie keys | 2015-05-18 14:00:14 -04:00
Al | 3c1e5c0471 | [transliteration] new data file with the escaped German transliterations | 2015-05-18 13:57:45 -04:00
Al | 58571f70cc | [utils] adding a boolean flag on string tree iterators for single path trees | 2015-05-18 13:57:11 -04:00
Al | 4694371cdc | [fix] unicode escaping the German transliterations | 2015-05-18 13:55:57 -04:00
Al | 7eaa94d2fb | [transliteration] new data file | 2015-05-17 18:31:52 -04:00
Al | e25f039ee4 | [transliteration] Escaped single quotes in rules + ignoring rules with codepoints > \uffff | 2015-05-17 18:31:35 -04:00