Al
|
ca746304e3
|
[utils] Adding a few methods to string_utils for finding utf8proc category groups
|
2015-06-04 13:20:14 -04:00 |
|
Al
|
eac7a296ba
|
[numex] New numex data file including top 15 languages in OSM
|
2015-06-04 11:55:07 -04:00 |
|
Al
|
d98c535c52
|
[numex] Adding ordinal indicator to enum
|
2015-06-04 11:25:24 -04:00 |
|
Al
|
3cb8b2d297
|
[numex] trie builder adding a separate suffix-based namespace for looking up ordinal indicators
|
2015-06-04 03:17:03 -04:00 |
|
Al
|
7d3ef39463
|
[numex] struct/method changes for new ordinal indicators
|
2015-06-04 03:15:51 -04:00 |
|
Al
|
65abde908b
|
[numex] New numex data file
|
2015-06-04 03:10:00 -04:00 |
|
Al
|
3d95875a11
|
[phrases] trie_add_len
|
2015-06-04 02:41:48 -04:00 |
|
Al
|
fa784677f2
|
[phrases] trie_add_suffix_at_index method
|
2015-06-04 02:30:53 -04:00 |
|
Al
|
9bdf118423
|
[transliteration] Fix to transliteration in cases where the pre/post context doesn't match and we fall back to the no-context match
|
2015-06-03 22:58:29 -04:00 |
|
Al
|
48d2ca31c4
|
[transliteration] New generated data file with the German/Scandinavian additions
|
2015-06-03 22:56:50 -04:00 |
|
Al
|
760714a234
|
[fix] warnings in transliterate.c
|
2015-06-03 19:29:35 -04:00 |
|
Al
|
7dcb4bf6f4
|
[numex] correct signature
|
2015-06-02 16:08:25 -04:00 |
|
Al
|
93d65d0186
|
[numex] numex table builder, fix to constant
|
2015-06-02 13:57:34 -04:00 |
|
Al
|
a44997c71c
|
[fix] new generated numex data file
|
2015-06-02 13:45:06 -04:00 |
|
Al
|
2d5d854754
|
[fix] compilation/warnings
|
2015-06-02 13:43:55 -04:00 |
|
Al
|
208366af98
|
[fix] removing stopwords index
|
2015-06-02 12:43:48 -04:00 |
|
Al
|
49816382c1
|
[numex] New generated data file
|
2015-06-02 12:37:39 -04:00 |
|
Al
|
9d0d83bc14
|
[numex] adding stopword rules with the regular numex rules
|
2015-06-02 12:37:22 -04:00 |
|
Al
|
816a0408ab
|
[numex] numex_rule.h
|
2015-06-02 12:30:56 -04:00 |
|
Al
|
8ef3a50b79
|
[numex] Initial generated numex data file
|
2015-06-02 12:28:28 -04:00 |
|
Al
|
4ad978f22c
|
[numex] Using the new representation for generated data
|
2015-06-02 12:28:07 -04:00 |
|
Al
|
958c219b88
|
[utils] constants.h
|
2015-06-02 12:26:19 -04:00 |
|
Al
|
505456d9d2
|
[fix] removing unnecessary header
|
2015-06-01 17:12:33 -04:00 |
|
Al
|
080f382065
|
[numex] Removing concatenated property from language struct as all numeric spellouts might be concatenated
|
2015-06-01 17:12:07 -04:00 |
|
Al
|
920e15bd4d
|
[numex] Adding numex setup/IO methods
|
2015-06-01 15:43:23 -04:00 |
|
Al
|
c0347a3431
|
[numex] numex header and structs
|
2015-06-01 15:41:34 -04:00 |
|
Al
|
b74fa0da99
|
[config] Adding config header
|
2015-06-01 15:40:59 -04:00 |
|
Al
|
93172bd16d
|
[transliteration] New transliterator_scripts file
|
2015-05-31 02:09:28 -04:00 |
|
Al
|
0575984144
|
[transliteration] New data file
|
2015-05-31 02:08:26 -04:00 |
|
Al
|
664d5e90db
|
[fix] Removing the stub comment and a few more random comments
|
2015-05-29 20:10:44 -04:00 |
|
Al
|
06318a6fab
|
[fix] logging code
|
2015-05-29 20:08:49 -04:00 |
|
Al
|
55568e9ffa
|
[fix] Removing commented out section
|
2015-05-29 20:01:17 -04:00 |
|
Al
|
583cadd44f
|
[transliteration] transliterate implementation from trie (need to build/save the tables first)
|
2015-05-29 19:59:45 -04:00 |
|
Al
|
6239c2fcfc
|
[transliteration] regenerated data file including InterIndic-Latin dependency
|
2015-05-29 19:48:19 -04:00 |
|
Al
|
8b56d63fde
|
[fix] only count non-set chars in parse_groups
|
2015-05-29 19:42:05 -04:00 |
|
Al
|
a278cfd12c
|
[transliteration] Using revisit strings instead of keeping a backtrack count so we don't have to later map logical characters to the actual string; removing any duplicate keys in the table builder so that if any rules happen to overlap within a step, the first will take precedence
|
2015-05-29 16:54:05 -04:00 |
|
Al
|
a9d5b91ac0
|
[transliteration] Not counting repeat character in group capture
|
2015-05-28 19:36:25 -04:00 |
|
Al
|
0177fd4b13
|
[fix] trie_search using proper length in utf8proc_iterate
|
2015-05-27 16:08:19 -04:00 |
|
Al
|
ad8e92182c
|
[phrases] trie I/O using the uint APIs, fixes to trie_get_prefix_result_from_index
|
2015-05-27 16:06:35 -04:00 |
|
Al
|
897c29ccb8
|
[fix] transliterate.h
|
2015-05-27 16:04:18 -04:00 |
|
Al
|
17f88c3adc
|
[utils] using unsigned ints in file_utils, adding doubles
|
2015-05-27 16:03:36 -04:00 |
|
Al
|
8ac8f83b7f
|
[utils] changing signature of utf8proc_iterate_reversed so it takes the same arguments as utf8proc_iterate for function pointer purposes
|
2015-05-25 15:35:28 -04:00 |
|
Al
|
26ff3292d2
|
[fix] new script name, prefix result
|
2015-05-23 21:41:11 -04:00 |
|
Al
|
31cc2bb5d1
|
[fix] merging repeat codepoints in trie builder
|
2015-05-22 22:45:23 -04:00 |
|
Al
|
c00ecf6ea8
|
[fix] minimizing c* into (c|'')+, using empty transition instead of zero-length string
|
2015-05-22 18:11:54 -04:00 |
|
Al
|
b2d15b29cf
|
[fix] greek_latin_ungegn => greek-latin-ungegn
|
2015-05-22 09:52:08 -04:00 |
|
Al
|
27171e068d
|
[phrases] constant for NULL prefix results
|
2015-05-22 09:08:07 -04:00 |
|
Al
|
cb14e5eef1
|
[phrases] trie_get_prefix_from_index takes an optional tail position
|
2015-05-21 06:16:14 -04:00 |
|
Al
|
91ccdf6f7b
|
[phrases] trie_get_prefix_* methods return a struct including tail position
|
2015-05-21 05:38:21 -04:00 |
|
Al
|
395fbcb8b5
|
[fix] get_prefix on tries searches tail as well
|
2015-05-21 05:22:44 -04:00 |
|