Al | 5bbf71ccbb | [transliteration] Using breadth-first search for tracking dependencies between transforms, removing Han-Spacedhan since our tokenizer does the equivalent already | 2015-05-12 18:57:57 -04:00
Al | d5f9d8a29a | [mv] unicode_scripts => unicode_properties | 2015-05-12 12:14:59 -04:00
Al | 3814af52ec | [transliteration] Python script now implements the full TR-35 spec, including filter rules, which cuts down significantly on the size of the data file and complexity of generating the trie | 2015-05-12 12:10:15 -04:00
Al | fe044cebef | [transliteration] char set mapping for some of the more complicated sets found in CLDR | 2015-05-10 18:34:53 -04:00
Al | 2a69488f9b | [fix] for transliteration rules, allowing the parsing of set differences and arbitrarily nested character set expressions, using non-NUL byte for the empty transition. Adding resulting data file. | 2015-05-08 17:14:26 -04:00
Al | 10ebaf147a | [transliteration] literal ^ and $ escaped | 2015-05-01 19:16:36 -04:00
Al | ff851a464c | [fix] escaping curly braces for regex compilation | 2015-04-30 13:27:17 -04:00
Al | fa43abd8d9 | [transliteration] For ruleset steps in transliteration, the name is just the step number, which can be appended to the trie as part of the key | 2015-04-29 14:31:15 -04:00
Al | 1c25238af7 | [fix] string lengths on the various transliteration rules | 2015-04-27 13:51:35 -04:00
Al | 6ebea11640 | [transliteration] fixing transliteration rules, fixing escape characters, adding sizes to all the strings as they may have null characters | 2015-04-26 19:47:54 -04:00
Al | be29874f13 | [transliteration] Parser for CLDR transforms to generate (simple) C transform rules | 2015-04-25 15:42:21 -04:00