Al | b320aed9ac | [merge] merging master | 2017-01-13 19:58:49 -05:00
Al | e1f258171f | [fix] handle cstring_array_from_char_array where char_array is NULL or 0-length | 2017-01-13 16:52:41 -05:00
Al | 953a26e54e | [utils] char_array_add_vjoined to stay consistent (add_* methods NUL terminate) | 2017-01-09 16:10:07 -05:00
Al | 4ad3a52fe1 | [strings] fix lowercasing in string_utils.c | 2017-01-01 20:08:34 -05:00
Al | 7d6c85aeec | [fix] new string tree iterator, don't decrement permutations on rollovers | 2017-01-01 13:34:08 -05:00
Al | 1780c5e053 | [fix] moving enum | 2016-12-31 13:01:57 -05:00
Al | 475aa3dbfa | [strings] fixing and simplifying string tree iterator. This version is inspired by Python's itertools.product (itertoolsmodule.c has so many goodies) | 2016-12-31 03:22:27 -05:00
Al | 58b063b632 | [strings] making string_tree_iterator_done more meaningful (returns true if the iterator has no paths left to traverse) | 2016-12-31 00:54:36 -05:00
Al | 8978000320 | [strings] adding latest utf8proc, new functions for utf8_lower (instead of case folding) and utf8_upper, and a utf8_is_whitespace that takes things like tabs into account | 2016-12-31 00:52:12 -05:00
Al | 0284913aa7 | [utils] ignore initial separators when splitting on delimiter | 2016-12-26 04:14:20 -05:00
Al | 3ac2c93e1c | [utils] renaming char_array_append_vjoined to char_array_add_vjoined to follow convention that add_* calls NUL-terminate while append_* calls do not | 2016-12-18 15:26:58 -05:00
Al | 3939dd0ca6 | [fix] cstring_array_split calls | 2016-12-12 11:37:27 -05:00
Al | b1816e9b70 | [utils] Adding cstring_array_split_ignore_consecutive | 2016-12-12 11:37:27 -05:00
Al | b639fa5127 | [utils] string_replace also creates a copy | 2016-11-30 10:09:33 -08:00
Al | 89f6611c4e | [strings] string_trim makes a copy rather than modifying the pointer | 2016-11-28 15:06:07 -08:00
Al | 92e66fd60c | [utils] string_next_hyphen_index | 2016-08-16 12:49:52 -04:00
Al | b8d43dc601 | [fix] cstring_array_split calls | 2016-07-21 17:04:57 -04:00
Al | b664ab1cea | [utils] Adding cstring_array_split_ignore_consecutive | 2016-07-21 17:04:57 -04:00
Al | 98c395d34c | [numex] Concatenating a string of numeric expressions with no intervening tokens like Seventeen Eighty or Ten Oh Four | 2016-02-10 09:21:31 -05:00
Al | 7b300639f1 | [fix] Trie prefix search tail comparison | 2016-01-17 20:56:37 -05:00
Al | 0d5cf0d6d7 | [utils] char_array_cat_printf was forcing a doubling of the size of the buffer, which is bad if calling many times. Now only initiates a realloc if the char_array is almost full. Also adding cstring_array_from_strings which takes a list of char *s | 2016-01-06 22:56:01 -05:00
Al | d0aaff1482 | [utils] string_equals with NULL check | 2015-12-01 13:12:08 -05:00
Al | 40918812e2 | [normalize] Adding hyphen elimination as a string option (changes tokenization) | 2015-10-27 13:32:47 -04:00
Al | 6428c0ae20 | [utils] cstring_array_cat | 2015-10-03 16:00:13 -04:00
Al | 3fab0f984f | [fix] fixing some compiler warnings, using type-specific abs functions for vector_math | 2015-09-19 16:11:09 -04:00
Al | 35b9122a1a | [utils] inlining a few functions | 2015-09-10 16:33:54 -07:00
Al | 0ddf50cb5f | [utils] add to feature array with printf syntax | 2015-09-10 10:24:51 -07:00
Al | b3f89a207a | [utils] Version of string_split for single character delimiters which modifies the input string directly rather than creating (essentially) a copy | 2015-09-09 18:07:31 -07:00
Al | 9d2ca08fc2 | [utils] Adding _copy and _new_copy methods to vectors (the former copies data to a pre-allocated vector, the latter allocates a new vector) | 2015-09-06 21:01:26 -07:00
Al | a13e5117b5 | [utils] string_tree_num_strings method | 2015-08-10 17:46:37 -04:00
Al | 064b6b5898 | [utils] char_array_append_reversed for adding reversed strings without a malloc | 2015-08-10 16:10:05 -04:00
Al | 9b69d1f67a | [fix] Removing C++ checks from all but the main API functions | 2015-08-07 17:15:39 -04:00
Al | 3178eda501 | [utils] string_contains_hyphen method | 2015-08-02 14:35:18 -06:00
Al | 7aee159c0c | [utils] string_tree_num_tokens | 2015-07-27 12:36:34 -04:00
Al | b94526a27b | [utils] Making string_trim handle all kinds of UTF-8 whitespace/separators | 2015-07-27 01:55:46 -04:00
Al | 93042761ac | [fix] warnings in string_utils.c | 2015-07-26 23:36:03 -04:00
Al | a67ec44a08 | [utils] cstring_array_terminate, moving msgpack_utils to separate file | 2015-07-25 18:41:02 -04:00
Al | 2adaf475c2 | [utils] cstring_array (contiguous) to array of malloc'd strings | 2015-07-25 12:14:01 -04:00
Al | f713c53993 | [utils] Adding an option to char_array_add_joined to strip separators for path manipulation | 2015-07-16 03:49:00 -04:00
Al | d7f73e62f1 | [utils] Adding cstring_array_clear method | 2015-07-06 12:48:26 -04:00
Al | b58877ec6c | [utils] string_is_lower/string_is_upper method | 2015-07-01 14:49:22 -04:00
Al | a5dacf3d2b | [utils] Adding method to get a particular token alternative from a string tree | 2015-06-28 15:15:29 -04:00
Al | 82e85732c4 | [fix] Setting codepoint in utf8proc_iterate_reversed | 2015-06-25 17:20:55 -04:00
Al | bcee9832b3 | [utils] cstring_array_get_token=>cstring_array_get_string | 2015-06-25 10:05:35 -04:00
Al | 7dd772de0f | [fix] implementation of cstring_array_split | 2015-06-23 02:11:24 -05:00
Al | 8520df96c8 | [utils] utf8 comparison can handle a non-valid UTF-8 sequence e.g. for trie suffix comparison where we may be in the middle of a multi-byte character. Adding a standard utf8_common_prefix method | 2015-06-12 16:11:40 -04:00
Al | 3442b9ad92 | [utils] require at least one non-space/non-hyphen match in utf8_common_prefix_len_ignore_separators | 2015-06-12 11:19:37 -04:00
Al | ab5ea6d791 | [utils] Common prefix-style return value instead of a utf8 strcmp | 2015-06-11 10:59:51 -04:00
Al | aad5f3edd3 | [utils] UTF-8 lowercasing and string comparison, including a version which ignores dashes/spaces | 2015-06-10 18:27:14 -04:00
Al | 81be8e771e | [numex] regen data file. utf8_is_hyphen requires a character, all other methods use category | 2015-06-08 21:32:38 -04:00