Al
|
62017fd33d
|
[optimization] Using sparse updates in stochastic gradient descent. The update is decomposed into the gradient of the loss function (zero for features not observed in the current batch) and the gradient of the regularization term. In L2-regularized models, repeatedly applying the regularization gradient is equivalent to an exponential decay of the weights. Before computing the gradient for the current batch, we bring the weights up to date only for the features observed in that batch, then update only those values
|
2016-01-09 03:37:31 -05:00 |
|
Al
|
aa22db11b2
|
[math] Matrix arithmetic
|
2016-01-09 01:45:10 -05:00 |
|
Al
|
197b18f3cf
|
[fix] NULL check
|
2016-01-09 01:43:25 -05:00 |
|
Al
|
9c4b5ccbb1
|
[math] Adding array_{op}_times_scalar methods
|
2016-01-09 01:42:54 -05:00 |
|
Al
|
2f1e2139ca
|
[math] Unique columns as array for CSR sparse matrix
|
2016-01-09 01:40:26 -05:00 |
|
Al
|
023c04d78f
|
[classification] Pre-allocating memory in logistic regression trainer, storing last updated timestamps for sparse stochastic gradient descent and using the new gradient API
|
2016-01-09 01:39:24 -05:00 |
|
Al
|
562cc06eaf
|
[classification] Sparse version of logistic regression gradient which, given an array of the features/columns used in the input batch, only updates the gradient for those columns, even for the operations which would otherwise apply to the entire matrix (scaling by -1/m, regularization)
|
2016-01-09 01:33:33 -05:00 |
|
Al
|
5ca4bba1d5
|
[fix] Writing matrix dimension as 64-bit
|
2016-01-08 01:29:52 -05:00 |
|
Al
|
8f054eeeb1
|
[classification] Training structures for logistic regression and stochastic (minibatch) gradient descent update
|
2016-01-08 01:07:20 -05:00 |
|
Al
|
4acf10c3a4
|
[classification] Multinomial logistic regression, gradient and cost function
|
2016-01-08 01:03:09 -05:00 |
|
Al
|
8b70529711
|
[optimization] Stochastic gradient descent with gain schedule a la Leon Bottou
|
2016-01-08 00:54:17 -05:00 |
|
Al
|
6b164d263e
|
[math] Sparse matrix from dense
|
2016-01-08 00:48:57 -05:00 |
|
Al
|
ba8fc716df
|
[features] Functions for dealing with minibatches
|
2016-01-08 00:48:11 -05:00 |
|
Al
|
06638d2885
|
[fix] only strdup when necessary in feature counting functions
|
2016-01-08 00:46:41 -05:00 |
|
Al
|
31a3a2a3fa
|
[math] Matrix scalar arithmetic functions
|
2016-01-08 00:44:33 -05:00 |
|
Al
|
b6ce94166b
|
[sparse] Only increase size of sparse matrix on finalize row if it needs to be
|
2016-01-07 13:19:22 -05:00 |
|
Al
|
2e67afab09
|
[fix] adding functions to string_utils header
|
2016-01-06 23:03:16 -05:00 |
|
Al
|
a8b9a2c153
|
[fix] making *_hash_sort_keys_by_value static
|
2016-01-06 23:01:00 -05:00 |
|
Al
|
0d5cf0d6d7
|
[utils] char_array_cat_printf was forcing a doubling of the size of the buffer, which is bad if calling many times. Now only initiates a realloc if the char_array is almost full. Also adding cstring_array_from_strings which takes a list of char *s
|
2016-01-06 22:56:01 -05:00 |
|
Al
|
8c019998d7
|
[phrases] trie_num_keys
|
2016-01-05 22:02:15 -05:00 |
|
Al
|
22668945cb
|
[mv] Moving trie_new_from_hash to a module
|
2016-01-05 16:43:17 -05:00 |
|
Al
|
33e9a05ebf
|
[tokenization] is_whitespace
|
2016-01-05 16:40:35 -05:00 |
|
Al
|
6e1435ac48
|
[features] No copy versions of feature counts functions
|
2016-01-05 16:39:50 -05:00 |
|
Al
|
a740417cab
|
[utils] Adding hash sort by values for numeric types
|
2016-01-05 14:47:48 -05:00 |
|
Al
|
6ef7c90278
|
[fix] using string_equals, handles NULLs
|
2016-01-05 14:08:10 -05:00 |
|
Al
|
c0214d6023
|
[fix] free normalized string in address parser data set
|
2016-01-05 14:06:03 -05:00 |
|
Al
|
6a5ad96a17
|
[math] Adding vector sort and vector argsort to numeric vectors
|
2016-01-05 14:05:27 -05:00 |
|
Al
|
7aea79281e
|
[math] Floating point equality with relative epsilon comparisons
|
2016-01-02 15:39:49 -05:00 |
|
Al
|
780966a59b
|
[api] More spacing fixes and using language information in normalize string
|
2015-12-31 03:52:14 -05:00 |
|
Al
|
ff75c5cc50
|
[normalize] Adding normalize_string_languages method which can use additional transliterators
|
2015-12-31 03:50:36 -05:00 |
|
Al
|
9335d26fbd
|
[fix] spacing
|
2015-12-31 02:26:28 -05:00 |
|
Al
|
1b0567a881
|
[fix] Ubuntu build
|
2015-12-28 17:19:50 -05:00 |
|
Al
|
77ccd975c4
|
[fix] #endif
|
2015-12-28 17:03:12 -05:00 |
|
Al
|
d0b5985cb7
|
[build] Adding /usr/local/lib and /usr/local/include to sparkey build
|
2015-12-28 16:56:10 -05:00 |
|
Al
|
45b5e2dd6f
|
[fix] array_zero
|
2015-12-28 01:24:27 -05:00 |
|
Al
|
fb4c984f15
|
[math] sparse_matrix_new_shape
|
2015-12-28 01:20:23 -05:00 |
|
Al
|
72ad01cbc3
|
[features] Using a str=>double hashtable for feature counts
|
2015-12-28 01:18:49 -05:00 |
|
Al
|
e4dba2297d
|
[mv] Moving token type checking to header
|
2015-12-28 01:17:33 -05:00 |
|
Al
|
0fa1c2389c
|
[fix] Leak in expanding strings that have a separable prefix and suffix. Other than that, ran through 78 million expansions with no discernible memory issues
|
2015-12-26 17:19:59 -05:00 |
|
Al
|
deeb8f007e
|
[fix] Check for result.len > 0 in false start continuation numex parsing, plus additional safety check during replacement
|
2015-12-24 02:26:53 -05:00 |
|
Al
|
507dd631f8
|
[build] Adding json_encode.c to the address parser client sources
|
2015-12-23 19:37:28 -05:00 |
|
Al
|
5e6d24ff7e
|
[unicode] Upgrading to latest utf8proc from JuliaLang (Unicode 8)
|
2015-12-23 19:33:09 -05:00 |
|
Al
|
3fbb3c587a
|
[fix] using a char_array instead of copying the string in normalize_string
|
2015-12-23 19:21:54 -05:00 |
|
Al
|
2eea999692
|
[fix] Fixing false start continuations in numex parsing
|
2015-12-23 19:19:14 -05:00 |
|
Al
|
850d82de6e
|
[fix] In trie search, moving fall-off and tail checks inside the inner character loop, adding tail position as a separate variable from offset in the string
|
2015-12-23 19:16:43 -05:00 |
|
Al
|
19173d3a6e
|
[transliteration] In set match checks, use the current index, not current index - char_len
|
2015-12-23 13:12:30 -05:00 |
|
Al
|
e9e05bb929
|
[transliteration] Distinguishing between variables with numbers and backreferences in transliteration rules
|
2015-12-23 13:07:44 -05:00 |
|
Al
|
aaa1fc0387
|
[fix] Stepping through codepoints first then through chars in trie_search_prefixes_from_index (used in transliteration and numex)
|
2015-12-23 01:58:39 -05:00 |
|
Al
|
baa8e3cc3f
|
[fix] Compare the remaining part of the current UTF-8 character using simple string comparison, since it may be in the middle of a valid UTF-8 character
|
2015-12-21 20:34:15 -05:00 |
|
Al
|
ceda863e9f
|
[fix] Encode strings as JSON in address parser cli
|
2015-12-21 17:45:09 -05:00 |
|
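The sparse update described in commit 62017fd33d can be illustrated with a minimal sketch in C. It is not the code from that commit: the struct, the function names (`sparse_sgd_prepare`, `sparse_sgd_apply`, `sparse_sgd_finalize`) and the constant learning rate are all assumptions made for illustration. With a constant step size, each L2 regularization step multiplies a weight by `1 - learning_rate * lambda`, so a weight whose feature was unobserved for k batches can be caught up in one multiplication by that factor raised to the power k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for illustration; not the project's actual structs. */
typedef struct {
    double *weights;        /* one weight per feature */
    uint32_t *last_updated; /* step up to which each weight's decay is applied */
    size_t num_features;
    double learning_rate;   /* constant step size (an assumption of this sketch) */
    double lambda;          /* L2 regularization strength */
    uint32_t t;             /* number of batches processed so far */
} sparse_sgd_t;

/* base^exp for a non-negative integer exponent, by repeated squaring. */
static double pow_uint(double base, uint32_t exp) {
    double result = 1.0;
    while (exp > 0) {
        if (exp & 1) result *= base;
        base *= base;
        exp >>= 1;
    }
    return result;
}

/* Apply the regularization decay that weight j missed while unobserved. */
static void catch_up(sparse_sgd_t *s, size_t j) {
    assert(j < s->num_features);
    uint32_t skipped = s->t - s->last_updated[j];
    if (skipped > 0) {
        double decay = 1.0 - s->learning_rate * s->lambda;
        s->weights[j] *= pow_uint(decay, skipped);
        s->last_updated[j] = s->t;
    }
}

/* Call before computing the batch gradient: brings only the weights of the
   observed columns up to date. cols is assumed to hold unique indices. */
void sparse_sgd_prepare(sparse_sgd_t *s, const size_t *cols, size_t n) {
    for (size_t i = 0; i < n; i++) catch_up(s, cols[i]);
}

/* Call after sparse_sgd_prepare: grad[i] is the loss gradient for feature
   cols[i]. Only the observed weights are touched. */
void sparse_sgd_apply(sparse_sgd_t *s, const size_t *cols,
                      const double *grad, size_t n) {
    double decay = 1.0 - s->learning_rate * s->lambda;
    for (size_t i = 0; i < n; i++) {
        size_t j = cols[i];
        s->weights[j] = s->weights[j] * decay - s->learning_rate * grad[i];
        s->last_updated[j] = s->t + 1;
    }
    s->t++;
}

/* Call once after training so every weight reflects all pending decay. */
void sparse_sgd_finalize(sparse_sgd_t *s) {
    for (size_t j = 0; j < s->num_features; j++) catch_up(s, j);
}
```

A caller would run `sparse_sgd_prepare`, compute the gradient from the now-current weights, then run `sparse_sgd_apply`. Splitting the step in two keeps the gradient computation outside the sketch. A decaying learning rate would need the accumulated decay factor tracked per step rather than a single power.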