Commit Graph

758 Commits

Author SHA1 Message Date
Al ae7e30634b [features] Adding counter/bag-of-words representation of features 2015-09-08 00:17:26 -07:00
Al 49d389b9d8 [refactor] changing names in int-valued hash tables 2015-09-08 00:15:14 -07:00
Al 2fffd76af8 [fix] typo 2015-09-07 23:58:34 -07:00
Al aa454c4430 [fix] removing char_array_copy from header 2015-09-07 23:58:05 -07:00
Al 3fd6552b44 [fix] void not void * in vector *_copy 2015-09-07 23:57:44 -07:00
Al cddffdb65f [math] Adding column and row sums to sparse matrices 2015-09-07 00:34:00 -07:00
Al 8525529968 [osm] Not requiring qualified name tags to process OSM toponyms 2015-09-06 21:03:01 -07:00
Al 9d2ca08fc2 [utils] Adding _copy and _new_copy methods to vectors (the former copies data to a pre-allocated vector, the latter allocates a new vector) 2015-09-06 21:01:26 -07:00
Al 49fe504201 [math] Matrix get value at row, column index 2015-09-06 12:37:10 -07:00
Al ec3ab7234a [utils] Adding index to cstring_array_foreach, similar to Python's enumerate 2015-09-04 19:34:06 -04:00
Al df20e2cbc0 [osm] Including toponyms in the training data for countries where the unqualified place names can be assumed to be examples of a given language 2015-09-04 14:13:33 -04:00
Al 17fcfa8b59 [fix] adding house to ignore keys rather than aliasing it 2015-09-04 12:40:08 -04:00
Al d64a27bc57 [osm] Converting relations to nodes in borders training data 2015-09-04 12:32:25 -04:00
Al 168b7f59da [fix] default indices in strip_component 2015-09-04 12:29:47 -04:00
Al 64db63e3eb [osm] Removing house tag 2015-09-04 12:23:47 -04:00
Al 6a20ce5e85 [language_id] Adding formatted addresses and toponyms to language training data 2015-09-04 01:46:49 -04:00
Al 4ebdca0ea7 [fix] var 2015-09-03 21:01:20 -04:00
Al 8345afbcd0 [fix] exclude country toponyms where the default languages is well represented 2015-09-03 20:56:58 -04:00
Al 20bb191624 [fix] chaining 2015-09-03 20:52:00 -04:00
Al e7cf5000fe [fix] Exclude polygons with > 1 regional language 2015-09-03 20:48:04 -04:00
Al 9a9530c1b9 [fix] unqualified names 2015-09-03 20:37:22 -04:00
Al a5fdd911d8 [fix] only use name key for default names 2015-09-03 20:35:08 -04:00
Al d8e1432533 [osm] Adding unqualified names in single-language countries 2015-09-03 20:31:49 -04:00
Al d13d4d7d28 [dictionaries] Adding English gazetteers as non-default to Georgia 2015-09-03 20:25:42 -04:00
Al b15d2d70aa [fix] top language 2015-09-03 20:09:46 -04:00
Al 44bf94a158 [osm] Better borders training data set (only need the metadata, not the polygons) 2015-09-03 20:09:03 -04:00
Al 55af9b0a0c [fix] OSM address tagged training data formatting 2015-09-03 18:35:19 -04:00
Al c6bfc0e021 [osm] Postponing punctuation stripping until after address template rendering 2015-09-03 18:13:41 -04:00
Al d54fb25e45 [osm] don't bother with the R-tree check if there are no name:* tags in border data set 2015-09-03 17:54:40 -04:00
Al 33af61095b [fix] var 2015-09-03 17:49:52 -04:00
Al 294101ad80 [osm] Treating components that are all punctuation as blank in address parsing (e.g. a single comma) 2015-09-03 17:46:57 -04:00
Al e1e5c16637 [osm] Not adding unqualified name tags to toponym data set, throwing out a few cases of language ambiguity 2015-09-03 16:50:30 -04:00
Al 040a26a6f2 [fix] import 2015-09-03 13:54:23 -04:00
Al 7787427c58 [fix] typo 2015-09-03 13:53:18 -04:00
Al 23633e95dd [osm] Only adding country default language toponyms to training data 2015-09-03 13:44:41 -04:00
Al 11c01f64d2 [osm] OrderedDict of attrs in OSM training data 2015-09-03 11:11:18 -04:00
Al 27eb4e4aed [osm] Adding a toponym language training set using planet-borders.osm (all admin borders) 2015-09-03 10:19:11 -04:00
Al db57855c95 [osm] Switching formatter repo to the OpenVenues fork, with fixes and several dozen new countries added 2015-09-03 10:06:54 -04:00
Al a916668f28 [i18n] Local file for ISO 15924 2015-09-01 23:58:36 -04:00
Al ee4d73c65d [math] sparse matrix I/O methods 2015-09-01 00:29:11 -04:00
Al a8f6617294 [phrases] Adding num_keys attribute to trie 2015-08-31 21:41:34 -04:00
Al aac5b37e76 [fix] Removing default dirent include 2015-08-31 21:38:29 -04:00
Al bb50c7ea2c [math] Adding sigmoid and softmax functions 2015-08-31 21:04:21 -04:00
Al a090a22bca [math] Adding compressed sparse row (CSR) format sparse matrix, designed for dynamic construction, just the methods needed for logistic regression for now i.e. no sparse dot products 2015-08-31 16:42:41 -04:00
Al 0f617454d3 [math] Dense matrices 2015-08-31 14:57:11 -04:00
Al 0ee72b8dfb [math] can only use memset for *_array_new_zeros 2015-08-31 14:44:43 -04:00
Al c566eaecf1 [dictionaries] Rebuilding address expansion data and uploading new files to S3 2015-08-31 14:33:28 -04:00
Al 789150ae33 [math] Using regular C arrays instead of vectors for vector_math.h 2015-08-30 02:41:31 -04:00
Al 07b0bed602 [math] Only float vectors have *_array_log, *_array_exp, etc. 2015-08-26 17:58:07 -04:00
Al a2ec8001b0 [osm] Removing postal code keys in formatted language training data 2015-08-24 14:08:36 -04:00