Commit Graph

806 Commits

Author | SHA1 | Message | Date
Al | 134cf616d6 | [osm] Using street for language disambiguation in training data | 2015-09-21 04:09:15 -04:00
Al | ccac4a5a90 | [fix] package directory | 2015-09-21 04:01:36 -04:00
Al | 5f912ddcd3 | [fix] std=c99 | 2015-09-21 03:25:32 -04:00
Al | 5b2fd0be50 | [fix] pytokenize compilation on Ubuntu/gcc | 2015-09-21 03:24:14 -04:00
Al | cffa5a4a20 | [fix] stdint include | 2015-09-20 20:10:47 -04:00
Al | 25b3338600 | [setup] setup.py for pypostal so it can be installed from the Github url | 2015-09-20 20:07:59 -04:00
Al | 84cf21df88 | [osm] Separating address formatter into its own module, adding some documentation of the various training sets with examples | 2015-09-20 20:05:46 -04:00
Al | 5485ea2197 | [python] Adding initial pypostal bindings for tokenize so we can remove address_normalizer dependency. Not tested on Python 3. | 2015-09-20 14:59:39 -04:00
Al | 3fab0f984f | [fix] fixing some compiler warnings, using type-specific abs functions for vector_math | 2015-09-19 16:11:09 -04:00
Al | 6731395ca0 | [osm] Separating tagged from untagged output | 2015-09-19 14:11:47 -04:00
Al | 2940cc15b8 | [fix] tokenized string destroy frees original string | 2015-09-19 01:40:41 -04:00
Al | 2b13871341 | [constants] max country code length | 2015-09-19 01:39:58 -04:00
Al | 0396823772 | [fix] geodb path separator | 2015-09-19 01:39:31 -04:00
Al | 17cfdb0625 | [fix] adding char_array_append_* methods to header | 2015-09-18 13:19:42 -04:00
Al | f2f7db92ff | [fix] phrases | 2015-09-18 13:19:18 -04:00
Al | b74e92adad | [fix] include | 2015-09-18 13:18:49 -04:00
Al | 2a869894d9 | [fix] geodb | 2015-09-18 13:18:26 -04:00
Al | 9e9131bda0 | [parser] Averaged perceptron tagger | 2015-09-17 05:51:24 -04:00
Al | 8a86f7ec64 | [parser] Adding context struct to feature function | 2015-09-17 05:48:00 -04:00
Al | 87ed7d9a0f | [geodb] Adding trie search methods for finding geodb phrases | 2015-09-16 22:11:10 -04:00
Al | e62c75b9c6 | [phrases] Adding _with_phrases versions of address dictionary methods for pre-allocated phrases | 2015-09-16 21:24:28 -04:00
Al | 23103a21d4 | [phrases] Adding with_phrases versions of trie search methods for pre-allocated phrases | 2015-09-16 21:23:34 -04:00
Al | d5ec005787 | [transliteration] Similar init method for transliteration | 2015-09-16 21:14:02 -04:00
Al | b11362ab98 | [numex] using module init method for building, otherwise passing NULL path uses the default path | 2015-09-16 21:13:05 -04:00
Al | 3cba2e8df3 | [api] Using default setup methods for submodules in libpostal setup | 2015-09-15 14:01:33 -04:00
Al | e122824448 | [expansion] Adding the ability to search address dictionary phrases with a NULL language, will return phrases in any language | 2015-09-15 14:00:26 -04:00
Al | c47ff1b113 | [utils] Adding source string to tokenized_string struct | 2015-09-15 13:21:51 -04:00
Al | b2f690b6f6 | [api] Error logging if modules can't be found | 2015-09-15 13:21:15 -04:00
Al | 9de3029dd3 | [parser] Averaged perceptron training does full examples (greedily). During training, features are a hashtable, sorted and converted to a trie during finalize | 2015-09-14 17:38:45 -04:00
Al | a5b5f80b04 | [fix] new_copy | 2015-09-14 16:50:23 -04:00
Al | 3ea6358f77 | [fix] vector zeros allocation | 2015-09-14 16:50:08 -04:00
Al | c21f61b9b4 | [parser] Default address parser path | 2015-09-11 15:05:38 -07:00
Al | 32c180528f | [tokens] Adding a copy_tokens option for tokenized_string | 2015-09-11 15:04:29 -07:00
Al | 9ce658b7a3 | [collections] Adding string_array for an array of char pointers | 2015-09-10 16:34:16 -07:00
Al | 35b9122a1a | [utils] inlining a few functions | 2015-09-10 16:33:54 -07:00
Al | 35f1c02caf | [polygons] Reducing simplify tolerance for language polys now that regional languages are handled separately | 2015-09-10 12:44:13 -07:00
Al | 440a8158b6 | [polygons] Adding in country languages for regional polygons without a default language | 2015-09-10 12:34:26 -07:00
Al | 22c16b43cf | [languages] Italian is also the regional default in Valle D'Aosta and Trentino-Alto Adige | 2015-09-10 11:09:13 -07:00
Al | fca7f21b1d | [polygons] Making simplify_tolerance and preserve_topology for polygon simplification configurable per class | 2015-09-10 11:06:18 -07:00
Al | 6a5b01b51b | [parser] Averaged perceptron training | 2015-09-10 10:26:24 -07:00
Al | 0ddf50cb5f | [utils] add to feature array with printf syntax | 2015-09-10 10:24:51 -07:00
Al | b3f89a207a | [utils] Version of string_split for single character delimiters which modifies the input string directly rather than creating (essentially) a copy | 2015-09-09 18:07:31 -07:00
Al | c1da2fa94b | [dictionaries] Adding 'Rang' to French dictionaries | 2015-09-09 17:21:26 -07:00
Al | b85fe50fad | [osm] Training data for toponyms only cares about valid languages for name field | 2015-09-08 16:38:05 -07:00
Al | 607a607b71 | [doc] documentation fix for averaged perceptron | 2015-09-08 16:37:23 -07:00
Al | c80d8b8067 | [parsing] Averaged perceptron model data structure for storing the finalized, averaged, sparse weights | 2015-09-08 12:42:54 -07:00
Al | 8d642b45b9 | [fix] trie was returning early on add_at_index and not incrementing the num_keys | 2015-09-08 11:41:46 -07:00
Al | e566063343 | [osm] Doing an all-to-nodes conversion and an additional filter on the borders data set | 2015-09-08 09:18:08 -07:00
Al | ae7e30634b | [features] Adding counter/bag-of-words representation of features | 2015-09-08 00:17:26 -07:00
Al | 49d389b9d8 | [refactor] changing names in int-valued hash tables | 2015-09-08 00:15:14 -07:00