Commit Graph

30 Commits

Author SHA1 Message Date
Al 1e295ea8e9 [dictionaries] Making new component for near/nearby prepositions 2016-06-01 15:32:23 -04:00
Al bac86be6a3 [dictionaries] Adding new dictionary types to generator script 2016-05-28 17:16:43 -04:00
Al 9f8ab2d967 [gazetteers] Street and synonym dictionary for catching other abbreviations that occur in street names 2016-05-22 03:22:05 -04:00
Al b647842bed [fix] var name for error case 2016-05-21 02:11:00 -04:00
Al 689d163e08 [chains] Adding chains gazetteer 2016-05-20 14:07:50 -04:00
Al 3c750a868e [phrases] Using safe_encode/safe_decode as default trie serializer/deserializer 2016-05-02 15:45:39 -04:00
Al 1c6844f8f3 [fix] six.u 2016-04-28 17:50:25 -04:00
Al 9088ba6df6 [fix] token_types.PHRASE 2016-04-28 17:21:58 -04:00
Al 1266f5e9b5 [gazetteers] moving PHRASE to a token type 2016-04-27 15:11:38 -04:00
Al 1eeda65cfd [dictionaries] /house_number/house_numbers/ 2016-04-20 15:57:12 -04:00
Al 028dbacc87 [dictionaries] making entrances/postcodes plural for consistency 2016-04-15 01:10:03 -04:00
Al 883ef2ec56 [dictionaries] Moving intersections to cross streets 2016-04-14 17:53:45 -04:00
Al 5850793768 [expansion] Add postcode dictionary to gazetteer types 2016-04-14 14:33:02 -04:00
Al 36b3d515ad [expansion] Modifying the Python gazetteers to use new dictionaries API 2016-04-14 14:17:09 -04:00
Al 2ff4940e36 [expansion] Adding number and intersections to dictionary types 2016-04-14 14:15:33 -04:00
Al 49b02796c0 [addresses] Adding abbreviations as a separate module so it can be used with multiple data sets 2016-04-14 03:10:01 -04:00
Al d38de71854 [dictionaries] encapsulating reading address dictionaries so it's easy to implement sampling for the address training data 2016-04-08 18:12:30 -04:00
Al 998c774405 [fix] removing init_gazetteers, doing it at the module level 2016-03-28 17:42:21 -04:00
Al 3c504414bc [dictionaries] Adding dictionary type enums to the generator script 2016-03-28 17:41:43 -04:00
Al 18e2c7519e [fix] Absolute dir check in generating expansion data files 2016-03-13 23:23:46 -04:00
Al 1003832b9c [fix] README should not be included in building address dictionaries 2016-03-09 11:18:19 -05:00
Al 52ebc9fc46 [fix] Paths relative to the current file in address_dictionaries.py so it can be run from anywhere 2016-02-24 13:10:44 -05:00
Al b22646ee30 [mv] Moving gazetteers into their own module 2016-01-22 03:15:56 -05:00
Al 35db855819 [fix] canonical index in address expansion data, should be -1 for all canonical phrases 2015-12-08 15:09:51 -05:00
Al a5ce1f12dd [fix] stdint header in address expansion rule generation script 2015-08-08 23:28:11 -04:00
Al b27af13f8a [expansion] Adding an array of dictionaries to each (phrase, canonical) pair 2015-07-22 20:24:14 -04:00
Al 64a63fdf51 [mv] Moving all repo data files to a resources dir, data is only for runtime files 2015-07-21 18:11:36 -04:00
Al 7f67ed7dc0 [fix] less ambiguous variable name in the generated expansions data file 2015-07-20 02:58:26 -04:00
Al b9103a39fa [expansion] Moving filename=>dictionary type mapping to the Python generation script and validating there 2015-07-16 03:51:11 -04:00
Al f181c04e7a [expansion] expansion rule structs and Python script to generate rules from dictionaries tree. Note that a canonical_index of -1 indicates that a given phrase is the canonical (saves space) 2015-07-16 02:49:53 -04:00