4708 Commits

Author SHA1 Message Date
Al
64e62cac32 [openaddresses] adding Bogotá, Colombia 2017-02-18 10:13:31 -08:00
Al
4f128579d6 [openaddresses] adding Commerce City, CO and creating an alias for the simple unit regex for reuse 2017-02-17 14:07:00 -05:00
Al
b88487f633 [utils] string_replace_char does single byte/character replacement, new string_replace to do full string replacement, again using char_array for safety, string_replace_with_array function for memory reuse 2017-02-17 13:58:51 -05:00
Al
da856ea5c3 [parser] adding phrase features for category, unit, level, entrance, staircase, and po_box phrases from the libpostal dictionaries, excluding phrases which match the toponyms dictionary (e.g. US states that can also be found in street/venue names, useful for expansion but not here), if the current token is part of both an address dictionary phrase and a component phrase derived from the training data, use the longer of the two, or both if they are the same length 2017-02-17 03:00:48 -05:00
Al
5b616dfb57 [addresses] allowing neighborhood components to be passed in 2017-02-17 02:11:56 -05:00
Al
e7d8577ad7 [openaddresses] add city of San Luis Obispo 2017-02-16 16:00:23 -05:00
Al
d6281648dc [openaddresses] add Cumberland County, NC 2017-02-16 14:49:00 -05:00
Al
1631c25ad0 [openaddresses] add city of O'Fallon, IL 2017-02-16 14:48:40 -05:00
Al
4c4147f465 [openaddresses] add city of Scottsdale, AZ 2017-02-16 14:48:17 -05:00
Al
df76cde1e7 [openaddresses] adding Pickens County, SC 2017-02-16 03:34:49 -05:00
Al
c380b3e91b [parser] phrase search with address dictionaries should not use the language given at training time since it's not currently available at runtime (without pulling in the language classifier, which may be warranted at some point, especially if the model can be made smaller/sparser) 2017-02-15 22:32:30 -05:00
Al
a3e51db32d [api] include some of the new components in default address_components for the libpostal expansion API 2017-02-15 22:29:22 -05:00
Al
32fb483e96 [gazetteers] adding ADDRESS_PO_BOX component 2017-02-15 22:23:28 -05:00
Al
ba0ccc82a3 [fix] var name in address_parser_train 2017-02-15 22:22:33 -05:00
Al
0196fe8736 [utils] fixing key_type in hash_get, adding int64_double map 2017-02-15 22:20:36 -05:00
Al
be6f48f109 [fix] that didn't work, set log level to CRITICAL 2017-02-15 14:06:57 -05:00
Al
26bf617a06 [fix] prevent Shapely from logging to console 2017-02-15 14:00:51 -05:00
Al
a0b508caf6 [transliteration] adding no-args option for transliteration_rules script 2017-02-15 13:22:33 -05:00
Al
8abfa766fd [fix] paren 2017-02-15 02:26:18 -05:00
Al
06003dfbb0 [fix] lower probability of name:prefix 2017-02-14 18:57:31 -05:00
Al
92b34f6af4 [fix] var name 2017-02-14 18:53:53 -05:00
Al
ca79342636 [fix] config 2017-02-14 18:50:51 -05:00
Al
8eafc5730b [parser] adding long-context features which help classify the first token in the string by finding the relative positions of a) the first numeric token and b) the first street-level phrase like "Ave" or "Calle" 2017-02-14 18:42:51 -05:00
Al
08976c772e [neighborhoods] base parser config changes for new prefix/first_match options 2017-02-14 18:19:15 -05:00
Al
64673c2875 [neighborhoods] add neighborhoods that are not the top match occasionally 2017-02-14 18:17:48 -05:00
Al
b99e31ca17 [neighborhoods] add name:prefix in admin boundaries and neighborhoods (used often in e.g. Germany), use alternative/language keys as well 2017-02-14 18:07:13 -05:00
Al
614479aee1 [neighborhoods] don't add point with same name as existing OSM polygon 2017-02-14 16:40:08 -05:00
Al
854ac853a9 [fix] OSM neighborhoods index check 2017-02-14 03:53:22 -05:00
Al
a416a314fa [fix] var 2017-02-14 03:38:53 -05:00
Al
56f68e4399 [phrases] fixing trie suffix search 2017-02-14 03:36:29 -05:00
Al
5bbc0e15d7 [fix] poly.context 2017-02-14 03:09:02 -05:00
Al
1ee0f1fe0d [osm] mapping admin_level=11 to suburb in Germany, admin_level=10 to suburb in Berlin 2017-02-14 02:34:47 -05:00
Al
b9c24867d7 [fix] set component to suburb on OSM neighborhoods index 2017-02-14 02:19:53 -05:00
Al
072a838fde [neighborhoods] components are now pre-calculated by CTH index 2017-02-14 02:04:07 -05:00
Al
c91a0bdb91 [fix] rm 2017-02-14 01:56:24 -05:00
Al
949c10ab22 [fix] remove print 2017-02-14 01:53:54 -05:00
Al
738bd7b525 [neighborhoods] logging, moving OSM/CTH before Quattroshapes for easier testing 2017-02-14 01:52:59 -05:00
Al
67f69ce6ce [fix] move 2017-02-14 01:51:23 -05:00
Al
6d580f4c87 [osm] neighborhood polygon reader 2017-02-14 01:50:04 -05:00
Al
6c68d446a0 [neighborhoods] adding ClickThatHood config to whitelist/specify what kind of polygon is specified in each file. Adding OSM neighborhoods (ways/relations where place=neighbourhood to reduce ambiguity) as the highest priority, followed by CTH/OSM, CTH, Quattro/OSM, Quattro 2017-02-14 01:48:43 -05:00
Al
2003e08623 [osm] creating an OSM neighborhood boundaries data set for place=neighbourhood polygons only (place=suburb, etc. can be ambiguous) 2017-02-13 20:45:54 -05:00
Al
eff9280224 [boundaries] Amsterdam and Rotterdam listed as admin_level=10 in OSM, making exceptions 2017-02-13 16:06:18 -05:00
Al
2f4bcaeec2 [parser] address_parser_test memory cleanup, add print-errors option to print individual parser errors on held-out data 2017-02-12 16:05:11 -05:00
Al
b1e178b7b2 [fix] is_numeric_token includes IDEOGRAPHIC_NUMBER 2017-02-12 15:11:56 -05:00
Al
b2978f49ba [openaddresses] adding Newaygo County, MI and Scioto County, OH 2017-02-12 00:37:15 -05:00
Al
e569956944 [osm] remove postcode field if more than one is found 2017-02-11 03:52:46 -05:00
Al
9af4b1bd42 [openaddresses] fixing street requirement 2017-02-11 03:29:09 -05:00
Al
2dff6c8839 [fix] call 2017-02-11 02:16:55 -05:00
Al
081f023d60 [fix] name 2017-02-11 02:10:59 -05:00
Al
6705ebaffd [fix] import 2017-02-11 02:09:31 -05:00