4314 Commits

Author SHA1 Message Date
Al
82b26117aa [fix] name comparison in neighborhoods index 2016-12-18 15:27:21 -05:00
Al
3ac2c93e1c [utils] renaming char_array_append_vjoined to char_array_add_vjoined to follow the convention that add_* calls NUL-terminate while append_* calls do not 2016-12-18 15:26:58 -05:00
Al
8322e98ad3 [fix] var name II 2016-12-18 11:42:16 -05:00
Al
0c55bc3bb8 [fix] var name 2016-12-18 11:41:00 -05:00
Al
e5657c5612 [fix] putting the neighborhoods check after the dupe threshold check, as it's not really needed until then anyway 2016-12-18 03:00:40 -05:00
Al
4314a6822d [fix] don't need to do two checks for OSM boundaries 2016-12-18 02:32:05 -05:00
Al
590246748f [fix] move OSM check to after ClickThatHood/Quattroshapes checks as we don't need to check the point if it doesn't match a neighborhood geometry. Should speed up neighborhood index construction 2016-12-18 02:27:50 -05:00
Al
0a1e69ee9b [fix] yaml config 2016-12-18 01:52:40 -05:00
Al
86a8315b9d [openaddresses] adding new config option to OA config for aliasing fields based on a regex 2016-12-18 01:50:58 -05:00
Al
d357f0f37c [neighborhoods] check polygon boundaries in OSM neighborhood points for a name match at the city level or below 2016-12-18 01:46:44 -05:00
Al
a2cf1a35df [openaddresses] aliasing Paris/Marseilles/Lyon arrondissements to city_district in OpenAddresses 2016-12-18 01:28:58 -05:00
Al
fc57c437cb [boundaries] adding exceptions for Arrondissements in Paris, Marseilles and Lyon 2016-12-18 01:19:55 -05:00
Al
154a227285 [openaddresses] 5-digit postcodes for Spain, some are stored as integers stripping the initial zeros 2016-12-17 17:40:49 -05:00
Al
726ee2a299 [openaddresses] fixing state abbreviations for Mexico 2016-12-17 02:54:46 -05:00
Al
3ed95a175e [ngrams] adding function to extract an array of ngrams from a string, with optional special prefixes/suffixes for the edges 2016-12-17 01:33:18 -05:00
Al
3c6ed7489c [openaddresses] adding regex replacement to remove "*" from any field 2016-12-16 17:09:41 -05:00
Al
f1a460b874 [openaddresses] adding state abbreviations for OA Switzerland 2016-12-16 15:56:42 -05:00
Al
10d4979f21 [states] adding Canton abbreviations for Switzerland 2016-12-16 15:54:08 -05:00
Al
e99d76e750 [places] higher probability of adding Canton (state) for smaller cities in Switzerland 2016-12-16 15:53:42 -05:00
Al
05adbaca01 [places] add state_district (province) and state (region) in Italy more often 2016-12-16 14:49:15 -05:00
Al
ba96f68b62 [fix] openaddresses formatter 2016-12-16 14:22:15 -05:00
Al
d08e8d8dd3 [openaddresses] adding a value map for Italian province abbreviations in the countrywide file (they're commonly used in addresses and this may be a better place to handle that since the province names are given). Updating OpenAddresses config to use new dictionary field maps. 2016-12-16 06:57:05 -05:00
Al
da3240d5f6 [openaddresses] making field maps in OpenAddresses config a dictionary rather than a list to make inheritance easier 2016-12-16 06:54:36 -05:00
Al
83aab5a46a [openaddresses] adding option to map values for a particular field 2016-12-16 06:44:19 -05:00
Al
ae32645e0d [openaddresses] add city and state to Mexico City 2016-12-14 20:49:40 -05:00
Al
558cd2af2d [boundaries] adding a few more US non-city_districts as exceptions. 2016-12-14 18:14:12 -05:00
Al
846b88cde5 [addresses] let the place config take care of adding/removing neighborhoods rather than doing it as part of the add_neighborhoods method 2016-12-14 03:15:07 -05:00
Al
5946ead37f [addresses] using the defined component from the neighborhoods index for city_district (they're fairly rare, just NYC boroughs basically) 2016-12-14 03:10:07 -05:00
Al
026737cd3b [neighborhoods] adding component to neighborhoods index at construction time 2016-12-14 03:07:13 -05:00
Al
5846943b70 [addresses] removing place_type override requirement from the neighborhoods index (NYC boroughs, etc.) 2016-12-14 02:16:57 -05:00
Al
09f808ca47 [geoplanet] only add short postal codes to GeoPlanet data set if they match the Google regexes 2016-12-13 17:03:26 -05:00
Al
34db27b80c [openaddresses] Mendocino County, CA 2016-12-13 16:44:22 -05:00
Al
6b04711195 [neighborhoods] adjust cache size when building neighborhoods index 2016-12-13 16:11:42 -05:00
Al
40cd86c3be [addresses] only add city replacement if a city is not found first 2016-12-13 16:10:52 -05:00
Al
7e65661884 [openaddresses] Pierce County, WA 2016-12-13 14:03:16 -05:00
Al
cd91068f0f [neighborhoods] fix neighborhoods index checks to include the borough points while still not letting something like Santa Monica pass as a neighborhood when it's a proper city 2016-12-13 02:30:24 -05:00
Al
cb475d8245 [openaddresses] adding Sunshine Coast, BC and Sardegna, Italy 2016-12-12 17:42:47 -05:00
Al Barrentine
bcf6b3cc68 Merge pull request #137 from openvenues/fix_address_parser_train: Fix address_parser_train 2016-12-12 11:54:16 -05:00
Al
8f1e69960f [fix] loading transliteration module in address_parser_test.c as well 2016-12-12 11:37:27 -05:00
Al
3939dd0ca6 [fix] cstring_array_split calls 2016-12-12 11:37:27 -05:00
Al
a42d0e917a [fix] brace 2016-12-12 11:37:27 -05:00
Al
ced8f9ae27 [parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent 2016-12-12 11:37:27 -05:00
Al
b1816e9b70 [utils] Adding cstring_array_split_ignore_consecutive 2016-12-12 11:37:27 -05:00
Al
6baa7087fe [fix] calls and NULL checks 2016-12-12 11:37:27 -05:00
Al
5e07f5e8c5 [fix] tokenized_string_t should copy its source string 2016-12-12 11:37:27 -05:00
Al
521a094a47 [fix] Need to load transliteration module for Latin-ASCII normalization 2016-12-12 11:37:27 -05:00
Al
d158751d92 [addresses] same rules for state_district apply to state, no alt_names etc. unless a city is present 2016-12-12 05:31:32 -05:00
Al
bf3e9749ca [osm] during place formatting, add point-based cities for any places/polygons that are smaller than cities e.g. suburb or city_district, use admin_center as the point for reverse geocoding if available (instead of representative_point() which can be expensive or centroid which can be inaccurate) 2016-12-12 05:29:39 -05:00
Al
33dd9223dc [places] allowing state_district to depend on state in the US 2016-12-11 17:04:24 -05:00
Al
5d98f3115c [boundaries] adding two exceptions for admin_level=9 in US 2016-12-11 16:58:16 -05:00