6388a79bf0[addresses] strip "-", etc. in addr:housenumber
Al
2016-12-21 01:53:23 -05:00
c33db4f04d[addresses] normalize existing sub-building components
Al
2016-12-21 01:28:43 -05:00
3b14613f1d[fix] restore original house number for subsequent formatting after addr:conscriptionnumber/addr:streetnumber
Al
2016-12-21 00:51:44 -05:00
484c7ef912[osm] adding addresses with addr:conscriptionnumber and addr:streetnumber when available
Al
2016-12-21 00:36:40 -05:00
eafafab959[addresses] adding function to generate phrases for addr:conscriptionnumber in OSM, e.g. č.p. 123 in the Czech Republic
Al
2016-12-21 00:35:39 -05:00
63006a0c8b[dictionaries] adding súpisné číslo (s.č.) in Slovak
Al
2016-12-20 21:38:32 -05:00
010db088ae[dictionaries] adding Konskriptionsnummer for some addresses in Austria and Germany
Al
2016-12-20 21:37:42 -05:00
f7aebdc2ed[dictionaries] adding číslo popisné (č.p.) in Czech
Al
2016-12-20 21:36:42 -05:00
cc4098fb05[openaddresses] abbreviate states as well in OpenAddresses when full version is specified
Al
2016-12-20 17:24:06 -05:00
1cba89a99b[addresses] higher state abbreviation probability for places that use abbreviations
Al
2016-12-20 16:53:59 -05:00
8845609962[openaddresses] same for the rest of the multiword abbreviated states (except bilingual multiword provinces in Canada where we'll stick to the most common abbreviated form, which gets expanded to the unabbreviated province)
Al
2016-12-20 16:53:27 -05:00
c3db5eb1e0[openaddresses] add full state name for Distrito Federal so all the abbreviations get considered
Al
2016-12-20 16:46:14 -05:00
21202869b8[openaddresses] adding Grafschaft Bentheim, Germany and Tirol, Austria
Al
2016-12-20 12:33:21 -05:00
cd25ca1537[names] replace name affixes with both country/language and language-only variants
Al
2016-12-20 03:10:13 -05:00
9e44fcb2bb[addresses] abbreviating neighborhoods/city_districts
Al
2016-12-20 03:01:34 -05:00
53723bbf3d[fix] passing argument through to normalized_place_name
Al
2016-12-20 02:21:36 -05:00
7ff290e14c[openaddresses] adding Gatineau QC, Owensboro KY, Madison KY, St Clair MI, and Grand Forks ND
Al
2016-12-20 02:17:59 -05:00
2ab584ac0b[states] adding more multiword state abbreviations
Al
2016-12-20 02:16:42 -05:00
11444ffa34[places] adding higher probability of city_district in Mexico (for boroughs of Mexico City)
Al
2016-12-20 01:43:21 -05:00
6d02fbb9b8[addresses] switch for phrases that come from components so they only get stripped if they contain another phrase a la Washington, D.C. Consolidating always_use_full_names and random_key options
Al
2016-12-20 01:42:40 -05:00
e35636ed77[boundaries] higher probability for city_district in the UK (London)
Al
2016-12-19 02:34:06 -05:00
56ca37d1f3[fix] openaddresses config reading
Al
2016-12-19 02:18:24 -05:00
f2720db2f8[osm] adding simple street name normalization for certain streets in OSM that also contain the house number (only when separated by commas and in a country/language where house number comes after street). There are other cases for normalization but need to better define them.
Al
2016-12-19 02:13:39 -05:00
ff32321425[formatter] adding house_number_before_road method to AddressFormatter
Al
2016-12-19 02:00:06 -05:00
f35fd97735[boundaries] add abbreviated state names to valid component names
Al
2016-12-19 00:51:05 -05:00
c3dfd6530f[openaddresses] adding Skagit County, WA, USA
Al
2016-12-19 00:15:06 -05:00
d02a18a5a8[fix] all_names, use values instead of name keys
Al
2016-12-18 17:29:15 -05:00
e9c7bc43e3[fix] check fixed list of keys in all_names as well
Al
2016-12-18 17:26:43 -05:00
2727572822[addresses] using the name key distribution in AddressComponents.all_names. Returning names and valid components from the new function instead of the full gazetteer (can be built later)
Al
2016-12-18 17:22:13 -05:00
954b6548bf[names] adding name_key_dist method to boundary names to account for certain boundaries like e.g. Kings County that have name exceptions
Al
2016-12-18 17:20:03 -05:00
d308473686[addresses] separating boundary phrase gazetteer construction into its own method
Al
2016-12-18 15:47:16 -05:00
585b203a4f[fix] /props/attrs/
Al
2016-12-18 15:32:09 -05:00
82b26117aa[fix] name comparison in neighborhoods index
Al
2016-12-18 15:27:21 -05:00
3ac2c93e1c[utils] renaming char_array_append_vjoined to char_array_add_vjoined to follow convention that add_* calls NUL-terminate while append_* calls do not
Al
2016-12-18 15:26:58 -05:00
8322e98ad3[fix] var name II
Al
2016-12-18 11:42:16 -05:00
0c55bc3bb8[fix] var name
Al
2016-12-18 11:41:00 -05:00
e5657c5612[fix] putting the neighborhoods check after the dupe threshold check, as it's not really needed until then anyway
Al
2016-12-18 03:00:40 -05:00
4314a6822d[fix] don't need to do two checks for OSM boundaries
Al
2016-12-18 02:32:05 -05:00
590246748f[fix] move OSM check to after ClickThatHood/Quattroshapes checks as we don't need to check the point if it doesn't match a neighborhood geometry. Should speed up neighborhood index construction
Al
2016-12-18 02:27:50 -05:00
0a1e69ee9b[fix] yaml config
Al
2016-12-18 01:52:38 -05:00
86a8315b9d[openaddresses] adding new config option to OA config for aliasing fields based on a regex
Al
2016-12-18 01:50:58 -05:00
d357f0f37c[neighborhoods] check polygon boundaries in OSM neighborhood points for a name match at the city level or below
Al
2016-12-18 01:42:34 -05:00
a2cf1a35df[openaddresses] aliasing Paris/Marseilles/Lyon arrondissements to city_district in OpenAddresses
Al
2016-12-18 01:28:58 -05:00
fc57c437cb[boundaries] adding exceptions for Arrondissements in Paris, Marseilles and Lyon
Al
2016-12-18 01:19:55 -05:00
154a227285[openaddresses] 5-digit postcodes for Spain, some are stored as integers stripping the initial zeros
Al
2016-12-17 17:40:49 -05:00
726ee2a299[openaddresses] fixing state abbreviations for Mexico
Al
2016-12-17 02:54:42 -05:00
3ed95a175e[ngrams] adding function to extract an array of ngrams from a string, with optional special prefixes/suffixes for the edges
Al
2016-12-17 01:33:18 -05:00
3c6ed7489c[openaddresses] adding regex replacement to remove "*" from any field
Al
2016-12-16 17:09:41 -05:00
f1a460b874[openaddresses] adding state abbreviations for OA Switzerland
Al
2016-12-16 15:56:42 -05:00
10d4979f21[states] adding Canton abbreviations for Switzerland
Al
2016-12-16 15:54:08 -05:00
e99d76e750[places] higher probability of adding Canton (state) for smaller cities in Switzerland
Al
2016-12-16 15:53:42 -05:00
05adbaca01[places] add state_district (province) and state (region) in Italy more often
Al
2016-12-16 14:49:15 -05:00
ba96f68b62[fix] openaddresses formatter
Al
2016-12-16 14:22:15 -05:00
d08e8d8dd3[openaddresses] adding a value map for Italian province abbreviations in the countrywide file (they're commonly used in addresses and this may be a better place to handle that since the province names are given). Updating OpenAddresses config to use new dictionary field maps.
Al
2016-12-16 06:57:05 -05:00
da3240d5f6[openaddresses] making field maps in OpenAddresses config a dictionary rather than a list to make inheritance easier
Al
2016-12-16 06:54:36 -05:00
83aab5a46a[openaddresses] adding option to map values for a particular field
Al
2016-12-16 06:44:19 -05:00
ae32645e0d[openaddresses] add city and state to Mexico City
Al
2016-12-14 20:49:40 -05:00
558cd2af2d[boundaries] adding a few more US non-city_districts as exceptions.
Al
2016-12-14 17:53:10 -05:00
846b88cde5[addresses] let the place config take care of adding/removing neighborhoods rather than doing it as part of the add_neighborhoods method
Al
2016-12-14 03:15:07 -05:00
5946ead37f[addresses] using the defined component from the neighborhoods index for city_district (they're fairly rare, just NYC boroughs basically)
Al
2016-12-14 03:10:02 -05:00
026737cd3b[neighborhoods] adding component to neighborhoods index at construction time
Al
2016-12-14 03:07:13 -05:00
5846943b70[addresses] removing place_type override requirement from the neighborhoods index (NYC boroughs, etc.)
Al
2016-12-14 02:16:57 -05:00
09f808ca47[geoplanet] only add short postal codes to GeoPlanet data set if they match the Google regexes
Al
2016-12-13 17:03:26 -05:00
34db27b80c[openaddresses] Mendocino County, CA
Al
2016-12-13 16:44:22 -05:00
6b04711195[neighborhoods] adjust cache size when building neighborhoods index
Al
2016-12-13 16:11:42 -05:00
40cd86c3be[addresses] only add city replacement if a city is not found first
Al
2016-12-13 16:10:52 -05:00
7e65661884[openaddresses] Pierce County, WA
Al
2016-12-13 14:03:16 -05:00
cd91068f0f[neighborhoods] fix neighborhoods index checks to include the borough points while still not letting something like Santa Monica pass as a neighborhood when it's a proper city
Al
2016-12-13 02:28:59 -05:00
cb475d8245[openaddresses] adding Sunshine Coast, BC and Sardegna, Italy
Al
2016-12-12 17:42:43 -05:00
bcf6b3cc68Merge pull request #137 from openvenues/fix_address_parser_train
Al Barrentine
2016-12-12 11:54:16 -05:00
8f1e69960f[fix] loading transliteration module in address_parser_test.c as well
Al
2016-05-25 19:54:01 -04:00
3939dd0ca6[fix] cstring_array_split calls
Al
2016-05-25 17:58:30 -04:00
a42d0e917a[fix] brace
Al
2016-05-25 17:52:00 -04:00
ced8f9ae27[parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in vulgar fractions, e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent
Al
2016-05-25 17:50:29 -04:00
b1816e9b70[utils] Adding cstring_array_split_ignore_consecutive
Al
2016-05-25 17:07:20 -04:00
6baa7087fe[fix] calls and NULL checks
Al
2016-05-25 15:50:53 -04:00
5e07f5e8c5[fix] tokenized_string_t should copy its source string
Al
2016-05-25 15:47:57 -04:00
521a094a47[fix] Need to load transliteration module for Latin-ASCII normalization
Al
2016-05-25 15:25:34 -04:00
d158751d92[addresses] same rules for state_district apply to state, no alt_names etc. unless a city is present
Al
2016-12-12 05:31:32 -05:00
bf3e9749ca[osm] during place formatting, add point-based cities for any places/polygons that are smaller than cities e.g. suburb or city_district, use admin_center as the point for reverse geocoding if available (instead of representative_point() which can be expensive or centroid which can be inaccurate)
Al
2016-12-12 05:29:33 -05:00
33dd9223dc[places] allowing state_district to depend on state in the US
Al
2016-12-11 17:04:24 -05:00
5d98f3115c[boundaries] adding two exceptions for admin_level=9 in US
Al
2016-12-11 16:58:16 -05:00
da4fe37fb4[addresses] option to add city points, no random keys for state_district if city or replacement is not present
Al
2016-12-11 15:20:20 -05:00
dfc88a47b2[fix] typo
Al
2016-12-11 02:46:03 -05:00
e8abf44c16[neighborhoods] check if there's no defined place-type before classifying a polygon as city_district
Al
2016-12-11 02:44:02 -05:00
01d6bc27b6[fix] "District of" is only a valid prefix in the non-US Anglophone world
Al
2016-12-11 02:11:51 -05:00
9b95601e42[states] adding abbreviations with internal periods for multi-word US states
Al
2016-12-11 01:17:27 -05:00
fffc81a17a[fix] default value
Al
2016-12-10 18:14:25 -05:00
371198da3c[fix] typo
Al
2016-12-10 18:14:11 -05:00
91982528c6[fix] normalize place names after adding admin boundaries as well
Al
2016-12-10 18:07:41 -05:00
34d3ae7e9e[addresses] fixing normalized_place_name so it deals with things like Washington DC where Washington DC may actually be one of the OSM names
Al
2016-12-10 17:52:38 -05:00
80ee34cc3a[text] adding normalization with whitespace
Al
2016-12-10 17:50:53 -05:00
4550f00f03[fix] var name
Al
2016-12-10 15:18:09 -05:00
72771741c3[fix] order
Al
2016-12-10 15:16:35 -05:00
8595d8da05[addresses] don't add components to the trie that have the same normalized name as the given component
Al
2016-12-10 15:12:40 -05:00
bb12d0940e[fix] options/docs in osm address training
Al
2016-12-10 13:45:37 -05:00
ffc584f679[states] adding all forms of the state abbreviation to the trie when doing place name normalization to handle the D.C./DC case
Al
2016-12-10 13:45:22 -05:00
5098599ed6[addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now
Al
2016-12-10 02:08:33 -05:00
18c5fd0855[fix] check for non-None city
Al
2016-12-10 01:23:06 -05:00
dc022f8652[osm] adding normalized_place_name to Quattroshapes city
Al
2016-12-10 01:17:38 -05:00