1762 Commits

Author SHA1 Message Date
Al
371198da3c [fix] typo 2016-12-10 18:14:11 -05:00
Al
91982528c6 [fix] normalize place names after adding admin boundaries as well 2016-12-10 18:07:41 -05:00
Al
34d3ae7e9e [addresses] fixing normalized_place_name so it deals with things like Washington DC where Washington DC may actually be one of the OSM names 2016-12-10 17:52:38 -05:00
Al
80ee34cc3a [text] adding normalization with whitespace 2016-12-10 17:50:53 -05:00
Al
4550f00f03 [fix] var name 2016-12-10 15:18:09 -05:00
Al
72771741c3 [fix] order 2016-12-10 15:16:35 -05:00
Al
8595d8da05 [addresses] don't add components to the trie that have the same normalized name as the given component 2016-12-10 15:12:40 -05:00
Al
bb12d0940e [fix] options/docs in osm address training 2016-12-10 13:45:37 -05:00
Al
ffc584f679 [states] adding all forms of the state abbreviation to the trie when doing place name normalization to handle the D.C./DC case 2016-12-10 13:45:22 -05:00
Al
5098599ed6 [addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now 2016-12-10 02:08:40 -05:00
Al
18c5fd0855 [fix] check for non-None city 2016-12-10 01:23:06 -05:00
Al
dc022f8652 [osm] adding normalized_place_name to Quattroshapes city 2016-12-10 01:20:40 -05:00
Al
c7b1818695 [fix] imports 2016-12-09 19:53:17 -05:00
Al
973466bb13 [states] adding multiple state abbreviations for states that can have periods in the name like D.C., D.F. in Mexico and Brasil, etc. 2016-12-09 19:48:59 -05:00
Al
675552d254 [addresses] using normalized tokens when stripping off compound place names for things like D.C. 2016-12-09 17:52:57 -05:00
Al
c0a468d7e8 [normalization] adding a normalize_token function and some token options for deleting periods 2016-12-09 17:46:26 -05:00
Al
8f30987bdf [fix] checking if building is a rail station 2016-12-09 02:57:47 -05:00
Al
e92963de50 [openaddresses] adding new counties from OpenAddresses, strip commas option for thousands separators 2016-12-09 01:57:21 -05:00
Al
b60b7c9009 [geoplanet] adding an index of state_districts, states, etc. that contain a city with an identical name. Alias to the city if it's the only contained place, otherwise don't allow the admin name without the city. 2016-12-08 17:00:29 -05:00
Al
640f70c05d [geoplanet] all_places table, specified dirs 2016-12-08 02:50:08 -05:00
Al
f9945103ba [addresses] if suburb/city_district is already listed, and we're finding the closest city by point rather than by boundary, use the closest actual city, not something smaller like a village/hamlet 2016-12-08 02:39:27 -05:00
Al
28d9ef12c0 [geoplanet] fixing geoplanet aliases insert warning 2016-12-08 02:31:10 -05:00
Al
763c86dcd4 [geoplanet] add County to the names of US counties outside of Louisiana and Alaska, add Parish in Louisiana 2016-12-08 02:30:37 -05:00
Al
7436d9693a [names] adding new name_affixes call to replace both prefixes/suffixes in one call, using in GeoPlanet training and the generic AddressComponents normalizations 2016-12-07 05:49:16 -05:00
Al
9386a999f6 [names] adding country-specific affixes and only normalizing the word City as a suffix in UK/Ireland 2016-12-07 05:37:25 -05:00
Al
3ff472c8cf [openaddresses] fixing house numbers with multiple consecutive hyphens 2016-12-06 22:50:14 -05:00
Al
e13787a6f6 [fix] var name again 2016-12-05 18:49:23 -05:00
Al
e1c6eff5e2 [fix] var 2016-12-05 18:46:49 -05:00
Al
da36b71829 [addresses] adding new places index in OSM and OpenAddresses training data 2016-12-05 18:36:17 -05:00
Al
628fecea59 [addresses] adding point-based city/equivalent reverse geocoding for places that don't have as many defined polygons in OSM 2016-12-05 18:30:46 -05:00
Al
f87f0df717 [places] adding generic place index for reverse geocoding to points 2016-12-05 02:05:54 -05:00
Al
e32c232c67 [localities] /planet-neighborhoods/planet-localities/ 2016-12-04 23:05:11 -05:00
Al
cca80b046c [abbreviation] fixing abbreviations within hyphenated phrases, particularly for prefix/suffix matches 2016-12-03 17:55:11 -05:00
Al
adab232674 [osm] don't include rail stations with no venue phrases (if there's a railway station at Foo, only include it if it's named "Foo Station", not just plain "Foo") 2016-12-01 02:03:38 -05:00
Al
ef243fbb18 [fix] var name 2016-11-25 13:41:07 -08:00
Al
cdbc102821 [boundaries] in addition to population, check if a city has an unambiguous Wikipedia 2016-11-25 13:36:49 -08:00
Al
87634a36e1 [openaddresses] for cases where city populations are not known (i.e. not getting boundaries from OSM, most of the sources in OpenAddresses), place-only records should have at least two identifying components. Helps when city names, etc. are highly ambiguous and need to be qualified 2016-11-25 00:56:38 -08:00
Al
5c3ccc3bc6 [places] better handling of population exceptions in places config 2016-11-25 00:38:49 -08:00
Al
e07c74f077 [fix] config 2016-11-24 03:57:52 -05:00
Al
46b7043dc7 [fix] typo 2016-11-24 03:50:11 -05:00
Al
fcf4717335 [openaddresses] adding city_replacements handling to OA formatter 2016-11-23 20:16:48 -05:00
Al
3dc2a922fb [addresses/languages] if there's only one default language and we don't have a road name or a unicode script to disambiguate, assume the default (e.g. English in the US unless there's a Spanish/French road name). Can affect things like state abbreviations 2016-11-22 18:27:54 -05:00
Al
ee6edbbd91 [countries] take first encountered country code instead of reversing the components (for cases like Puerto Rico, Hong Kong, etc.) 2016-11-22 11:55:41 -05:00
Al
ee8c070fd5 [osm] override admin_level with other components in config if present 2016-11-22 11:22:26 -05:00
Al
aa1f4fdd20 [places] adding section called city_replacements to places config, for countries where something like the state_district/county, suburb or city_district should stand in for the city when one cannot be reverse geocoded (unincorporated county addresses, etc.) 2016-11-22 09:51:04 -05:00
Al
480796f46f [osm] trying representative_point() on the unfixed polygons to capture some cases where the geometry still needs to be fixed before it's valid 2016-11-22 01:28:02 -05:00
Al
a596d03309 [fix] return values 2016-11-19 12:45:39 -05:00
Al
e15036fcce [fix] if there are street types that are not venue words and not vice versa, then call the venue invalid as a standalone term 2016-11-19 04:11:33 -05:00
Al
8e905fd17d [fix] if no venue names are passed in to formatted_addresses_with_venue_names, remove any existing venue name from the components as well 2016-11-19 03:46:16 -05:00
Al
e6fe576ec7 [fix] var 2016-11-19 03:15:23 -05:00