3458 Commits

Author SHA1 Message Date
Al
afbb79b81d [osm/parser] Making a much lower probability of generating sub-building components for named venues (usually on the ground floor, etc.) 2016-07-31 20:40:44 -04:00
Al
b727078be5 [fix] use alphanumeric in generated component configs by default 2016-07-31 20:39:22 -04:00
Al
2e92c6fcc8 [fix] Probabilities for Ukrainian house numbers 2016-07-31 20:01:42 -04:00
Al
0f3c4276b4 [fix] args 2016-07-31 19:53:39 -04:00
Al
0827caf578 [fix] sample=true 2016-07-31 19:51:03 -04:00
Al
3871869d4b [osm] Check that OSM venue names contain at least one word-like token 2016-07-31 19:50:45 -04:00
Al
ce17b50064 [fix] canonical probability 2016-07-31 19:16:46 -04:00
Al
0bdcae252f [fix] building tag updates 2016-07-31 18:43:55 -04:00
Al
3a19506121 [fix] containing ids 2016-07-31 18:30:58 -04:00
Al
d04a627e92 [fix] KeyError 2016-07-31 18:29:29 -04:00
Al
92b8566930 [places] Increase probability of state and decrease probability of county for smaller cities/towns 2016-07-31 03:26:34 -04:00
Al
3f450054f9 [fix] numeric conditions in place config 2016-07-31 03:15:43 -04:00
Al
99333d58ca [fix] conditions in place config 2016-07-31 03:09:51 -04:00
Al
cec4914233 [openaddresses] In some OpenAddresses data sets, the house number is just a copy of the street name, so eliminate non-numeric house numbers to be safe 2016-07-31 01:12:04 -04:00
Al
f8e9d39e12 [places] Implementing population-based place components in both place and address component expansion 2016-07-30 19:15:03 -04:00
Al
bb91a5b0f0 [places] For the US, add state_district (county) with higher probability for towns with higher populations. Helps with cases that would be difficult to get right otherwise like Brooklyn, Cattaraugus County, NY (http://www.openstreetmap.org/node/158644800) 2016-07-30 18:57:28 -04:00
Al
ebaef4d671 [places] Implementation of population-based exceptions for adding OSM boundary components 2016-07-30 18:52:55 -04:00
Al
20aad99a38 [parser] enum just lists boundary types 2016-07-30 17:07:23 -04:00
Al
965bac1833 [trie] Making methods to construct string phrases from phrase matches available through trie_search.h 2016-07-30 17:06:20 -04:00
Al
469332ffc4 [osm/polygons] Reducing cache_size to 250k now that the polygons are larger 2016-07-30 16:44:59 -04:00
Al
5bfc29d3f6 [osm/places] Using num_references / 2 for non-default languages and min_references / 2 for alternate name tags 2016-07-30 12:46:54 -04:00
Al
3d20bd13c3 [osm] Add population to reverse geocoder properties 2016-07-30 12:25:39 -04:00
Al
a45ff88f5f [osm/polygons] Don't simplify OSM polygons, might have memory 2016-07-29 12:53:13 -04:00
Al
f8c8d05997 [fix] same thing for the exception countries 2016-07-29 12:47:08 -04:00
Al
045eab8e58 [osm] Making ISO codes lower probability for reverse geocoded country as well 2016-07-29 12:30:32 -04:00
Al
09b16d954f [osm] Use much lower probability of ISO country codes 2016-07-29 11:41:39 -04:00
Al
9dc52ea3c4 [osm] Add more English + non-local language names for places in OSM 2016-07-29 10:31:26 -04:00
Al
ed0b867c13 [osm] For formatting places from the polygon index, use centroid if representative_point fails 2016-07-29 07:13:41 -04:00
Al
f38bb151e2 [fix] var name 2016-07-28 23:53:55 -04:00
Al
08f39d6b80 [parser] Adding address_parser_rewind to make multiple passes through the file when compiling the phrase tries 2016-07-28 17:13:58 -04:00
Al
1b09b7f2e5 [fix] Adding country_region to address_parser_train 2016-07-28 16:18:32 -04:00
Al
21bcbd8381 [fix] restoring CLDR probability 2016-07-28 15:21:44 -04:00
Al
c6af5cc071 [parser] Adding country_region label to parser as a boundary component 2016-07-28 15:19:48 -04:00
Al
854e6d901f [osm] Add CLDR country before dropout 2016-07-28 14:41:14 -04:00
Al
bebb33fe64 [osm] Include CLDR country even if the place didn't match simplified OSM polygons 2016-07-28 14:11:31 -04:00
Al
ea1226082e [fix] wrong instance 2016-07-28 02:56:17 -04:00
Al
fc118acd90 [fix] language None for ambiguous case 2016-07-28 02:48:45 -04:00
Al
db51cc91c2 [fix] property 2016-07-28 02:41:26 -04:00
Al
543048bc26 [osm] use CLDR country names with random probability 2016-07-28 02:37:12 -04:00
Al
095c808cea [places] increasing country probabilities, state probabilities in Mexico and Brasil 2016-07-28 02:26:51 -04:00
Al
d276611b9c [fix] poly.context 2016-07-28 01:46:12 -04:00
Al
88353b75e0 [fix] more helpful error message if there are errors with the formatting config 2016-07-27 19:14:30 -04:00
Al
21033537a2 [fix] US insertion config 2016-07-27 19:13:59 -04:00
Al
a4a74aec7f [osm] Updating formatting config for all the languages/countries currently implemented 2016-07-27 17:45:18 -04:00
Al
f8d185aaff [osm/formatting] Tag commas in a given labeled component with the SEP tag so e.g. concatenated districts can be counted as separate phrases 2016-07-27 16:13:57 -04:00
Al
750037330e [boundaries] Updated boundaries for Slovakia to capture city districts, etc. 2016-07-27 14:07:36 -04:00
Al
4cc49b7ca4 [fix] typo 2016-07-27 12:48:35 -04:00
Al
9e61b9409f [osm] For components at or below the city level that are the admin_center of their smallest containing boundary with the same name, use the boundary's component name instead of the point's 2016-07-27 12:46:43 -04:00
Al
d9b70d3404 [fix] mapping the nodes for NYC boroughs to city_district 2016-07-27 12:22:50 -04:00
Al
ad4da98bd7 [fix] lowercase language code 2016-07-27 11:51:17 -04:00