1746 Commits

Author SHA1 Message Date
Al
29fc198aba [osm] giving parse_osm_number_range a parameter for max range and setting it to 1000 for postal codes e.g. for major cities that may have several hundred postal codes 2016-08-15 10:34:24 -04:00
Al
637baad629 [osm] Adding at least min_references entries for every selected postcode 2016-08-15 10:30:28 -04:00
Al
aa6b9cd858 [fix] var name for place tags coming from the admin rtree 2016-08-15 10:25:19 -04:00
Al
bc8acb196c [osm] Pulling valid postal codes out into a method 2016-08-13 01:49:26 -04:00
Al
22123b80ba [fix] refactoring geonames script a bit 2016-08-11 21:31:39 -04:00
Al
48755ec218 [boundaries] Adding regex replacements for boundary names such as Lyon 2e Arrondissement where putting Lyon is the OSM convention but we might sometimes want just 2e Arrondissement to appear in the training data next to Lyon 2016-08-11 13:09:24 -04:00
Al
b993e9a163 [fix] add Japanese-language variant if metro station is added 2016-08-06 21:17:14 -04:00
Al
39bd562d04 [addresses] only set language if we needed it for Japanese house_numbers 2016-08-06 21:06:01 -04:00
Al
5ec752e887 [fix] order of ops 2016-08-06 20:43:13 -04:00
Al
e68fee7c68 [fix] null check 2016-08-06 20:39:28 -04:00
Al
3e34012e69 [fix] if the language is given already, use it as a suffix rather than choosing at random 2016-08-06 20:36:56 -04:00
Al
606c464db6 [fix] house number phrases 2016-08-06 20:11:32 -04:00
Al
e35649f09d [fix] import 2016-08-06 20:01:38 -04:00
Al
0e7cb2b06c [fix] var name II 2016-08-06 20:00:35 -04:00
Al
8d88820d30 [fix] var name 2016-08-06 19:59:53 -04:00
Al
374c46ada5 [fix] metro station properties 2016-08-06 19:56:13 -04:00
Al
0edfbe0d61 [osm] Adding metro stations index to training data options 2016-08-06 19:52:21 -04:00
Al
195278cfea [osm] Reverse geocoding to metro station only for addresses in Japan 2016-08-06 19:50:18 -04:00
Al
6ef54bcc6f [addresses] Adding metro stations to AddressComponents expansion 2016-08-06 19:36:57 -04:00
Al
d59ab82701 [metro stations] Adding metro station phrase generator 2016-08-06 19:33:21 -04:00
Al
1e27ad1124 [metro stations] Adding metro station component to address formatter 2016-08-06 19:13:20 -04:00
Al
5cff119d25 [fix] command line arg 2016-08-06 18:36:27 -04:00
Al
406666362c [fix] command-line index creation 2016-08-06 18:36:01 -04:00
Al
7ddd553129 [fix] metro stations reverse geocoder 2016-08-06 18:30:54 -04:00
Al
5e44f6954b [metro stations] Adding metro stations reverse geocoder 2016-08-06 18:24:25 -04:00
Al
954bb08a8d [points] Fixes to point index 2016-08-06 18:23:30 -04:00
Al
964728a02d [fix] block phrases for Japanese and namespaced language handling in case Romaji is chosen before normalization 2016-08-06 14:50:39 -04:00
Al
684550ea7d [fix] only add house_number phrase to numeric inputs 2016-08-06 14:49:28 -04:00
Al
445e8082c8 [addresses] Adding per-country overrides for address component dependencies 2016-08-06 02:36:47 -04:00
Al
13718355cc [test] Test zones in address configs 2016-08-04 17:52:19 -04:00
Al
eb4c957b4c [test] Adding tests for known number of floors as it touches different parts of the address configs 2016-08-03 17:40:48 -04:00
Al
813f29f299 [osm] Removing the call to normalize_place_names in place data formatting as we should be able to trust the places more than the addresses 2016-08-02 16:29:34 -04:00
Al
0ab3b13b75 [osm] Remove hanging commas, slashes, etc. Implementing a stricter rule for user-specified tags (not reverse geocoded) so that if they contain an unknown phrase followed by an unknown boundary phrase, we delete that tag and fall back to the reverse geocoded components. Moving CLDR country tagging to later in the process since those are known correct names. 2016-08-02 16:25:45 -04:00
Al
97a2436ad7 [tokenization] Adding two more sets to token_types for punctuation and non-alphanumerics 2016-08-02 16:24:01 -04:00
Al
c40ad99ec7 [osm] removing postcode phrase from place training data and adding CLDR countries only after all the other normalizations 2016-08-02 14:52:12 -04:00
Al
5117fb21d3 [fix] access 2016-08-02 03:20:42 -04:00
Al
bd780d3424 [fix] typo 2016-08-02 03:19:22 -04:00
Al
c74d883344 [fix] unindent 2016-08-02 03:17:42 -04:00
Al
f29d043544 [places] Using all of the ideas that apply to places from address formatting for the places-only data set 2016-08-02 03:16:08 -04:00
Al
4ab60cd4fc [osm] Remove boundary names with trailing commas 2016-08-02 03:13:05 -04:00
Al
12466b12dc [osm] Removing boundary names (not including postal codes) which are simply digits 2016-08-02 02:17:25 -04:00
Al
a1f0c1a3c9 [fix] import 2016-08-02 01:50:17 -04:00
Al
818bd50105 [fix] unit phrase should return None if there's no config available for a particular zone type (again enforcing the idea that venues typically don't have sub-building information) 2016-08-01 18:29:32 -04:00
Al
e11c723f8b [fix] var rename 2016-08-01 17:50:00 -04:00
Al
79ce922432 [osm] Fixing sub-building components so generated numbers are not added to the address components unless cls.phrase returns non-None 2016-08-01 17:44:23 -04:00
Al
4c8b662648 [fix] block numbers 2016-08-01 14:36:28 -04:00
Al
1fb8185b75 [osm/boundaries] Allowing OSM entities to map to NULL 2016-08-01 00:52:58 -04:00
Al
2faffc81e7 [fix] import 2016-08-01 00:06:47 -04:00
Al
973ac42a97 [test] Checking probability distributions as part of the address config tests 2016-07-31 22:29:21 -04:00
Al
3505af4bc1 [fix] don't add phrases for non-numeric existing components 2016-07-31 22:14:37 -04:00