309 Commits

Author SHA1 Message Date
Al
48755ec218 [boundaries] Adding regex replacements for boundary names such as Lyon 2e Arrondissement where putting Lyon is the OSM convention but we might sometimes want just 2e Arrondissement to appear in the training data next to Lyon 2016-08-11 13:09:24 -04:00
Al
5ec752e887 [fix] order of ops 2016-08-06 20:43:13 -04:00
Al
3e34012e69 [fix] if the language is given already, use it as a suffix rather than choosing at random 2016-08-06 20:36:56 -04:00
Al
606c464db6 [fix] house number phrases 2016-08-06 20:11:32 -04:00
Al
0e7cb2b06c [fix] var name II 2016-08-06 20:00:35 -04:00
Al
8d88820d30 [fix] var name 2016-08-06 19:59:53 -04:00
Al
6ef54bcc6f [addresses] Adding metro stations to AddressComponents expansion 2016-08-06 19:36:57 -04:00
Al
d59ab82701 [metro stations] Adding metro station phrase generator 2016-08-06 19:33:21 -04:00
Al
684550ea7d [fix] only add house_number phrase to numeric inputs 2016-08-06 14:49:28 -04:00
Al
445e8082c8 [addresses] Adding per-country overrides for address component dependencies 2016-08-06 02:36:47 -04:00
Al
0ab3b13b75 [osm] Remove hanging commas, slashes, etc. Implementing a stricter rule for user-specified tags (not reverse geocoded) so that if they contain an unknown phrase followed by an unknown boundary phrase, we delete that tag and fall back to the reverse geocoded components. Moving CLDR country tagging to later in the process since those are known correct names. 2016-08-02 16:25:45 -04:00
Al
4ab60cd4fc [osm] Remove boundary names with trailing commas 2016-08-02 03:13:05 -04:00
Al
12466b12dc [osm] Removing boundary names (not including postal codes) which are simply digits 2016-08-02 02:17:25 -04:00
Al
818bd50105 [fix] unit phrase should return None if there's no config available for a particular zone type (again enforcing the idea that venues typically don't have sub-building information) 2016-08-01 18:29:32 -04:00
Al
e11c723f8b [fix] var rename 2016-08-01 17:50:00 -04:00
Al
79ce922432 [osm] Fixing sub-building components so generated numbers are not added to the address components unless cls.phrase returns non-None 2016-08-01 17:44:23 -04:00
Al
3505af4bc1 [fix] don't add phrases for non-numeric existing components 2016-07-31 22:14:37 -04:00
Al
d3e50fc894 [fix] NULL-phrase first ordering 2016-07-31 22:10:25 -04:00
Al
b727078be5 [fix] use alphanumeric in generated component configs by default 2016-07-31 20:39:22 -04:00
Al
3a19506121 [fix] containing ids 2016-07-31 18:30:58 -04:00
Al
d04a627e92 [fix] KeyError 2016-07-31 18:29:29 -04:00
Al
f8e9d39e12 [places] Implementing population-based place components in both place and address component expansion 2016-07-30 19:15:03 -04:00
Al
3f4c18ddb6 [fix] None case for names 2016-07-27 01:16:05 -04:00
Al
4e14926169 [osm] choosing random name for semicolons and first name for commas in OSM name components 2016-07-27 01:06:14 -04:00
Al
024d47a8a5 [osm] Adding admin_center handling to OSM address components 2016-07-25 02:14:51 -04:00
Al
776145cf8e [osm] Adding new option to control whether we drop non-city OSM boundary names that have the same name as the enclosed city 2016-07-25 01:24:13 -04:00
Al
696448981c [fix] var name 2016-07-24 21:58:56 -04:00
Al
bfb89adaab [osm] use containing ids in component mapping 2016-07-24 21:57:04 -04:00
Al
46f83ce3ef [addresses] Implementing phrases for numbered blocks 2016-07-21 17:04:57 -04:00
Al
cc280c7001 [addresses] Implementing alphabet_probability, so may still use the Latin alphabet in some cases 2016-07-21 17:04:57 -04:00
Al
6d0e5359e7 [addresses] Implementing list-based field combinations 2016-07-21 17:04:57 -04:00
Al
eca6fc7de3 [addresses] Implementing whitespace_probability and ordinal_suffix probability for Roman numerals 2016-07-21 17:04:57 -04:00
Al
793671d0b9 [addresses] Sample from higher floors in buildings higher than 10 stories since those are relatively rare and we get enough lower numbered floors from random sampling 2016-07-21 17:04:57 -04:00
Al
47f926c4b6 [addresses] Handling digit rewrites (spellout, Roman numerals, etc.) in the base class 2016-07-21 17:04:57 -04:00
Al
d97b00b4c1 [addresses] Removing temporary file list and allowing any file ending in .yaml in resources/addresses to be parsed/imported 2016-07-21 17:04:57 -04:00
Al
1e79f31649 [fix] components 2016-07-21 17:04:57 -04:00
Al
2d35b89345 [addresses] Using Digits.rewrite in unit generation as well as adding a new config option for generating positive numbers only 2016-07-21 17:04:57 -04:00
Al
bbeb9a14ca [addresses] Using Digits.rewrite for entrance, staircase, floor numbers, and PO boxes 2016-07-21 17:04:57 -04:00
Al
4d0506a295 [addresses] Adding Digits, which allows for replacing numbers with their unicode full-width equivalents or doing number spellout 2016-07-21 17:04:57 -04:00
Al
ed77ceead3 [addresses] Adding some of the new configs and returning None if no phrase alternatives exist 2016-07-21 17:04:57 -04:00
Al
2d2e2489ff [addresses] Fixes for standalone components, conditional adds, and allowing generated unit numbers to use known floor number 2016-07-21 17:04:57 -04:00
Al
9efc2d4d79 [addresses] Adding ability to determine unit numbers using a known floor number 2016-07-21 17:04:57 -04:00
Al
6fc18b9adb [addresses] Roman numerals can be returned by Floor.random, relaxing the Zipfian distribution on floors so we get higher floors 2016-07-21 17:04:57 -04:00
Al
d3a6a032ab [fix] a few errors with non-numbers in numeric_phrase 2016-07-21 17:04:57 -04:00
Al
2505afa2b9 [addresses] Adding new configs 2016-07-21 17:04:57 -04:00
Al
dfd29911fd [addresses] Implementing Roman numerals and cardinal/ordinal number spellout in numbering base class 2016-07-21 17:04:57 -04:00
Al
11c6564783 [addresses] Russian address config 2016-07-21 17:04:57 -04:00
Al
ee27dc5ea1 [addresses/dictionaries] Updates to Portuguese configs, variations for Brasil 2016-07-21 17:04:57 -04:00
Al
53ea1c139a [osm/addresses] using new is_numeric in AddressComponents expansion and removing venue names that are identical to the house number 2016-07-21 17:04:57 -04:00
Al
b8aba86471 [addresses] Implementing unit types which use concatenated floors with offsets for basement (e.g. Norway) 2016-07-21 17:04:57 -04:00