2268 Commits

Author SHA1 Message Date
Al
eca6fc7de3 [addresses] Implementing whitespace_probability and ordinal_suffix probability for Roman numerals 2016-07-21 17:04:57 -04:00
Al
bbba442311 [addresses] Lithuanian address config 2016-07-21 17:04:57 -04:00
Al
b12aa209f9 [dictionaries] Lithuanian dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al
60ab307d7a [dictionaries] Adding 'no' form to languages that also use '№' 2016-07-21 17:04:57 -04:00
Al
7f84bb0b5b [addresses] Romanized Korean address config 2016-07-21 17:04:57 -04:00
Al
d60eec8874 [addresses] Korean address config 2016-07-21 17:04:57 -04:00
Al
e0e279afce [dictionaries] Korean dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al
3de79b11be [dictionaries] Updates to Russian dictionaries 2016-07-21 17:04:57 -04:00
Al
e54f73647f [addresses] Chinese Pinyin config 2016-07-21 17:04:57 -04:00
Al
3f388135eb [addresses] Ukrainian address config 2016-07-21 17:04:57 -04:00
Al
ccfb9b7974 [dictionaries] Ukrainian dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al
a9dc168ad3 [boundaries] boundary mappings for South Korea 2016-07-21 17:04:57 -04:00
Al
af11db1488 [addresses] Adding digit spellout and the list form of field combinations to existing configs 2016-07-21 17:04:57 -04:00
Al
64f167f045 [tokenization] Re-generating scanner 2016-07-21 17:04:57 -04:00
Al
81b4a4a1cb [tokenization] Hyphens, etc. between non-ASCII digits (e.g. Unicode full-width numbers) should be single tokens 2016-07-21 17:04:57 -04:00
Al
e4d8faab73 [osm] Japanese addresses only use named valid venues, not just anything with a name 2016-07-21 17:04:57 -04:00
Al
068e24a206 [fix] ordinal spellout for numbers which map directly to a simple rule 2016-07-21 17:04:57 -04:00
Al
d6c44a0c09 [fix] alternatives lists in config utils 2016-07-21 17:04:57 -04:00
Al
793671d0b9 [addresses] Sample from higher floors in buildings higher than 10 stories since those are relatively rare and we get enough lower numbered floors from random sampling 2016-07-21 17:04:57 -04:00
Al
47f926c4b6 [addresses] Handling digit rewrites (spellout, Roman numerals, etc.) in the base class 2016-07-21 17:04:57 -04:00
Al
d97b00b4c1 [addresses] Removing temporary file list and allowing any file ending in .yaml in resources/addresses to be parsed/imported 2016-07-21 17:04:57 -04:00
Al
1e79f31649 [fix] components 2016-07-21 17:04:57 -04:00
Al
0e4859b0d7 [addresses] Chinese address config with variations for Hong Kong, Taiwan, etc. 2016-07-21 17:04:57 -04:00
Al
4a86a5e973 [dictionaries] Chinese dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al
07e62c8d54 [dictionaries] A few more Japanese dictionaries to support the address config 2016-07-21 17:04:57 -04:00
Al
f980038ff6 [dictionaries] código postal also used in some Portuguese-speaking countries e.g. Angola, Mozambique 2016-07-21 17:04:57 -04:00
Al
8629cb1e30 [dictionaries] Abbreviation for apartado in Portugal to match Spanish 2016-07-21 17:04:57 -04:00
Al
738701f734 [dictionaries] Adding railway station token to Japanese qualifiers 2016-07-21 17:04:57 -04:00
Al
eae9890746 [addresses] Japanese Romaji address config 2016-07-21 17:04:57 -04:00
Al
ce7f7600e4 [addresses] Japanese address config 2016-07-21 17:04:57 -04:00
Al
8ab1a0ae11 [dictionaries] Updates to Japanese qualifiers 2016-07-21 17:04:57 -04:00
Al
2d35b89345 [addresses] Using Digits.rewrite in unit generation as well as adding a new config option for generating positive numbers only 2016-07-21 17:04:57 -04:00
Al
bbeb9a14ca [addresses] Using Digits.rewrite for entrance, staircase, floor numbers, and PO boxes 2016-07-21 17:04:57 -04:00
Al
4d0506a295 [addresses] Adding Digits, which allows for replacing numbers with their unicode full-width equivalents or doing number spellout 2016-07-21 17:04:57 -04:00
Al
ed77ceead3 [addresses] Adding some of the new configs and returning None if no phrase alternatives exist 2016-07-21 17:04:57 -04:00
Al
2d2e2489ff [addresses] Fixes for standalone components, conditional adds, and allowing generated unit numbers to use known floor number 2016-07-21 17:04:57 -04:00
Al
225f0d0906 [dictionaries] gebouw as concatenated inseparable suffix in Dutch (helps with identifying unknown words) 2016-07-21 17:04:57 -04:00
Al
5a046bcd03 [addresses] New structure for blocks (placeholder, not implemented as random phrases yet) 2016-07-21 17:04:57 -04:00
Al
1f1fab1a71 [addresses/dictionaries] Netherlands config update, moving verdiep abbreviation to Belgian Flemish 2016-07-21 17:04:57 -04:00
Al
1083ddbf12 [dictionaries] Dutch abbreviations for standalone levels like begane grond 2016-07-21 17:04:57 -04:00
Al
fe2bb06ac2 [osm] Since most streets in Japan do not have names, define a separate set of valid address constraints and merge the files into planet-addresses.osm 2016-07-21 17:04:57 -04:00
Al
a3f26618c0 [addresses] Making house number phrase (e.g. Calle Foobar nº 2) slightly more common in Spanish-speaking world (and even more likely in Colombia) 2016-07-21 17:04:57 -04:00
Al
426206a8dc [addresses] Hungarian address config 2016-07-21 17:04:57 -04:00
Al
a746759a72 [numex] adding '.' for Hungarian ordinal indicator (Roman numerals handled in address config) 2016-07-21 17:04:57 -04:00
Al
cded1baca9 [dictionaries] Hungarian dictionaries to support address config 2016-07-21 17:04:57 -04:00
Al
9efc2d4d79 [addresses] Adding ability to determine unit numbers using a known floor number 2016-07-21 17:04:57 -04:00
Al
6fc18b9adb [addresses] Roman numerals can be returned by Floor.random, relaxing the Zipfian distribution on floors so we get higher floors 2016-07-21 17:04:57 -04:00
Al
0b87a5387d [dictionaries] Adding another level type for Russian 2016-07-21 17:04:57 -04:00
Al
57c823088d [dictionaries] A few more Spanish additions 2016-07-21 17:04:57 -04:00
Al
4ce3d76074 [dictionaries] & to Swedish cross streets 2016-07-21 17:04:57 -04:00