2283 Commits

Author SHA1 Message Date
Al 30b7a9c285 [dictionaries] Croatian dictionary additions to support new config 2016-07-21 17:04:57 -04:00
Al 6d5a8f1221 [addresses] house_number/level combination and Roman numerals for Dutch config 2016-07-21 17:04:57 -04:00
Al 18d6b8c63a [addresses] field combinations, Roman numerals and spellout for Russian config 2016-07-21 17:04:57 -04:00
Al 36f8b65d16 [addresses] Adding combinations and Roman numeral floor numbers into Serbian config 2016-07-21 17:04:57 -04:00
Al 99c1b633ac [test] Printing invalid phrases in address config tests 2016-07-21 17:04:57 -04:00
Al c91950ea6c [osm] Adding OSM file for places stored as nodes. Adding a general venue definition accessible from the geodata Python package. OSM definitions expand simple variables so can reuse/combine definitions in the bash script 2016-07-21 17:04:57 -04:00
Al 79e1d7639b [addresses] Serbian address config 2016-07-21 17:04:57 -04:00
Al 631136e98f [dictionaries] Serbian dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al a92ceedc95 [numex] Serbian numex 2016-07-21 17:04:57 -04:00
Al 48358376bc [fix] no upper casing for Pinyin and Romanized Korean 2016-07-21 17:04:57 -04:00
Al b4d5f80b2a [fix] rename zh_pinyin 2016-07-21 17:04:57 -04:00
Al 6f199ce870 [addresses] Adding Latvian address config 2016-07-21 17:04:57 -04:00
Al 80e1bf65e0 [numex] Adding ordinal suffixes for Latvian and Lithuanian 2016-07-21 17:04:57 -04:00
Al 21447d6520 [dictionaries] Latvian dictionary updates to support the new address config 2016-07-21 17:04:57 -04:00
Al 6d0e5359e7 [addresses] Implementing list-based field combinations 2016-07-21 17:04:57 -04:00
Al eca6fc7de3 [addresses] Implementing whitespace_probability and ordinal_suffix probability for Roman numerals 2016-07-21 17:04:57 -04:00
Al bbba442311 [addresses] Lithuanian address config 2016-07-21 17:04:57 -04:00
Al b12aa209f9 [dictionaries] Lithuanian dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al 60ab307d7a [dictionaries] Adding 'no' form to languages that also use '№' 2016-07-21 17:04:57 -04:00
Al 7f84bb0b5b [addresses] Romanized Korean address config 2016-07-21 17:04:57 -04:00
Al d60eec8874 [addresses] Korean address config 2016-07-21 17:04:57 -04:00
Al e0e279afce [dictionaries] Korean dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al 3de79b11be [dictionaries] Updates to Russian dictionaries 2016-07-21 17:04:57 -04:00
Al e54f73647f [addresses] Chinese Pinyin config 2016-07-21 17:04:57 -04:00
Al 3f388135eb [addresses] Ukrainian address config 2016-07-21 17:04:57 -04:00
Al ccfb9b7974 [dictionaries] Ukrainian dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al a9dc168ad3 [boundaries] boundary mappings for South Korea 2016-07-21 17:04:57 -04:00
Al af11db1488 [addresses] Adding digit spellout and the list form of field combinations to existing configs 2016-07-21 17:04:57 -04:00
Al 64f167f045 [tokenization] Re-generating scanner 2016-07-21 17:04:57 -04:00
Al 81b4a4a1cb [tokenization] Hyphens, etc. between non-ASCII digits (e.g. Unicode full-width numbers) should be single tokens 2016-07-21 17:04:57 -04:00
Al e4d8faab73 [osm] Japanese addresses only use named valid venues, not just anything with a name 2016-07-21 17:04:57 -04:00
Al 068e24a206 [fix] ordinal spellout for numbers which map directly to a simple rule 2016-07-21 17:04:57 -04:00
Al d6c44a0c09 [fix] alternatives lists in config utils 2016-07-21 17:04:57 -04:00
Al 793671d0b9 [addresses] Sample from higher floors in buildings higher than 10 stories since those are relatively rare and we get enough lower numbered floors from random sampling 2016-07-21 17:04:57 -04:00
Al 47f926c4b6 [addresses] Handling digit rewrites (spellout, Roman numerals, etc.) in the base class 2016-07-21 17:04:57 -04:00
Al d97b00b4c1 [addresses] Removing temporary file list and allowing any file ending in .yaml in resources/addresses to be parsed/imported 2016-07-21 17:04:57 -04:00
Al 1e79f31649 [fix] components 2016-07-21 17:04:57 -04:00
Al 0e4859b0d7 [addresses] Chinese address config with variations for Hong Kong, Taiwan, etc. 2016-07-21 17:04:57 -04:00
Al 4a86a5e973 [dictionaries] Chinese dictionaries to support new address config 2016-07-21 17:04:57 -04:00
Al 07e62c8d54 [dictionaries] A few more Japanese dictionaries to support the address config 2016-07-21 17:04:57 -04:00
Al f980038ff6 [dictionaries] código postal also used in some Portuguese-speaking countries e.g. Angola, Mozambique 2016-07-21 17:04:57 -04:00
Al 8629cb1e30 [dictionaries] Abbreviation for apartado in Portugal to match Spanish 2016-07-21 17:04:57 -04:00
Al 738701f734 [dictionaries] Adding railway station token to Japanese qualifiers 2016-07-21 17:04:57 -04:00
Al eae9890746 [addresses] Japanese Romaji address config 2016-07-21 17:04:57 -04:00
Al ce7f7600e4 [addresses] Japanese address config 2016-07-21 17:04:57 -04:00
Al 8ab1a0ae11 [dictionaries] Updates to Japanese qualifiers 2016-07-21 17:04:57 -04:00
Al 2d35b89345 [addresses] Using Digits.rewrite in unit generation as well as adding a new config option for generating positive numbers only 2016-07-21 17:04:57 -04:00
Al bbeb9a14ca [addresses] Using Digits.rewrite for entrance, staircase, floor numbers, and PO boxes 2016-07-21 17:04:57 -04:00
Al 4d0506a295 [addresses] Adding Digits, which allows for replacing numbers with their unicode full-width equivalents or doing number spellout 2016-07-21 17:04:57 -04:00
Al ed77ceead3 [addresses] Adding some of the new configs and returning None if no phrase alternatives exist 2016-07-21 17:04:57 -04:00
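
Note on the digit rewrites mentioned in 4d0506a295 and 47f926c4b6: the log does not show how Digits.rewrite is implemented, so the following is only a rough sketch of two of the rewrites the messages name (Unicode full-width digits and Roman numerals). The function names `to_full_width` and `to_roman` are hypothetical and are not taken from the repository.

```python
# Hypothetical helpers for illustration only; not the repository's Digits class.

FULL_WIDTH_ZERO = 0xFF10  # code point of the full-width digit zero


def to_full_width(number_text):
    """Replace ASCII digits with their Unicode full-width equivalents.

    Non-digit characters (hyphens, letters, spaces) are left unchanged.
    """
    return "".join(
        chr(FULL_WIDTH_ZERO + ord(ch) - ord("0")) if "0" <= ch <= "9" else ch
        for ch in number_text
    )


# Value/symbol pairs in descending order, including subtractive forms.
ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Convert an integer in the range 1..3999 to a Roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("Roman numerals here cover 1..3999 only")
    parts = []
    for value, symbol in ROMAN_VALUES:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)
```

Under these assumptions, a floor number such as 14 could be rendered as `to_roman(14)` or `to_full_width("14")`; whether and how often each form is used would be driven by the per-language address configs, which this sketch does not model.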