2216 Commits

Author SHA1 Message Date
Al
24e439fbca [dictionaries] Bostad in Swedish (used in Finland) 2016-07-21 17:04:57 -04:00
Al
f9cc96d879 [addresses] Finnish address config 2016-07-21 17:04:57 -04:00
Al
3862a3172e [dictionaries] Finnish dictionaries to support address config 2016-07-21 17:04:57 -04:00
Al
a0ac1e203b [addresses] Estonian address config 2016-07-21 17:04:57 -04:00
Al
a86c2f491c [dictionaries] Estonian dictionaries to support address config 2016-07-21 17:04:57 -04:00
Al
2a10dd16d5 [fix] Afrikaans expansion 2016-07-21 17:04:57 -04:00
Al
fcfd28f23a [fix] Fixes to address configs 2016-07-21 17:04:57 -04:00
Al
0b119becaf [numex] Estonian ordinal indicators are just . 2016-07-21 17:04:57 -04:00
Al
8023c1e86a [numex] Finnish ordinals can also use . 2016-07-21 17:04:57 -04:00
Al
4412ba1177 [test] Adding tests for address configs 2016-07-21 17:04:57 -04:00
Al
d3a6a032ab [fix] a few errors with non-numbers in numeric_phrase 2016-07-21 17:04:57 -04:00
Al
be5fd79a48 [expansion] Prefix/suffix expansions by default can apply to ADDRESS_ANY but also inherit the types of any dictionary that lists their canonical form (so we can add suffixes without worrying about whether they're for streets or place names, etc.) 2016-07-21 17:04:57 -04:00
Al
8072b01023 [dictionaries] Adding concatenated suffixes to street types, adding universitat as a suffix 2016-07-21 17:04:57 -04:00
Al
96d4c64ebd [addresses] Use bostad in Swedish addresses in Finland 2016-07-21 17:04:57 -04:00
Al
2505afa2b9 [addresses] Adding new configs 2016-07-21 17:04:57 -04:00
Al
dfd29911fd [addresses] Implementing Roman numerals and cardinal/ordinal number spellout in numbering base class 2016-07-21 17:04:57 -04:00
Al
79bc859692 [addresses] Italian address config 2016-07-21 17:04:57 -04:00
Al
46a82aef89 [dictionaries] Italian dictionaries to support sub-building config 2016-07-21 17:04:57 -04:00
Al
579dafc6e0 [addresses] Slovak address config 2016-07-21 17:04:57 -04:00
Al
540e4be7b2 [dictionaries] Slovak dictionaries to support sub-building config 2016-07-21 17:04:57 -04:00
Al
02f19c4df0 [addresses] Czech sub-building config 2016-07-21 17:04:57 -04:00
Al
c16bd2768a [dictionaries] Czech dictionaries to support sub-building config 2016-07-21 17:04:57 -04:00
Al
078aa20930 [numex] ordinal suffixes for Czech/Slovak 2016-07-21 17:04:57 -04:00
Al
6ab8041618 [dictionaries] Ampersand in Polish/Russian 2016-07-21 17:04:57 -04:00
Al
faed055803 [dictionaries] Numero sign in Italian 2016-07-21 17:04:57 -04:00
Al
efa75919e6 [dictionaries] numero sign in French 2016-07-21 17:04:57 -04:00
Al
ee71d94e85 [addresses] Adding Roman numerals to the Polish config for floor numbers 2016-07-21 17:04:57 -04:00
Al
11c6564783 [addresses] Russian address config 2016-07-21 17:04:57 -04:00
Al
7bc459f1a9 [dictionaries] Russian dictionaries to support address configs 2016-07-21 17:04:57 -04:00
Al
53052e6d25 [addresses] Polish address config and dictionary updates 2016-07-21 17:04:57 -04:00
Al
558d643042 [numex] Portuguese ordinals fix 2016-07-21 17:04:57 -04:00
Al
b15675f8cb [addresses/dictionaries] Adding rez-de-chaussée bas and rez-de-chaussée haut in French 2016-07-21 17:04:57 -04:00
Al
d89e9dcd04 [dictionaries] Variations on sin numero for Spanish 2016-07-21 17:04:57 -04:00
Al
ee27dc5ea1 [addresses/dictionaries] Updates to Portuguese configs, variations for Brasil 2016-07-21 17:04:57 -04:00
Al
8a5dd26dbf [numex] Adding method to do cardinal number spellout by hundreds e.g. twenty-three seventeen instead of two thousand three hundred seventeen 2016-07-21 17:04:57 -04:00
Al
eee68d1ca5 [numex] Ordinal spellout using the numex configs 2016-07-21 17:04:57 -04:00
Al
c628b9bee8 [dictionaries] English cross streets 2016-07-21 17:04:57 -04:00
Al
8383d5bb12 [numex] Adding numeric expression spellout in the Python geodata module for generating training data 2016-07-21 17:04:57 -04:00
Al
53ea1c139a [osm/addresses] using new is_numeric in AddressComponents expansion and removing venue names that are identical to the house number 2016-07-21 17:04:57 -04:00
Al
8926293063 [parser/cli] Using NFC normalization on the output in the parser client (closes #30). Optional command-line arg for parser output dir, useful for spot-checking different experiments 2016-07-21 17:04:57 -04:00
Al
44908ff95a [parser] No digit normalization in training data-derived parser phrases (for postcodes, etc.), phrases include the new island type, house number phrases if any are valid. Adjacent words are now full phrases if they are part of a multiword token like a city name. For hyphenated names like Carmel-by-the-Sea, adding a version to the phrase dictionary where the hyphens are replaced with spaces 2016-07-21 17:04:57 -04:00
Al
41ae742285 [fix] tokenized trie search when falling off the trie at the start of a valid phrase 2016-07-21 17:04:57 -04:00
Al
6e60b3bbda [fix] semicolon in #define 2016-07-21 17:04:57 -04:00
Al
0f76c8c631 [dictionaries] Portuguese abbreviations 2016-07-21 17:04:57 -04:00
Al
b8aba86471 [addresses] Implementing unit types which use concatenated floors with offsets for basement (e.g. Norway) 2016-07-21 17:04:57 -04:00
Al
c29d1ad947 [addresses] Implementing number_min_abs_value, number_max_abs_value outside of number_abs_value constraint 2016-07-21 17:04:57 -04:00
Al
589497cb16 [addresses] Adding Portuguese sub-building config 2016-07-21 17:04:57 -04:00
Al
2be41732f8 [dictionaries] Portuguese dictionaries to support sub-building config 2016-07-21 17:04:57 -04:00
Al
1bd62313f4 [dictionaries] Adding e/ to ambiguous in Spanish dictionaries 2016-07-21 17:04:57 -04:00
Al
6b7e4f8515 [dictionaries] Adding No to Germanic-language number synonyms 2016-07-21 17:04:57 -04:00