Al
|
8383d5bb12
|
[numex] Adding numeric expression spellout in the Python geodata module for generating training data
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
53ea1c139a
|
[osm/addresses] using new is_numeric in AddressComponents expansion and removing venue names that are identical to the house number
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
8926293063
|
[parser/cli] Using NFC normalization on the output in the parser client (closes #30). Optional command-line arg for parser output dir, useful for spot-checking different experiments
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
44908ff95a
|
[parser] No digit normalization in training-data-derived parser phrases (for postcodes, etc.). Phrases include the new island type and house number phrases if any are valid. Adjacent words are now full phrases if they are part of a multiword token like a city name. For hyphenated names like Carmel-by-the-Sea, adding a version to the phrase dictionary where the hyphens are replaced with spaces
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
41ae742285
|
[fix] tokenized trie search when falling off the trie at the start of a valid phrase
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6e60b3bbda
|
[fix] semicolon in #define
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
0f76c8c631
|
[dictionaries] Portuguese abbreviations
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b8aba86471
|
[addresses] Implementing unit types which use concatenated floors with offsets for basement (e.g. Norway)
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
c29d1ad947
|
[addresses] Implementing number_min_abs_value, number_max_abs_value outside of number_abs_value constraint
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
589497cb16
|
[addresses] Adding Portuguese sub-building config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
2be41732f8
|
[dictionaries] Portuguese dictionaries to support sub-building config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
1bd62313f4
|
[dictionaries] Adding e/ to ambiguous in Spanish dictionaries
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6b7e4f8515
|
[dictionaries] Adding No to Germanic-language number synonyms
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
619127e4b1
|
[fix] adding back staircase in Swedish sub-building config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
bc70a54b09
|
[addresses] Swedish address config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b622315d0f
|
[addresses] Lower probability of null phrase in Norwegian configs
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
ac22f270bb
|
[dictionaries] Swedish dictionaries to support sub-building config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
d8ddae362f
|
[addresses] venstre in Norway rather than igjen
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
cd9b33983a
|
[addresses] Adding parterre for ground floor in Switzerland
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
a61d9b1548
|
[dictionaries] adding phrases meaning 'near' or 'in' for Norwegian to the dictionaries
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
541fe6c5ac
|
[dictionaries] no standalone level types for Norway
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
06fdf1c532
|
[fix] /underetasje/hovedetasje/ in Norwegian and translating category phrases from Danish
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
0222049b88
|
[addresses] Danish level/unit and entrance/unit combinations
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
03b9825390
|
[addresses/units] Adding special handling for floor phrase + unit concatenation in the unit field (handles bruksenhetsnummer/bolignummer-style addresses in Norway)
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
9d7239d0ad
|
[addresses] Adding null-phrase/null-phrase-alpha-only handling and zero padding to numbered components in sub-building configs
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
420b169d48
|
[addresses] adding nb.yaml to valid configs
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
d50495f609
|
[addresses] null_phrase_alpha_only for phrases like 3o B in Spain
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
52db502929
|
[addresses] Norwegian address configs
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
2831b70747
|
[dictionaries] Norwegian sub-building dictionaries
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b5d4dd6f37
|
[tokenization] Including full-width numbers in numeric tokens
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
02d40c23a6
|
[numex] Norwegian ordinal indicators
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
0136c88629
|
[addresses] Updates to Danish sub-building config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
5834f6b8ed
|
[dictionaries] Updates to Danish sub-building dictionaries
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
23736f2650
|
[fix] return None if there are no ordinal suffixes for a given language
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
a6da72a831
|
[fix] addr:place=
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
ca88ff7f73
|
[osm] Adding railway stations to venues/addresses data sets
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b22d30cb52
|
[addresses] Adding Danish config to parsed configs
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
003c95f9eb
|
[formatting] Adding Danish config to formatter and adjusting continental European template insertions
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b8ae1ad61d
|
[addresses] Danish address config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6f5b0e16a1
|
[dictionaries] Danish sub-building dictionaries
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
1d09060012
|
[fix] adjusting a few probabilities for German
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6861c09caa
|
[addresses/dictionaries] Adding Catalan address config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
4fa8c2aa8e
|
[addresses] Dutch cross streets
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6e4ca716df
|
[fix] Adding sampling for French intersections
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
38e17bd1b2
|
[fix] adding sampling to Spanish intersections
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
72e647902d
|
[fix] name
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
03be909a60
|
[fix] name
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
45e069be6a
|
[dictionaries] Adding suite (sometimes used in Latin America) to Spanish dictionaries; removing entre from stopwords as it's part of the intersections dictionary
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
127883facc
|
[addresses] Spanish intersections, suite
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
14f08e5991
|
[formatting] Adding aliases in formatting config, so e.g. most of the Francophone world can share France's config without that config applying to every French-language address (e.g. Belgium), a generic config can cover continental Europe, etc.
|
2016-07-21 17:04:57 -04:00 |
|