2151 Commits

Author SHA1 Message Date
Al 2831b70747 [dictionaries] Norwegian sub-building dictionaries 2016-07-21 17:04:57 -04:00
Al b5d4dd6f37 [tokenization] Including full-width numbers in numeric tokens 2016-07-21 17:04:57 -04:00
Al 02d40c23a6 [numex] Norwegian ordinal indicators 2016-07-21 17:04:57 -04:00
Al 0136c88629 [addresses] Updates to Danish sub-building config 2016-07-21 17:04:57 -04:00
Al 5834f6b8ed [dictionaries] Updates to Danish sub-building dictionaries 2016-07-21 17:04:57 -04:00
Al 23736f2650 [fix] return None if there are no ordinal suffixes for a given language 2016-07-21 17:04:57 -04:00
Al a6da72a831 [fix] addr:place= 2016-07-21 17:04:57 -04:00
Al ca88ff7f73 [osm] Adding railway stations to venues/addresses data sets 2016-07-21 17:04:57 -04:00
Al b22d30cb52 [addresses] Adding Danish config to parsed configs 2016-07-21 17:04:57 -04:00
Al 003c95f9eb [formatting] Adding Danish config to formatter and adjusting continental European template insertions 2016-07-21 17:04:57 -04:00
Al b8ae1ad61d [addresses] Danish address config 2016-07-21 17:04:57 -04:00
Al 6f5b0e16a1 [dictionaries] Danish sub-building dictionaries 2016-07-21 17:04:57 -04:00
Al 1d09060012 [fix] adjusting a few probabilities for German 2016-07-21 17:04:57 -04:00
Al 6861c09caa [addresses/dictionaries] Adding Catalan address config 2016-07-21 17:04:57 -04:00
Al 4fa8c2aa8e [addresses] Dutch cross streets 2016-07-21 17:04:57 -04:00
Al 6e4ca716df [fix] Adding sampling for French intersections 2016-07-21 17:04:57 -04:00
Al 38e17bd1b2 [fix] adding sampling to Spanish intersections 2016-07-21 17:04:57 -04:00
Al 72e647902d [fix] name 2016-07-21 17:04:57 -04:00
Al 03be909a60 [fix] name 2016-07-21 17:04:57 -04:00
Al 45e069be6a [dictionaries] Adding suite to Spanish dictionaries, used sometimes in Latin America, removing entre from stopwords as it's part of the intersections dictionary 2016-07-21 17:04:57 -04:00
Al 127883facc [addresses] Spanish intersections, suite 2016-07-21 17:04:57 -04:00
Al 14f08e5991 [formatting] Adding aliases in formatting config, so e.g. most of the Francophone world shares France's config without needing to be the case for every French address (e.g. Belgium), generic config for continental Europe, etc. 2016-07-21 17:04:57 -04:00
Al 75e9d94684 [dictionaries] Adding case postale to French dictionaries 2016-07-21 17:04:57 -04:00
Al ad7ef082a5 [dictionaries] extended Dutch dictionaries 2016-07-21 17:04:57 -04:00
Al b8a9d15d41 [addresses] Dutch address config 2016-07-21 17:04:57 -04:00
Al 88762a7778 [addresses] German address config numbered units 2016-07-21 17:04:57 -04:00
Al a456262aca [addresses] German categories and cross streets 2016-07-21 17:04:57 -04:00
Al dd7ef6fabf [dictionaries] Making new component for near/nearby prepositions 2016-07-21 17:04:57 -04:00
Al 755976bc16 [dictionaries] Adding new dictionary for prepositions like near/nearby 2016-07-21 17:04:57 -04:00
Al ec44fdaf79 [addresses] case postale for Canada/Switzerland 2016-07-21 17:04:57 -04:00
Al ca39272d18 [addresses] German address config 2016-07-21 17:04:57 -04:00
Al 22be892635 [dictionaries] Updates to German dictionaries 2016-07-21 17:04:57 -04:00
Al 0bbced4966 [fix] subdir config in OpenAddresses formatter 2016-07-21 17:04:57 -04:00
Al fdba7b138d [addresses] Fixes for English/French Canadian apartment numbers 2016-07-21 17:04:57 -04:00
Al 7d5d54bd29 [formatting] Territories use parent country's template insertion probabilities 2016-07-21 17:04:57 -04:00
Al 77a4476b8e [openaddresses] CLDR country names for OpenAddresses training set 2016-07-21 17:04:57 -04:00
Al 7d62a3a762 [fix] gauche 2016-07-21 17:04:57 -04:00
Al afa58e6edb [openaddresses] Removing New Zealand city as the field is not specific enough and may conflict with OSM names, needs to be reverse geocoded. Adding cldr country probabilities so we can add localized names/codes given the country 2016-07-21 17:04:57 -04:00
Al e91b318121 [addresses] French address levels alphanumeric 2016-07-21 17:04:57 -04:00
Al 9059c2af60 [addresses] Don't generate sub-building components at all if there's no house number 2016-07-21 17:04:57 -04:00
Al 9c090302f7 [addresses] Topological sort of address component dependencies so they get checked/removed in order 2016-07-21 17:04:57 -04:00
Al cd7cd292b7 [states] State abbreviations for Brazil and Mexico 2016-07-21 17:04:57 -04:00
Al 90a2f2b2e0 [parser] road has no dependencies 2016-07-21 17:04:57 -04:00
Al 29d16c9c80 [openaddresses] Country code for Belgium, removing Flanders as it has encoding issues, removing region from New Zealand formats as it appears to be conflated with districts 2016-07-21 17:04:57 -04:00
Al 419f5961a5 [fix] unused var 2016-07-21 17:04:57 -04:00
Al 7612e93fdf [addresses] French address config 2016-07-21 17:04:57 -04:00
Al 4b28791bb1 [addresses] Spanish PO box probabilities 2016-07-21 17:04:57 -04:00
Al a57ace0be0 [openaddresses] OpenAddresses training script 2016-07-21 17:04:57 -04:00
Al 64824b90a9 [openaddresses] Only adding units for Australia, as they're known to contain both designator and number. US units seem to often have simple numbers/letters for the unit field 2016-07-21 17:04:57 -04:00
Al 584a4e0ee8 [openaddresses] Added components via OA config 2016-07-21 17:04:57 -04:00