aeddcb5606[openaddresses] OpenAddresses config specifying a few files
Al
2016-05-31 01:40:21 -04:00
fb80a345c5[openaddresses] Fetch script for OpenAddresses
Al
2016-05-31 01:39:04 -04:00
4360f8b698[addresses] Making address_language a classmethod
Al
2016-05-31 01:20:05 -04:00
4e66f81d1b[intersections] Only requiring a tag to share at least two ways
Al
2016-05-30 23:10:04 -04:00
6f34559f2d[intersections] Adding intersections to config
Al
2016-05-30 23:08:00 -04:00
ce55fb2a34[fix] name
Al
2016-05-30 23:06:45 -04:00
27d2a05a27[fix] input file
Al
2016-05-30 22:12:56 -04:00
eff7ad9163[fix] args
Al
2016-05-30 22:12:39 -04:00
b1c81f9405[fix] add ways db dir
Al
2016-05-30 22:07:01 -04:00
01fd66f4eb[fix] name
Al
2016-05-30 22:01:17 -04:00
2dcdc741a2[fix] logging for intersections data
Al
2016-05-30 22:00:28 -04:00
ebd8101b57[fix] import
Al
2016-05-30 22:00:14 -04:00
fcf55720dc[fix] import
Al
2016-05-30 21:58:12 -04:00
3825c36523[intersections] intersections training data
Al
2016-05-30 21:50:45 -04:00
8cb0c5ee8b[intersections] Adding places to intersection template, intersection phrase generator
Al
2016-05-30 21:07:14 -04:00
006d15dbac[fix] import
Al
2016-05-30 14:53:55 -04:00
5c92185e71[tokenization] Reverting commit for tokenizing initial/final apostrophes as part of words as it may be more effective to handle during post-processing
Al
2016-05-30 11:59:37 -04:00
b23f07b679[parser] Using new geonames designations in parser features
Al
2016-05-29 01:40:45 -04:00
bbddfe25bf[parser] Using NFC normalization for parser as well, @ sign not defined as separator since it may also be used in intersections
Al
2016-05-29 01:37:38 -04:00
1ac077914b[geodb] Adding separate bitset for geonames place types and using NFC normalization instead of NFD (requires retraining)
Al
2016-05-29 01:36:00 -04:00
1d1ada1bc1[normalize] Adding NORMALIZE_STRING_COMPOSE for NFC unicode normalization
Al
2016-05-28 19:25:12 -04:00
1fd57fdda3[tokenization] Adding ability to tokenize 's Gravenhage
Al
2016-05-28 19:24:19 -04:00
514aaf7377[fix] warnings/size_t in libpostal.c
Al
2016-05-28 19:19:31 -04:00
c0e8578b9c[gazetteers] Adding new gazetteer types/address components
Al
2016-05-28 19:19:18 -04:00
acd97a0081[dictionaries] Adding letra to Spanish numbered unit dictionaries
Al
2016-05-28 19:15:02 -04:00
bac86be6a3[dictionaries] Adding new dictionary types to generator script
Al
2016-05-28 17:16:43 -04:00
cff23c77ab[boundaries] Adding Bucharest sectors as city_district
Al
2016-05-27 20:22:56 -04:00
5e0e22a666[dictionaries] More dictionary refactoring
Al
2016-05-27 19:40:20 -04:00
5590c89a5e[addresses] Allowing null_phrase_probability for alpha, and alpha+digits instead of just for ordinals (mostly for Spain)
Al
2016-05-27 13:40:38 -04:00
bdd6d99f56[addresses] Adding increasing null_phrase_probability for plain numerics in Spain so things like 2o B make it into the training data
Al
2016-05-27 13:37:43 -04:00
cc453cfbbd[places] setting probability of including island to 0.5 for Hawaii, 0.8 seems too high given all the Honolulu, HI addresses (not often seen as Honolulu, Oahu, HI)
Al
2016-05-27 11:32:52 -04:00
f69d9e2e1c[dictionaries] Italian CAP abbreviations
Al
2016-05-27 11:31:16 -04:00
fc96cf145f[dictionaries] Russian place names
Al
2016-05-27 11:28:50 -04:00
ec0df1410b[dictionaries] Adding more fleshed-out Greek dictionaries from a recent Nominatim NameFinder wiki update
Al
2016-05-27 11:28:23 -04:00
dccbdc4ccc[dictionaries] Refactoring existing unit_types/level_types dictionaries to use the new more granular dictionary structure
Al
2016-05-27 11:27:34 -04:00
572759885f[parser] Sample chain store alternate names from the cross-language dictionary
Al
2016-05-26 12:09:10 -04:00
5daa64faef[parser] Fixing config keys so OSM streets/venues get abbreviated. Selecting namespaced address fields in cases like Brussels or Hong Kong where everything is bilingual. Adding the ability to pass a known language into address component expansion
Al
2016-05-26 12:05:46 -04:00
206a471732[fix] loading transliteration module in address_parser_test.c as well
Al
2016-05-25 19:54:01 -04:00
34f5d833a2[fix] ON needs to be quoted in YAML, uppercase Yukon abbreviation
Al
2016-05-25 19:12:15 -04:00
f59150b047[fix] cstring_array_split calls
Al
2016-05-25 17:58:30 -04:00
5065917f41[fix] brace
Al
2016-05-25 17:52:00 -04:00
679d3efcdc[parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent
Al
2016-05-25 17:50:29 -04:00
370744ccfd[utils] Adding cstring_array_split_ignore_consecutive
Al
2016-05-25 17:07:20 -04:00
5c7d24c71b[fix] calls and NULL checks
Al
2016-05-25 15:50:53 -04:00
349df20720[fix] tokenized_string_t should copy its source string
Al
2016-05-25 15:47:57 -04:00
00784a897d[fix] Need to load transliteration module for Latin-ASCII normalization
Al
2016-05-25 15:25:34 -04:00
bf50d27b0e[places] Adding Town of to English prefixes
Al
2016-05-25 11:23:31 -04:00
5a88294dbc[parser] lower full-name probability for states
Al
2016-05-25 00:47:36 -04:00
5377a831ab[fix] use simple language code if language_script cannot be found
Al
2016-05-24 19:49:08 -04:00
a4064ecd02[fix] global formatter config
Al
2016-05-24 19:44:40 -04:00
3661a1e5eb[fix] config key name
Al
2016-05-24 19:39:12 -04:00
26bbd2916b[fix] neighborhood reverse geocoder using the new OSM definitions module which keeps track of whatever the data fetching script defines as being a valid {neighborhood, admin boundary, etc.}
Al
2016-05-24 19:27:22 -04:00
1a66fc3396[boundaries] lines sharing a point are added to the polygon head-to-tail, reversing the node order as needed, which produces accurate OSM polygons for reverse geocoding lookups
Al
2016-05-24 19:24:37 -04:00
206cd56cd2[fix] moving language code replacements out of address components
Al
2016-05-24 16:55:46 -04:00
c4aebeebc3[boundaries] admin_level=8 is city_district in Japan
Al
2016-05-24 16:53:42 -04:00
bdb6bb03e3[formatting] Moving language country overrides to formatter config so actual language is retained
Al
2016-05-24 16:52:08 -04:00
97582e9c64[fix] place=municipality
Al
2016-05-24 15:35:33 -04:00
6af06d904a[fix] OSM neighborhood ids
Al
2016-05-24 15:13:07 -04:00
c4eab01176[fix] Adding basic Han numeral replacement to neighborhood deduping
Al
2016-05-24 14:55:54 -04:00
a5a24fb3b9[fix] component bitsets
Al
2016-05-24 13:07:32 -04:00
cf2bbcb4e0[fix] language format changes only apply to local languages
Al
2016-05-24 12:59:32 -04:00
bb2da53311[formatting] Increase probability of postcode before city
Al
2016-05-24 12:21:04 -04:00
aedb249ad7[languages] Use English formats for Romanized CJK
Al
2016-05-24 12:13:58 -04:00
7186cf13de[fix] floor samples
Al
2016-05-24 11:16:57 -04:00
eb83ae91cb[fix] Don't remove chome from Japanese, as the neighborhoods are usually just plain numbers
Al
2016-05-23 18:17:04 -04:00
028b7a460e[fix] args
Al
2016-05-23 17:42:34 -04:00
48a41eaceb[fix] US/Canada probabilities for industrial/commercial
Al
2016-05-23 16:22:27 -04:00
f2f98043ab[boundaries] Adding CP and civil parish to English place suffixes
Al
2016-05-23 15:47:57 -04:00
32e017a3ab[osm] Venue name depends on one of {house_number, road, suburb, city_district, city, postcode}
Al
2016-05-23 15:46:59 -04:00
5f78d4f3a0[fix] Spanish office probabilities
Al
2016-05-23 15:35:55 -04:00
698804b230[fix] floors
Al
2016-05-23 15:18:10 -04:00
b8e43fa7f8[fix] args again
Al
2016-05-23 15:01:58 -04:00
d6c11dde0f[fix] args
Al
2016-05-23 14:59:22 -04:00
1e2ffd9847[subdivisions/buildings] Adding subdivisions and buildings rtree to training data for getting building height, zone
Al
2016-05-23 14:51:44 -04:00
dbc41a931b[subdivisions] Adding zone types
Al
2016-05-23 14:45:55 -04:00
edff5b9730[fix] removing unnecessary vars
Al
2016-05-23 13:04:25 -04:00
b0f49db9be[fix] all_names returns a list not a set
Al
2016-05-23 13:04:00 -04:00
f20cff3b2a[osm] venue names
Al
2016-05-23 12:51:28 -04:00
85b3532333[fix] language disambiguation
Al
2016-05-23 11:54:36 -04:00