Al
|
eab629802c
|
[openaddresses] removing pre_release_downloads as they're all in master now, adding city_replacements for all data sets where OSM boundaries are used
|
2017-01-07 01:39:11 -05:00 |
|
Al
|
69f1137532
|
[openaddresses] adding city_replacements for Lake County, FL
|
2017-01-07 00:35:12 -05:00 |
|
Al
|
c025b0f7d4
|
[openaddresses] adding correct state for Glarus, Switzerland, ignoring city in Milwaukee if it's purely numeric
|
2017-01-07 00:01:46 -05:00 |
|
Al
|
d51f9dbb0e
|
[addresses] stripping unit phrases from streets in OpenAddresses as well, return value wasn't getting used before
|
2017-01-06 10:19:08 -05:00 |
|
Al
|
cfdef1788c
|
[addresses] stripping unit from street using the libpostal dictionaries in all the address data sets. Happens surprisingly often in OpenStreetMap as well as OpenAddresses
|
2017-01-06 10:06:23 -05:00 |
|
Al
|
3fbd4426b7
|
[openaddresses] adding Swiss cantons of Grigioni/Graubünden, Glarus, Uri, and Schwyz
|
2017-01-06 08:55:32 -05:00 |
|
Al
|
9c14d47f24
|
[openaddresses] adding Campbell and Pendleton County KY and San Benito County, CA
|
2017-01-06 02:41:29 -05:00 |
|
Al Barrentine
|
2b3a6f663e
|
Merge pull request #152 from rinigus/master_rpc_malloc
changes required for cross-compilation of ARM target
|
2017-01-05 17:12:51 -05:00 |
|
Al
|
321f2034d2
|
[fix] unidata file
|
2017-01-05 04:24:33 -05:00 |
|
Al
|
7a31802a04
|
[fix] also fix german-ascii transliteration on uppercase U with umlaut
|
2017-01-05 04:07:29 -05:00 |
|
Al
|
25723fcea2
|
[transliteration] making the custom rules in transliteration less repetitious and accessible from elsewhere, removing string names for common transliterators and using constants
|
2017-01-05 04:06:51 -05:00 |
|
Al
|
3fcaae3dbc
|
[openaddresses] add Canton of Solothurn, Switzerland
|
2017-01-05 02:23:20 -05:00 |
|
Al
|
4182123fa6
|
[openaddresses] adding Schaffhausen, also adding language=de for the last few cantons
|
2017-01-05 01:40:30 -05:00 |
|
Al
|
72e6bf043b
|
[openaddresses] add Basel-Stadt, Switzerland
|
2017-01-05 01:26:20 -05:00 |
|
Al
|
3d16c20d24
|
[openaddresses] add Boyd County, KY
|
2017-01-05 01:25:41 -05:00 |
|
Rinigus
|
26aeb0ebec
|
drop AC_FUNC_MALLOC and _REALLOC and check for them as regular functions; add extra cflags for scanner
|
2017-01-05 07:34:24 +02:00 |
|
Al
|
c5cca4c82f
|
[openaddresses] add Canton of Basel-Landschaft, Switzerland
|
2017-01-04 02:34:15 -05:00 |
|
Al
|
3e7042597e
|
[openaddresses] adding Jamaica countrywide to OpenAddresses config
|
2017-01-04 02:32:41 -05:00 |
|
Al
|
bcd61ffbe8
|
[formatting] moving postcode to the beginning of the address only in countries using the continental European conventions. Creates more ambiguity than is worthwhile in the US, etc. when, say, house_number is removed from a training example and the postcode is inserted first (could very easily be a house_number)
|
2017-01-03 03:39:16 -05:00 |
|
Al
|
38e147d210
|
[fix] address configs for Greek/Hebrew
|
2017-01-03 03:07:53 -05:00 |
|
Al
|
de2dffa315
|
[addresses] adding Calle to purely numeric Spanish street names in OSM as well
|
2017-01-02 23:41:01 -05:00 |
|
Al
|
ccd555d020
|
[transliteration] regenerated transliteration_scripts_data.c
|
2017-01-02 13:52:48 -05:00 |
|
Al
|
600b40d2f6
|
[transliteration] adding german-ascii transliteration to Estonian to handle umlauts (ä => ae, etc.)
|
2017-01-02 13:51:56 -05:00 |
|
Al
|
b2b7f6f155
|
[osm] add wikipedia:* to rail station exception
|
2017-01-02 13:13:42 -05:00 |
|
Al
|
a99a1e759e
|
[openaddresses] adding Rio de Janeiro, Stockholm, and Liechtenstein. Adding higher CLDR country probability for smaller countries
|
2017-01-02 03:29:36 -05:00 |
|
Al
|
77035fbdbd
|
[strings] adding utf8_is_whitespace to the header so it can be referenced from multiple files
|
2017-01-02 02:23:21 -05:00 |
|
Al
|
400ea589ef
|
[normalize] add NORMALIZE_STRING_SIMPLE_LATIN_ASCII option to pynormalize
|
2017-01-02 02:08:54 -05:00 |
|
Al
|
182976214c
|
[logging] converting most of the steps in building the transliteration table to use debug logging
|
2017-01-02 00:41:11 -05:00 |
|
Al
|
d8d3840700
|
[transliteration] constant for the html-escape transliterator
|
2017-01-02 00:40:12 -05:00 |
|
Al
|
4ad3a52fe1
|
[strings] fix lowercasing in string_utils.c
|
2017-01-01 20:08:34 -05:00 |
|
Al
|
a78937f265
|
[normalize] use the new utf8proc lowercasing (as opposed to case folding), free copies since none of the string functions operate in-place any more, add minimal HTML escaping transliterator even to ASCII text
|
2017-01-01 20:06:32 -05:00 |
|
Al
|
5c56a44faa
|
[strings] reverting to utf8proc v1.3.1, as 2.0 and above can chop off certain sequences
|
2017-01-01 20:03:23 -05:00 |
|
Al
|
fe88630f78
|
[dictionaries] regenerating address_expansion_data.c from upstream changes
|
2017-01-01 14:26:54 -05:00 |
|
Al
|
101bbcc02d
|
Merge remote-tracking branch 'origin/master' into parser-data
|
2017-01-01 14:25:37 -05:00 |
|
Travis
|
d61e90a33d
|
[auto][ci skip] Adding data files from Travis build #188
|
2017-01-01 19:20:54 +00:00 |
|
Al Barrentine
|
6048d6a71e
|
Merge pull request #149 from iestynpryce/master
Enhanced the Welsh (cy) language dictionaries.
|
2017-01-01 14:11:16 -05:00 |
|
Al
|
0b5cc96654
|
[transliteration] add decompose option when stripping accents
|
2017-01-01 13:54:20 -05:00 |
|
Al
|
7d6c85aeec
|
[fix] new string tree iterator, don't decrement permutations on rollovers
|
2017-01-01 13:34:08 -05:00 |
|
Al
|
1780c5e053
|
[fix] moving enum
|
2016-12-31 13:01:57 -05:00 |
|
Iestyn Pryce
|
d8ee43156e
|
Enhanced the Welsh (cy) language dictionaries.
|
2016-12-31 09:46:58 +00:00 |
|
Al
|
475aa3dbfa
|
[strings] fixing and simplifying string tree iterator. This version is inspired by Python's itertools.product (itertoolsmodule.c has so many goodies)
|
2016-12-31 03:22:27 -05:00 |
|
Al
|
261ec3888a
|
[strings] header changes for new utf8 lower/upper functions
|
2016-12-31 03:20:43 -05:00 |
|
Al
|
58b063b632
|
[strings] making string_tree_iterator_done more meaningful (returns true if the iterator has no paths left to traverse)
|
2016-12-31 00:54:36 -05:00 |
|
Al
|
8978000320
|
[strings] adding latest utf8proc, new functions for utf8_lower (instead of case folding) and utf8_upper, and a utf8_is_whitespace that takes things like tabs into account
|
2016-12-31 00:52:12 -05:00 |
|
Al
|
db16e656ca
|
[parser/cli] adding .print_features option in address_parser client for debugging
|
2016-12-31 00:20:35 -05:00 |
|
Al
|
bdb51a244e
|
[phrases] fix case in trie search when searching for tokens in a string tail. If we're on the last token in a sequence and the token matches the tail, check that the tail is complete, and if so return the match before exiting the loop. Affects multiword phrases that tend to appear toward the end of a sequence (long country names like "United States of America", etc.)
|
2016-12-29 16:17:09 -05:00 |
|
Al
|
2d077699e6
|
[places] adding is_in property to the set of tags for the places index. This may allow us to make more granular exceptions for node-based places that are actually suburbs but classified as {hamlet, village, locality, town}, etc. if the is_in contains a city that's also a boundary or nearby point
|
2016-12-29 14:04:13 -05:00 |
|
Al
|
cad57b94b2
|
[boundaries] mapping place=hamlet to suburb for all of Malaysia. place=village becomes suburb as well in the urban core
|
2016-12-29 14:01:57 -05:00 |
|
Al
|
21a2a7419a
|
[addresses] only add village as city component if no city can be found in the area
|
2016-12-29 13:41:05 -05:00 |
|
Al
|
8080e16791
|
[openaddresses] adding Joinville, Brazil and adding OSM boundaries for Brazilian address data sets
|
2016-12-29 13:27:49 -05:00 |
|