Al
|
979fd16215
|
[osm] adding airports and terminals data sets with points and polygons, more file cleanup in OSM fetch script
|
2017-01-10 16:20:32 -05:00 |
|
Al
|
4bdfe5ba1d
|
[openaddresses] add Habersham County, GA
|
2017-01-10 16:19:31 -05:00 |
|
Al
|
49fdb29e16
|
[openaddresses] add Swedish municipalities of Malmö, Vaxholm, Växjö, and Helsingborg
|
2017-01-10 12:23:06 -05:00 |
|
Al Barrentine
|
577f26e418
|
Merge pull request #154 from openvenues/setup_datadir_functions
Setup datadir functions
|
2017-01-09 16:52:07 -05:00 |
|
Al
|
bbc91722cb
|
[version] bump version to 0.3.3
|
2017-01-09 16:14:07 -05:00 |
|
Al
|
a3506131fe
|
[build] adding libpostal_setup_datadir, libpostal_setup_parser_datadir, libpostal_setup_language_classifier_datadir functions for configuring the datadir at runtime
|
2017-01-09 16:11:26 -05:00 |
|
Al
|
953a26e54e
|
[utils] char_array_add_vjoined to stay consistent (add_* methods NUL terminate)
|
2017-01-09 16:10:07 -05:00 |
|
Al
|
7a8f94330b
|
[parser] only adding ngrams in a hyphenated word if the subword is not rare
|
2017-01-09 02:53:33 -05:00 |
|
Al
|
00cf936460
|
[openaddresses] adding Nordrhein-Westfalen, Germany
|
2017-01-08 12:48:45 -05:00 |
|
Al
|
86c7b7f3fe
|
[addresses] no longer normalizing slashes in boundary names for places that have multilingual names, etc.
|
2017-01-08 12:41:51 -05:00 |
|
Al
|
a6d94f998b
|
[addresses] stripping parentheticals in admin boundary names as sometimes cities in e.g. Switzerland are like Oberwil (ZG) in OSM
|
2017-01-08 03:43:22 -05:00 |
|
Al
|
e10c156176
|
[dictionaries] adding BL as an abbreviation for Boulevard
|
2017-01-07 20:22:03 -05:00 |
|
Al
|
828b67d4f7
|
[osm] adding some new training data for simple road names and their surrounding admin boundaries
|
2017-01-07 15:34:43 -05:00 |
|
Al Barrentine
|
a2b84a0177
|
[docs][ci skip] Adding parser label definitions to the README
|
2017-01-07 14:17:31 -05:00 |
|
Al
|
83e38d9a8c
|
[openaddresses] add OSM boundaries for Milwaukee county as many of the cities appear to be IDs
|
2017-01-07 01:42:46 -05:00 |
|
Al
|
eab629802c
|
[openaddresses] removing pre_release_downloads as they're all in master now, adding city_replacements for all data sets where OSM boundaries are used
|
2017-01-07 01:39:11 -05:00 |
|
Al
|
69f1137532
|
[openaddresses] adding city_replacements for Lake County, FL
|
2017-01-07 00:35:12 -05:00 |
|
Al
|
c025b0f7d4
|
[openaddresses] adding correct state for Glarus, Switzerland, ignoring city in Milwaukee if it's purely numeric
|
2017-01-07 00:01:46 -05:00 |
|
Al
|
d51f9dbb0e
|
[addresses] stripping unit phrases from streets in OpenAddresses as well, return value wasn't getting used before
|
2017-01-06 10:19:08 -05:00 |
|
Al
|
cfdef1788c
|
[addresses] stripping unit from street using the libpostal dictionaries in all the address data sets. Happens surprisingly often in OpenStreetMap as well as OpenAddresses
|
2017-01-06 10:06:23 -05:00 |
|
Al
|
3fbd4426b7
|
[openaddresses] adding Swiss cantons of Grigioni/Graubünden, Glarus, Uri, and Schwyz
|
2017-01-06 08:55:32 -05:00 |
|
Al
|
9c14d47f24
|
[openaddresses] adding Campbell and Pendleton County, KY and San Benito County, CA
|
2017-01-06 02:41:29 -05:00 |
|
Al Barrentine
|
2b3a6f663e
|
Merge pull request #152 from rinigus/master_rpc_malloc
changes required for cross-compilation of ARM target
|
2017-01-05 17:12:51 -05:00 |
|
Al
|
321f2034d2
|
[fix] unidata file
|
2017-01-05 04:24:33 -05:00 |
|
Al
|
7a31802a04
|
[fix] also fix german-ascii transliteration on uppercase U with umlaut
|
2017-01-05 04:07:29 -05:00 |
|
Al
|
25723fcea2
|
[transliteration] making the custom rules in transliteration less repetitious and accessible from elsewhere, removing string names for common transliterators and using constants
|
2017-01-05 04:06:51 -05:00 |
|
Al
|
3fcaae3dbc
|
[openaddresses] add Canton of Solothurn, Switzerland
|
2017-01-05 02:23:20 -05:00 |
|
Al
|
4182123fa6
|
[openaddresses] adding Schaffhausen, also adding language=de for the last few cantons
|
2017-01-05 01:40:30 -05:00 |
|
Al
|
72e6bf043b
|
[openaddresses] add Basel-Stadt, Switzerland
|
2017-01-05 01:26:20 -05:00 |
|
Al
|
3d16c20d24
|
[openaddresses] add Boyd County, KY
|
2017-01-05 01:25:41 -05:00 |
|
Rinigus
|
26aeb0ebec
|
drop AC_FUNC_MALLOC and _REALLOC and check for them as regular functions; add extra cflags for scanner
|
2017-01-05 07:34:24 +02:00 |
|
Al
|
c5cca4c82f
|
[openaddresses] add Canton of Basel-Landschaft, Switzerland
|
2017-01-04 02:34:15 -05:00 |
|
Al
|
3e7042597e
|
[openaddresses] adding Jamaica countrywide to OpenAddresses config
|
2017-01-04 02:32:41 -05:00 |
|
Al
|
bcd61ffbe8
|
[formatting] moving postcode to the beginning of the address only in countries using the continental European conventions. Creates more ambiguity than is worthwhile in the US, etc. when, say, house_number is removed from a training example and the postcode is inserted first (could very easily be a house_number)
|
2017-01-03 03:39:16 -05:00 |
|
Al
|
38e147d210
|
[fix] address configs for Greek/Hebrew
|
2017-01-03 03:07:53 -05:00 |
|
Al
|
de2dffa315
|
[addresses] adding Calle to purely numeric Spanish street names in OSM as well
|
2017-01-02 23:41:01 -05:00 |
|
Al
|
ccd555d020
|
[transliteration] regenerated transliteration_scripts_data.c
|
2017-01-02 13:52:48 -05:00 |
|
Al
|
600b40d2f6
|
[transliteration] adding german-ascii transliteration to Estonian to handle umlauts (ä => ae, etc.)
|
2017-01-02 13:51:56 -05:00 |
|
Al
|
b2b7f6f155
|
[osm] add wikipedia:* to rail station exception
|
2017-01-02 13:13:42 -05:00 |
|
Al
|
a99a1e759e
|
[openaddresses] adding Rio de Janeiro, Stockholm, and Liechtenstein. Adding higher CLDR country probability for smaller countries
|
2017-01-02 03:29:36 -05:00 |
|
Al
|
77035fbdbd
|
[strings] adding utf8_is_whitespace to the header so it can be referenced from multiple files
|
2017-01-02 02:23:21 -05:00 |
|
Al
|
400ea589ef
|
[normalize] add NORMALIZE_STRING_SIMPLE_LATIN_ASCII option to pynormalize
|
2017-01-02 02:08:54 -05:00 |
|
Al
|
182976214c
|
[logging] converting most of the steps in building the transliteration table to use debug logging
|
2017-01-02 00:41:11 -05:00 |
|
Al
|
d8d3840700
|
[transliteration] constant for the html-escape transliterator
|
2017-01-02 00:40:12 -05:00 |
|
Al
|
4ad3a52fe1
|
[strings] fix lowercasing in string_utils.c
|
2017-01-01 20:08:34 -05:00 |
|
Al
|
a78937f265
|
[normalize] use the new utf8proc lowercasing (as opposed to case folding), free copies since none of the string functions operate in-place any more, add minimal HTML escaping transliterator even to ASCII text
|
2017-01-01 20:06:32 -05:00 |
|
Al
|
5c56a44faa
|
[strings] reverting to utf8proc v1.3.1, as 2.0 and above can chop off certain sequences
|
2017-01-01 20:03:23 -05:00 |
|
Al
|
fe88630f78
|
[dictionaries] regenerating address_expansion_data.c from upstream changes
|
2017-01-01 14:26:54 -05:00 |
|
Al
|
101bbcc02d
|
Merge remote-tracking branch 'origin/master' into parser-data
|
2017-01-01 14:25:37 -05:00 |
|
Travis
|
d61e90a33d
|
[auto][ci skip] Adding data files from Travis build #188
|
2017-01-01 19:20:54 +00:00 |
|