1b50bc4986[numex] Croatian numex
Al
2015-07-12 04:24:18 -04:00
a0b0034491[dictionaries] Croatian dictionaries
Al
2015-07-12 04:24:04 -04:00
ca6876165a[numex] Bulgarian numex
Al
2015-07-12 03:38:37 -04:00
55f1e6e391[dictionaries] Bulgarian dictionaries
Al
2015-07-12 03:38:28 -04:00
d302a6ed65[numex] Lithuanian numex
Al
2015-07-12 03:23:46 -04:00
bfd5155b4b[dictionaries] Lithuanian dictionaries
Al
2015-07-12 03:07:27 -04:00
3615fe3e51[numex] Latvian numex
Al
2015-07-12 03:07:15 -04:00
fbe3458705[dictionaries] Latvian dictionaries
Al
2015-07-12 03:07:06 -04:00
a912cd4a87[numex] Slovenian numex
Al
2015-07-12 02:33:43 -04:00
5b1084ff5c[dictionaries] Slovenian dictionaries
Al
2015-07-12 02:33:28 -04:00
13cf8f84c6[numex] Slovakian numex
Al
2015-07-12 02:33:04 -04:00
9fae6503f5[dictionaries] Slovakian dictionaries
Al
2015-07-12 02:32:50 -04:00
e2dc5e0cb7[numex] Romanian numex
Al
2015-07-11 04:26:47 -04:00
bc412d8d9c[dictionaries] Romanian dictionaries
Al
2015-07-11 04:26:33 -04:00
344c6a9ff9[numex] Estonian numex
Al
2015-07-11 04:09:28 -04:00
c396393285[dictionaries] Estonian dictionaries
Al
2015-07-11 04:09:17 -04:00
5731ef4c2c[numex] Greek numex
Al
2015-07-11 02:57:00 -04:00
a2a317192f[dictionaries] Greek dictionaries
Al
2015-07-11 02:56:45 -04:00
70639c306b[numex] Ukrainian numex
Al
2015-07-11 02:56:30 -04:00
f5fd447af9[dictionaries] Ukrainian dictionaries
Al
2015-07-11 02:14:55 -04:00
94a6952869[numex] Hungarian numex rules
Al
2015-07-11 02:14:39 -04:00
04981b1466[dictionaries] Hungarian dictionaries
Al
2015-07-11 02:14:23 -04:00
cc46657eee[dictionaries] Additions to Russian dictionaries
Al
2015-07-10 20:43:22 -04:00
df22000286[numex] genders for Norwegian rules
Al
2015-07-10 20:41:23 -04:00
3b5634f97e[numex] genders for Czech rules
Al
2015-07-10 20:41:08 -04:00
7cba19d7be[dictionaries] Finnish dictionaries
Al
2015-07-10 17:09:21 -04:00
5cf7a2657d[dictionaries] Swedish dictionaries
Al
2015-07-10 17:09:03 -04:00
a8d3c9aefd[numex] Korean numex rules
Al
2015-07-10 17:07:48 -04:00
a8d84b9681[dictionaries] Czech dictionaries (minimal)
Al
2015-07-10 14:55:42 -04:00
394253f1a3[dictionaries] Galician dictionaries
Al
2015-07-10 14:47:36 -04:00
cfa3c0f057[dictionaries] Norwegian dictionaries
Al
2015-07-10 14:47:09 -04:00
033116254e[dictionaries] German concatenated suffix abbreviations have to be followed by a period, otherwise they can be too ambiguous
Al
2015-07-10 14:46:10 -04:00
94766856a7[dictionaries] Updates to Danish dictionaries
Al
2015-07-10 14:45:52 -04:00
23dae9f8d6[numex] Czech numex rules
Al
2015-07-10 14:10:30 -04:00
03690b460f[numex] Norwegian numex rules
Al
2015-07-10 14:10:22 -04:00
f91306ece5[dictionaries] Korean dictionaries
Al
2015-07-10 14:09:58 -04:00
2ba8b79650[dictionaries] Basque dictionaries
Al
2015-07-10 14:09:44 -04:00
4a077218a4[dictionaries] Chinese dictionaries
Al
2015-07-10 14:09:27 -04:00
c9412dff8e[dictionaries] Japanese dictionaries
Al
2015-07-10 14:09:17 -04:00
ab588fe7fb[dictionaries] Danish dictionaries
Al
2015-07-10 14:08:58 -04:00
b9736e3070[dictionaries] Polish dictionaries
Al
2015-07-10 14:08:42 -04:00
1f924cea31[dictionaries] Russian dictionaries
Al
2015-07-10 14:08:29 -04:00
9541c8e5dd[dictionaries] no longer assuming that we'll be stripping internal periods in non-acronyms e.g. v.le for viale in Italian
Al
2015-07-09 18:47:48 -04:00
fbef0a15fe[geodb] Adding sparkey dependency
Al
2015-07-09 15:26:11 -04:00
4f1b4756d0[geodb] Adding builder program (requires 11GB disk space and ~4GB RAM to build, but only ~300MB RAM to use after building)
Al
2015-07-09 15:25:29 -04:00
8889a5c0c3[geodb] GeoDB memory allocation and I/O
Al
2015-07-09 15:01:06 -04:00
2d5641892a[config] lower Bloom filter error rate
Al
2015-07-09 14:59:23 -04:00
ec1e820268[parsing] Changing to OpenCageData repo
Al
2015-07-09 13:44:14 -04:00
20c6436e6d[geodisambig] Return success if admin1/admin2 IDs are 0
Al
2015-07-09 04:19:41 -04:00
20303ad94f[geohash] Adding bounds checks from python-geohash
Al
2015-07-09 04:13:53 -04:00
722904ce59[fix] geoname_clear needs to clear feature code as well
Al
2015-07-09 03:08:52 -04:00
14500f8c7e[config] Adding GeoDB default bloom filter size and error rate
Al
2015-07-08 20:50:46 -04:00
0e2a0aa56d[geodisambig] adding new methods to header
Al
2015-07-08 19:05:08 -04:00
ce54a2146b[fix] geo disambiguation features
Al
2015-07-08 19:03:39 -04:00
fc32a66d95[fix] geonames I/O
Al
2015-07-08 19:02:45 -04:00
8c02073b54[geonames] Adding country_geonames_id to both geoname and postal code structs
Al
2015-07-08 18:44:14 -04:00
9af0b0ab65[geodisambig] adding a few more features to geonames disambiguation
Al
2015-07-08 18:43:28 -04:00
e64b6c3398[geonames] NULL language and official language canonical should have the same sort value
Al
2015-07-08 17:03:51 -04:00
742079cc6a[geonames] Re-generating postal/geonames fields headers
Al
2015-07-08 17:02:59 -04:00
b76f9e47d1[utils] max string size for int8_t and int16_t
Al
2015-07-08 16:46:12 -04:00
c0a5607f5e[fix] Adding NUM_BOUNDARY_TYPES for enumeration purposes
Al
2015-07-08 16:43:57 -04:00
4a2be72350[geonames] Adding language priorities for sorting (official language names, canonical names, abbreviations, historical)
Al
2015-07-08 16:42:42 -04:00
95a6845a85[i18n] Adding regional languages as valid country languages
Al
2015-07-08 14:54:00 -04:00
400c23cb5a[fix] tabs
Al
2015-07-08 14:53:16 -04:00
ef1ecb97f7[geonames] Adding geonames_id for countries in places/postal codes. For postal codes, sorting desc by country population (10013 is a postal code in Italy but will default to US with no other information)
Al
2015-07-08 13:25:03 -04:00
6cc677ac0b[geonames] Adding defaults to schema and another index on country code
Al
2015-07-08 13:16:01 -04:00
24835fd088[geonames] namespace specificity
Al
2015-07-07 03:38:48 -04:00
af1a5f6213[trie] trie_set_data_node method
Al
2015-07-07 03:38:17 -04:00
53908ac604[config] Adding geonames dir as a separate #define
Al
2015-07-06 17:09:02 -04:00
c4fd48e7f7[config] geodb dir
Al
2015-07-06 16:55:11 -04:00
e7a3987656[geodisambig] renaming module
Al
2015-07-06 16:53:53 -04:00
d7f73e62f1[utils] Adding cstring_array_clear method
Al
2015-07-06 12:48:26 -04:00
0df816fd31[geodisambig] Helper methods to add features for a given geoname/postal_code
Al
2015-07-06 12:41:05 -04:00
0c5e741bb6[geonames] Adding LC_ALL environment variable for utf8 sorting
Al
2015-07-06 00:39:23 -04:00
6ff91fef6b[normalization] adding a normalize_string_latin method
Al
2015-07-05 23:38:01 -04:00
acd5d07d17[geonames] Storing NFD normalized names and sorting case-insensitive in order to group everything with the same normalized name together
Al
2015-07-05 15:56:46 -04:00
a08d59c277[fix] NFD normalization should be the default in normalize.c, not NFKD: NFKD does some unwanted things like converting superscripts, and the Latin-ASCII transliterator does a better, more thorough job while staying faithful to the original string
Al
2015-07-05 15:28:07 -04:00
47ed2e58fd[geodisambig] feature functions for GeoNames disambiguation
Al
2015-07-04 10:35:56 -04:00
20a8b9611d[fix] Removing feature length variables from geonames.c
Al
2015-07-04 10:33:08 -04:00
3f07cc6c71[geohash] Modified geohash implementation (based on python-geohash) with no mallocs
Al
2015-07-04 00:30:35 -04:00
f825dcb939[geonames] Fixing admin table DDL
Al
2015-07-03 05:54:41 -04:00
4fd4fa7dca[fix] moving int string size constants to string_utils.h
Al
2015-07-02 17:50:09 -04:00
055e6d8905[fix] typo in constant
Al
2015-07-02 16:12:24 -04:00
e273caac22[geonames] generated postal code TSV fields
Al
2015-07-02 16:00:06 -04:00
fd28ee27bf[geonames] generated geonames TSV fields
Al
2015-07-02 15:59:54 -04:00
86b23ecca3[fix] field name
Al
2015-07-02 15:59:11 -04:00
6cfbab9969[normalization] string normalization module for tokens and full strings
Al
2015-07-01 14:52:28 -04:00
46e51ae91e[transliterate] no need to strdup transliterator names if they are lowercased, breaking on NUL byte
Al
2015-07-01 14:51:22 -04:00
b58877ec6c[utils] string_is_lower/string_is_upper method
Al
2015-07-01 14:49:17 -04:00
58c6ff104a[fix] Russian feminine ordinals
Al
2015-07-01 13:57:37 -04:00
d0db015667[geodisambig] Adding new fields to geonames struct, plus I/O
Al
2015-07-01 13:02:00 -04:00
af56c3cd09[config] constants
Al
2015-07-01 13:01:22 -04:00
fa643f7a3a[utf8] Moving language length constant
Al
2015-06-30 19:17:20 -04:00
071d6bb392[geodisambig] Adding presence of a Wikipedia link to the GeoNames output (an unqualified entry for the name in Wikipedia usually indicates a primary meaning). Ranking ambiguous entries for each term so that the top entry is selected if no further information is available
Al
2015-06-30 18:00:07 -04:00
8d64c9301e[transliteration] Re-generating transliteration data file
Al
2015-06-29 15:03:56 -04:00
a580ed0b1b[transliteration] Adding numeric HTML escapes e.g. '&'
Al
2015-06-29 15:02:34 -04:00
3279b31b09[tokenization] Adding an acronym token type for things like U.N. so we can delete internal periods on those tokens
Al
2015-06-29 03:00:46 -04:00
47efce4b7e[transliteration] Stopping set check loop on empty transition
Al
2015-06-28 20:46:23 -04:00
cc0401a8d1[utf8] Adding a boolean struct member for string_script_t return values, set to true if the string is ASCII (no transliteration needed, should be frequent for English addresses)
Al
2015-06-28 19:37:53 -04:00
f0bf7e750c[transliteration] Fixing edge case in transliteration where a naked character fails context matching but the set-wrapped version matches
Al
2015-06-28 15:19:19 -04:00