Commit Graph

  • 1b50bc4986 [numex] Croatian numex Al 2015-07-12 04:24:18 -04:00
  • a0b0034491 [dictionaries] Croatian dictionaries Al 2015-07-12 04:24:04 -04:00
  • ca6876165a [numex] Bulgarian numex Al 2015-07-12 03:38:37 -04:00
  • 55f1e6e391 [dictionaries] Bulgarian dictionaries Al 2015-07-12 03:38:28 -04:00
  • d302a6ed65 [numex] Lithuanian numex Al 2015-07-12 03:23:46 -04:00
  • bfd5155b4b [dictionaries] Lithuanian dictionaries Al 2015-07-12 03:07:27 -04:00
  • 3615fe3e51 [numex] Latvian numex Al 2015-07-12 03:07:15 -04:00
  • fbe3458705 [dictionaries] Latvian dictionaries Al 2015-07-12 03:07:06 -04:00
  • a912cd4a87 [numex] Slovenian numex Al 2015-07-12 02:33:43 -04:00
  • 5b1084ff5c [dictionaries] Slovenian dictionaries Al 2015-07-12 02:33:28 -04:00
  • 13cf8f84c6 [numex] Slovakian numex Al 2015-07-12 02:33:04 -04:00
  • 9fae6503f5 [dictionaries] Slovakian dictionaries Al 2015-07-12 02:32:50 -04:00
  • e2dc5e0cb7 [numex] Romanian numex Al 2015-07-11 04:26:47 -04:00
  • bc412d8d9c [dictionaries] Romanian dictionaries Al 2015-07-11 04:26:33 -04:00
  • 344c6a9ff9 [numex] Estonian numex Al 2015-07-11 04:09:28 -04:00
  • c396393285 [dictionaries] Estonian dictionaries Al 2015-07-11 04:09:17 -04:00
  • 5731ef4c2c [numex] Greek numex Al 2015-07-11 02:57:00 -04:00
  • a2a317192f [dictionaries] Greek dictionaries Al 2015-07-11 02:56:45 -04:00
  • 70639c306b [numex] Ukrainian numex Al 2015-07-11 02:56:30 -04:00
  • f5fd447af9 [dictionaries] Ukrainian dictionaries Al 2015-07-11 02:14:55 -04:00
  • 94a6952869 [numex] Hungarian numex rules Al 2015-07-11 02:14:39 -04:00
  • 04981b1466 [dictionaries] Hungarian dictionaries Al 2015-07-11 02:14:23 -04:00
  • cc46657eee [dictionaries] Additions to Russian dictionaries Al 2015-07-10 20:43:22 -04:00
  • df22000286 [numex] genders for Norwegian rules Al 2015-07-10 20:41:23 -04:00
  • 3b5634f97e [numex] genders for Czech rules Al 2015-07-10 20:41:08 -04:00
  • 7cba19d7be [dictionaries] Finnish dictionaries Al 2015-07-10 17:09:21 -04:00
  • 5cf7a2657d [dictionaries] Swedish dictionaries Al 2015-07-10 17:09:03 -04:00
  • a8d3c9aefd [numex] Korean numex rules Al 2015-07-10 17:07:48 -04:00
  • a8d84b9681 [dictionaries] Czech dictionaries (minimal) Al 2015-07-10 14:55:42 -04:00
  • 394253f1a3 [dictionaries] Galician dictionaries Al 2015-07-10 14:47:36 -04:00
  • cfa3c0f057 [dictionaries] Norwegian dictionaries Al 2015-07-10 14:47:09 -04:00
  • 033116254e [dictionaries] German concatenated suffix abbreviations have to be followed by a period, otherwise can be too ambiguous Al 2015-07-10 14:46:10 -04:00
  • 94766856a7 [dictionaries] Updates to Danish dictionaries Al 2015-07-10 14:45:52 -04:00
  • 23dae9f8d6 [numex] Czech numex rules Al 2015-07-10 14:10:30 -04:00
  • 03690b460f [numex] Norwegian numex rules Al 2015-07-10 14:10:22 -04:00
  • f91306ece5 [dictionaries] Korean dictionaries Al 2015-07-10 14:09:58 -04:00
  • 2ba8b79650 [dictionaries] Basque dictionaries Al 2015-07-10 14:09:44 -04:00
  • 4a077218a4 [dictionaries] Chinese dictionaries Al 2015-07-10 14:09:27 -04:00
  • c9412dff8e [dictionaries] Japanese dictionaries Al 2015-07-10 14:09:17 -04:00
  • ab588fe7fb [dictionaries] Danish dictionaries Al 2015-07-10 14:08:58 -04:00
  • b9736e3070 [dictionaries] Polish dictionaries Al 2015-07-10 14:08:42 -04:00
  • 1f924cea31 [dictionaries] Russian dictionaries Al 2015-07-10 14:08:29 -04:00
  • 9541c8e5dd [dictionaries] no longer assuming that we'll be stripping internal periods in non-acronyms e.g. v.le for viale in Italian Al 2015-07-09 18:47:48 -04:00
  • fbef0a15fe [geodb] Adding sparkey dependency Al 2015-07-09 15:26:11 -04:00
  • 4f1b4756d0 [geodb] Adding builder program (requires 11GB disk space and ~4GB RAM to build, but only ~300MB RAM to use after building) Al 2015-07-09 15:25:29 -04:00
  • 8889a5c0c3 [geodb] GeoDB memory allocation and I/O Al 2015-07-09 15:01:06 -04:00
  • 2d5641892a [config] lower Bloom filter error rate Al 2015-07-09 14:59:23 -04:00
  • ec1e820268 [parsing] Changing to OpenCageData repo Al 2015-07-09 13:44:14 -04:00
  • 20c6436e6d [geodisambig] Return success if admin1/admin2 IDs are 0 Al 2015-07-09 04:19:41 -04:00
  • 20303ad94f [geohash] Adding bounds checks from python-geohash Al 2015-07-09 04:13:53 -04:00
  • 722904ce59 [fix] geoname_clear needs to clear feature code as well Al 2015-07-09 03:08:52 -04:00
  • 14500f8c7e [config] Adding GeoDB default bloom filter size and error rate Al 2015-07-08 20:50:46 -04:00
  • 0e2a0aa56d [geodisambig] adding new methods to header Al 2015-07-08 19:05:08 -04:00
  • ce54a2146b [fix] geo disambiguation features Al 2015-07-08 19:03:39 -04:00
  • fc32a66d95 [fix] geonames I/O Al 2015-07-08 19:02:45 -04:00
  • 8c02073b54 [geonames] Adding country_geonames_id to both geoname and postal code structs Al 2015-07-08 18:44:14 -04:00
  • 9af0b0ab65 [geodisambig] adding a few more features to geonames disambiguation Al 2015-07-08 18:43:28 -04:00
  • e64b6c3398 [geonames] NULL language and official language canonical should have the same sort value Al 2015-07-08 17:03:51 -04:00
  • 742079cc6a [geonames] Re-generating postal/geonames fields headers Al 2015-07-08 17:02:59 -04:00
  • b76f9e47d1 [utils] max string size for int8_t and int16_t Al 2015-07-08 16:46:12 -04:00
  • c0a5607f5e [fix] Adding NUM_BOUNDARY_TYPES for enumeration purposes Al 2015-07-08 16:43:57 -04:00
  • 4a2be72350 [geonames] Adding language priorities for sorting (official language names, canonical names, abbreviations, historical) Al 2015-07-08 16:42:42 -04:00
  • 95a6845a85 [i18n] Adding regional languages as valid country languages Al 2015-07-08 14:54:00 -04:00
  • 400c23cb5a [fix] tabs Al 2015-07-08 14:53:16 -04:00
  • ef1ecb97f7 [geonames] Adding geonames_id for countries in places/postal codes. For postal codes, sorting desc by country population (10013 is a postal code in Italy but will default to US with no other information) Al 2015-07-08 13:25:03 -04:00
  • 6cc677ac0b [geonames] Adding defaults to schema and another index on country code Al 2015-07-08 13:16:01 -04:00
  • 24835fd088 [geonames] namespace specificity Al 2015-07-07 03:38:48 -04:00
  • af1a5f6213 [trie] trie_set_data_node method Al 2015-07-07 03:38:17 -04:00
  • 53908ac604 [config] Adding geonames dir as a separate #define Al 2015-07-06 17:09:02 -04:00
  • c4fd48e7f7 [config] geodb dir Al 2015-07-06 16:55:11 -04:00
  • e7a3987656 [geodisambig] renaming module Al 2015-07-06 16:53:53 -04:00
  • d7f73e62f1 [utils] Adding cstring_array_clear method Al 2015-07-06 12:48:26 -04:00
  • 0df816fd31 [geodisambig] Helper methods to add features for a given geoname/postal_code Al 2015-07-06 12:41:05 -04:00
  • 0c5e741bb6 [geonames] Adding LC_ALL environment variable for utf8 sorting Al 2015-07-06 00:39:23 -04:00
  • 6ff91fef6b [normalization] adding a normalize_string_latin method Al 2015-07-05 23:38:01 -04:00
  • acd5d07d17 [geonames] Storing NFD normalized names and sorting case-insensitive in order to group everything with the same normalized name together Al 2015-07-05 15:56:46 -04:00
  • a08d59c277 [fix] NFD normalization should be the default in normalize.c, not NFKD, as NFKD does some unwanted things like converting superscripts and the Latin-ASCII transliterator does a better, more thorough job while staying faithful to the original string Al 2015-07-05 15:28:07 -04:00
  • 47ed2e58fd [geodisambig] feature functions for GeoNames disambiguation Al 2015-07-04 10:35:56 -04:00
  • 20a8b9611d [fix] Removing feature length variables from geonames.c Al 2015-07-04 10:33:08 -04:00
  • 3f07cc6c71 [geohash] Modified geohash implementation (based on python-geohash) with no mallocs Al 2015-07-04 00:30:35 -04:00
  • f825dcb939 [geonames] Fixing admin table DDL Al 2015-07-03 05:54:41 -04:00
  • 4fd4fa7dca [fix] moving int string size constants to string_utils.h Al 2015-07-02 17:50:09 -04:00
  • 055e6d8905 [fix] typo in constant Al 2015-07-02 16:12:24 -04:00
  • e273caac22 [geonames] generated postal code TSV fields Al 2015-07-02 16:00:06 -04:00
  • fd28ee27bf [geonames] generated geonames TSV fields Al 2015-07-02 15:59:54 -04:00
  • 86b23ecca3 [fix] field name Al 2015-07-02 15:59:11 -04:00
  • 6cfbab9969 [normalization] string normalization module for tokens and full strings Al 2015-07-01 14:52:28 -04:00
  • 46e51ae91e [transliterate] no need to strdup transliterator names if they are lowercased, breaking on NUL byte Al 2015-07-01 14:51:22 -04:00
  • b58877ec6c [utils] string_is_lower/string_is_upper method Al 2015-07-01 14:49:17 -04:00
  • 58c6ff104a [fix] Russian feminine ordinals Al 2015-07-01 13:57:37 -04:00
  • d0db015667 [geodisambig] Adding new fields to geonames struct, plus I/O Al 2015-07-01 13:02:00 -04:00
  • af56c3cd09 [config] constants Al 2015-07-01 13:01:22 -04:00
  • fa643f7a3a [utf8] Moving language length constant Al 2015-06-30 19:17:20 -04:00
  • 071d6bb392 [geodisambig] Adding presence of a Wikipedia link to the GeoNames output (an unqualified entry for the name in Wikipedia usually indicates a primary meaning). Ranking ambiguous entries for each term so that the top entry should be selected if no further information is available Al 2015-06-30 18:00:07 -04:00
  • 8d64c9301e [transliteration] Re-generating transliteration data file Al 2015-06-29 15:03:56 -04:00
  • a580ed0b1b [transliteration] Adding numeric HTML escapes e.g. '&' Al 2015-06-29 15:02:34 -04:00
  • 3279b31b09 [tokenization] Adding an acronym token type for things like U.N. so we can delete internal periods on those tokens Al 2015-06-29 03:00:46 -04:00
  • 47efce4b7e [transliteration] Stopping set check loop on empty transition Al 2015-06-28 20:46:23 -04:00
  • cc0401a8d1 [utf8] Adding a boolean struct member for string_script_t return values, set to true if the string is ASCII (no transliteration needed, should be frequent for English addresses) Al 2015-06-28 19:37:53 -04:00
  • f0bf7e750c [transliteration] Fixing edge case in transliteration where a naked character fails context matching but the set-wrapped version matches Al 2015-06-28 15:19:19 -04:00