libpostal

Author	SHA1	Message	Date
Al	46141a6c36	[normalize] Adding an option when normalizing tokens to split tokens of the form [\w]+[\.\-]?[\d]+ for cases like I35, CR123, R-66, RN.7, etc. where the alpha component is an expansion	2015-08-02 14:34:36 -06:00
Al	551904d202	[normalize] cstring_array instead of string_tree for token-based normalization	2015-07-28 19:09:50 -04:00
Al	053b987d58	[normalize] adding an option for string trimming in normalize	2015-07-27 01:59:14 -04:00
Al	a38b924c5d	[fix] add_token_alternatives	2015-07-21 17:26:59 -04:00
Al	6ff91fef6b	[normalization] adding a normalize_string_latin method	2015-07-05 23:38:01 -04:00
Al	a08d59c277	[fix] NFD normalization should be the default in normalize.c, not NFKD, as NFKD does some unwanted things like converting superscripts and the Latin-ASCII transliterator does a better, more thorough job while staying faithful to the original string	2015-07-05 15:28:07 -04:00
Al	6cfbab9969	[normalization] string normalization module for tokens and full strings	2015-07-01 14:52:28 -04:00
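
Two of the messages above describe behaviour that is easier to see in a small example: the alpha/numeric token split from 46141a6c36 and the NFD-versus-NFKD choice from a08d59c277. The sketch below is illustrative Python, not libpostal's C code. The names `ALPHA_NUMERIC`, `split_alpha_numeric`, and `decompose` are hypothetical. The commit writes the pattern as `[\w]+[\.\-]?[\d]+`; since `\w` also matches digits, the sketch narrows the first group to ASCII letters, and it assumes the separator is dropped from the output. Neither choice is confirmed by the log.

```python
import re
import unicodedata

# Hypothetical pattern: letters, an optional "." or "-", then digits.
# Assumption: the alpha part is ASCII letters only and the separator is discarded.
ALPHA_NUMERIC = re.compile(r"([A-Za-z]+)[.\-]?(\d+)")


def split_alpha_numeric(token):
    """Return [alpha, digits] when the whole token matches, else [token]."""
    match = ALPHA_NUMERIC.fullmatch(token)
    if match is None:
        return [token]
    return [match.group(1), match.group(2)]


def decompose(text, form):
    """Apply a Unicode normalization form such as "NFD" or "NFKD"."""
    return unicodedata.normalize(form, text)


if __name__ == "__main__":
    # The examples named in the commit message.
    for token in ["I35", "CR123", "R-66", "RN.7", "Main"]:
        print(token, split_alpha_numeric(token))

    # NFD keeps the superscript two (U+00B2); NFKD folds it to a plain "2".
    print(decompose("m\u00b2", "NFD") == "m\u00b2")
    print(decompose("m\u00b2", "NFKD") == "m2")
```

Splitting lets the alpha component ("CR", "RN") be looked up as an expansion independently of the number. The second half shows the kind of compatibility mapping the a08d59c277 message calls unwanted: canonical decomposition (NFD) leaves the superscript alone, while compatibility decomposition (NFKD) rewrites it.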