libpostal

Author	SHA1	Message	Date
Al	80ee34cc3a	[text] adding normalization with whitespace	2016-12-10 17:50:53 -05:00
Al	c0a468d7e8	[normalization] adding a normalize_token function and some token options for deleting periods	2016-12-09 17:46:26 -05:00
Al	dfa5c8e0a6	[abbreviations] Adding ability to abbreviate within hyphenated phrases e.g. Sint-Maarten => St.-Maarten	2016-08-24 18:50:24 -04:00
Al	97a2436ad7	[tokenization] Adding two more sets to token_types for punctuation and non-alphanumerics	2016-08-02 16:24:01 -04:00
Al	75d9c31395	[text] Adding NORMALIZE_STRING_COMPOSE constant in pynormalize.c	2016-07-24 03:37:43 -04:00
Al	7b3f4e9175	[text] Adding utils.py for is_numeric/is_numeric_strict	2016-07-24 03:37:11 -04:00
Al	b9ee3be806	[phrases] Using simple string encoding/decoding for default serialize/deserialize in PhraseFilter base class	2016-07-21 17:04:57 -04:00
Al	771a360a85	[phrases] Using safe_encode/safe_decode as default trie serializer/deserializer	2016-07-21 17:04:57 -04:00
Al	4a2d266230	[phrases] adding __init__ to base PhraseFilter	2016-07-21 17:04:57 -04:00
Al	ee1aa564c4	[normalization] normalize tokens should not replace digits by default	2016-07-21 17:04:57 -04:00
Al	1fd4fbb7a2	[normalization] Adding default token options for numbers so we split alpha from numeric tokens and don't normalize digits	2016-07-21 17:04:57 -04:00
Al	d5dc34ec1d	[gazetteers] moving PHRASE to a token type	2016-07-21 17:04:57 -04:00
Al	2e15db06dd	[text] making normalize_string directly callable from Python geodata	2016-01-21 02:07:46 -05:00
Al	fa32eacdd1	[phrases] Adding Python phrase filter from address_normalizer until a Python wrapper around libpostal's trie_search is available	2016-01-17 15:45:02 -05:00
Al	58e53cab1c	[scripts] Adding the tokenize/normalize wrappers directly into the internal geodata package so pypostal can be maintained in an independent repo	2016-01-12 13:29:31 -05:00

15 Commits