Al | 80ee34cc3a | [text] adding normalization with whitespace | 2016-12-10 17:50:53 -05:00 |
Al | c0a468d7e8 | [normalization] adding a normalize_token function and some token options for deleting periods | 2016-12-09 17:46:26 -05:00 |
Al | dfa5c8e0a6 | [abbreviations] Adding ability to abbreviate within hyphenated phrases e.g. Sint-Maarten => St.-Maarten | 2016-08-24 18:50:24 -04:00 |
Al | 97a2436ad7 | [tokenization] Adding two more sets to token_types for punctuation and non-alphanumerics | 2016-08-02 16:24:01 -04:00 |
Al | 75d9c31395 | [text] Adding NORMALIZE_STRING_COMPOSE constant in pynormalize.c | 2016-07-24 03:37:43 -04:00 |
Al | 7b3f4e9175 | [text] Adding utils.py for is_numeric/is_numeric_strict | 2016-07-24 03:37:11 -04:00 |
Al | b9ee3be806 | [phrases] Using simple string encoding/decoding for default serialize/deserialize in PhraseFilter base class | 2016-07-21 17:04:57 -04:00 |
Al | 771a360a85 | [phrases] Using safe_encode/safe_decode as default trie serializer/deserializer | 2016-07-21 17:04:57 -04:00 |
Al | 4a2d266230 | [phrases] adding __init__ to base PhraseFilter | 2016-07-21 17:04:57 -04:00 |
Al | ee1aa564c4 | [normalization] normalize tokens should not replace digits by default | 2016-07-21 17:04:57 -04:00 |
Al | 1fd4fbb7a2 | [normalization] Adding default token options for numbers so we split alpha from numeric tokens and don't normalize digits | 2016-07-21 17:04:57 -04:00 |
Al | d5dc34ec1d | [gazetteers] moving PHRASE to a token type | 2016-07-21 17:04:57 -04:00 |
Al | 2e15db06dd | [text] making normalize_string directly callable from Python geodata | 2016-01-21 02:07:46 -05:00 |
Al | fa32eacdd1 | [phrases] Adding Python phrase filter from address_normalizer until a Python wrapper around libpostal's trie_search is available | 2016-01-17 15:45:02 -05:00 |
Al | 58e53cab1c | [scripts] Adding the tokenize/normalize wrappers directly into the internal geodata package so pypostal can be maintained in an independent repo | 2016-01-12 13:29:31 -05:00 |