minor typo
Peter Johnson
2018-08-12 19:22:21 +02:00
committed by GitHub
parent eae01d3e60
commit 1681321452

@@ -494,7 +494,7 @@ language (IX => 9) which occur in the names of many monarchs, popes, etc.
- **Fast, accurate tokenization/lexing**: clocked at > 1M tokens / sec,
implements the TR-29 spec for UTF8 word segmentation, tokenizes East Asian
- languages chracter by character instead of on whitespace.
+ languages character by character instead of on whitespace.
- **UTF8 normalization**: optionally decompose UTF8 to NFD normalization form,
strips accent marks e.g. à => a and/or applies Latin-ASCII transliteration.
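
For readers unfamiliar with the normalization step described in the README text above, here is a minimal sketch of NFD decomposition followed by accent stripping. It uses Python's standard `unicodedata` module and is not the library's own implementation; the helper name `strip_accents` is hypothetical and chosen only for illustration.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Illustrative helper (hypothetical name): NFD-decompose, then drop accents."""
    # NFD splits a precomposed character such as "à" into the base letter "a"
    # followed by a combining grave accent (U+0300).
    decomposed = unicodedata.normalize("NFD", text)
    # Combining accents have Unicode general category "Mn" (nonspacing mark);
    # filtering them out leaves only the base characters.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_accents("à"))  # prints "a"
```

This covers only the accent-stripping case (à => a); Latin-ASCII transliteration needs additional mapping rules and is not shown here.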