[tokenization/trie] Simplify the URL regex to reduce the scanner file size, account for a few more variations in word tokens, and make trie suffix search iterate over the string instead of malloc'ing a new one
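
The suffix-search change in the message can be illustrated with a minimal sketch. It assumes suffixes are stored in the trie in reversed byte order, so a lookup can walk the input from its last byte backward rather than allocating a reversed copy. The node layout and the names `trie_node_t`, `trie_add_suffix`, `trie_longest_suffix`, and `trie_free` are hypothetical and are not taken from this repository.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte-indexed trie node; not the project's actual layout. */
typedef struct trie_node {
    struct trie_node *children[256];
    bool is_terminal;
} trie_node_t;

static trie_node_t *trie_node_new(void) {
    /* calloc zeroes the child pointers and the terminal flag. */
    return calloc(1, sizeof(trie_node_t));
}

/* Store a suffix by walking its bytes from last to first. */
static bool trie_add_suffix(trie_node_t *root, const char *suffix) {
    assert(root != NULL && suffix != NULL);
    trie_node_t *node = root;
    for (size_t i = strlen(suffix); i > 0; i--) {
        unsigned char c = (unsigned char)suffix[i - 1];
        if (node->children[c] == NULL) {
            node->children[c] = trie_node_new();
            if (node->children[c] == NULL) return false;
        }
        node = node->children[c];
    }
    node->is_terminal = true;
    return true;
}

/* Return the length of the longest stored suffix that ends `word`,
 * or 0 if none matches. The word is read backward in place, so no
 * reversed copy is allocated. */
static size_t trie_longest_suffix(const trie_node_t *root, const char *word) {
    assert(root != NULL && word != NULL);
    const trie_node_t *node = root;
    size_t len = strlen(word);
    size_t best = 0;
    for (size_t i = len; i > 0; i--) {
        node = node->children[(unsigned char)word[i - 1]];
        if (node == NULL) break;
        if (node->is_terminal) best = len - i + 1;
    }
    return best;
}

/* Recursively release a trie built with trie_node_new. */
static void trie_free(trie_node_t *node) {
    if (node == NULL) return;
    for (size_t c = 0; c < 256; c++) trie_free(node->children[c]);
    free(node);
}
```

With "ing" stored, calling `trie_longest_suffix(root, "walking")` visits `g`, `n`, `i` and then stops at `k`, touching only the bytes it needs and performing no allocation during the lookup.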

Al
2015-04-05 16:30:27 -04:00
parent 5f3d74de18
commit 79fd7a8ded
4 changed files with 155806 additions and 224455 deletions

380119
src/scanner.c

File diff suppressed because it is too large.