tommy/libpostal
7 Commits 2 Branches 0 Tags
585baab0a55cc2d4196e9730053a00161b528349
Author SHA1 Message Date
Al
585baab0a5 [phrases] optimized implementation of a double-array trie for storing millions of phrases compactly while being extremely quick to access. Supports utf-8, stores phrase tails in a contiguous character array separated by NUL bytes and stores offsets only so the chars at that offset can be treated as a regular C string and fed to things like strncmp. Also stores suffixes (primarily for languages like German, Dutch, etc. that concatenate street names e.g. Foobarstraße, Fobarweg) by prefixing the reversed string with the NUL byte and storing it backward in the trie, so can search forward and backward with the same data structure. 2015-03-03 13:18:18 -05:00
Al
3ed5795cff [fix] fixing some formatting 2015-03-03 12:54:27 -05:00
Al
087328c321 [utils] logging 2015-03-03 12:38:10 -05:00
Al
09552906d3 [utils] util headers 2015-03-03 12:37:32 -05:00
Al
0689f936c9 [tokenization] scanner/tokenizer (generated with re2c) 2015-03-03 12:35:22 -05:00
Al
5216aba1b6 [utils] string utils, file utils, contiguous arrays of strings used for storing tokenized strings, klib for generic hashtables and vectors, antirez's sds for certain types of string building, utf8proc for iterating over utf-8 strings and unicode normalization 2015-03-03 12:33:13 -05:00
Al Barrentine
27269e18ca Initial commit 2015-03-02 19:21:31 -05:00
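
The storage scheme described in commit 585baab0a5 can be illustrated with a rough sketch. This is not the repository's code: tail_array, tails_add, tail_matches, make_suffix_key and the fixed TAILS_CAP capacity are hypothetical names chosen for illustration, and the double-array trie itself is left out. The sketch only shows tails packed into one NUL-separated character array and addressed by offset, and a suffix key built as a NUL byte followed by the reversed string. Reversing by UTF-8 code point (rather than by byte) is an assumption; the commit message does not say which is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAILS_CAP 256 /* hypothetical fixed capacity, for the sketch only */

/* Hypothetical container: phrase tails packed back to back,
 * each terminated by a NUL byte. */
typedef struct {
    char data[TAILS_CAP];
    uint32_t len; /* bytes used so far */
} tail_array;

/* Append a tail and return its offset. The bytes at that offset
 * form an ordinary C string, so only the offset needs storing. */
static uint32_t tails_add(tail_array *t, const char *tail) {
    size_t n = strlen(tail);
    assert(t->len + n + 1 <= TAILS_CAP);
    uint32_t offset = t->len;
    memcpy(t->data + offset, tail, n + 1); /* copies the NUL too */
    t->len += (uint32_t)(n + 1);
    return offset;
}

/* Check whether the remaining input equals the tail at `offset`. */
static int tail_matches(const tail_array *t, uint32_t offset,
                        const char *rest, size_t rest_len) {
    const char *tail = t->data + offset;
    return strlen(tail) == rest_len && strncmp(tail, rest, rest_len) == 0;
}

/* Build a suffix key: one NUL byte, then the suffix with its UTF-8
 * code points in reverse order (assumes valid UTF-8 input).
 * Returns the key length in bytes; the key has an embedded NUL,
 * so callers must carry the length explicitly. */
static size_t make_suffix_key(const char *suffix, char *out, size_t out_cap) {
    size_t n = strlen(suffix);
    assert(n + 1 <= out_cap);
    size_t w = 0;
    out[w++] = '\0';
    size_t end = n;
    while (end > 0) {
        size_t start = end - 1;
        /* Step back over continuation bytes (10xxxxxx). */
        while (start > 0 && ((unsigned char)suffix[start] & 0xC0) == 0x80) {
            start--;
        }
        memcpy(out + w, suffix + start, end - start);
        w += end - start;
        end = start;
    }
    return w;
}
```

Because the suffix key starts with a NUL byte, it cannot collide with a forward phrase key, which is presumably why the same structure can be searched in both directions.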