tommy/libpostal
5,190 Commits 2 Branches 0 Tags
999de2bf6a634fd2cb71437c4c18e0fdb65bfa40
Commit Graph

3 Commits

Author SHA1 Message Date
Al 434bbd4dc2 [fix] removing unused vars 2017-12-30 02:31:43 -05:00
Al f1e6886536 [similarity/dedupe] adding options for acronym alignments and address phrase matches in Soft-TFIDF. Acronym alignments will give higher similarity to NYU vs. "New York University" whereas phrase matches would match known phrases that share the same canonical like "Cty Rd" vs. "C.R." vs. "County Road" within the Soft-TFIDF similarity calculation. 2017-12-29 02:39:49 -05:00
Al b90c3dab4b [similarity/dedupe] adding Soft-TFIDF implementation with several different fallback qualifiers for the max-sim function (Damerau-Levenshtein and libpostal's new bucketed affine gap method for detecting abbreviations), but keeping Jaro-Winkler as the secondary similarity function in the final distance metric. Overall this should result in higher similarity values when one of the tokens may not quite match the pure secondary threshold in terms of Jaro-Winkler but may match on one of the other criteria. 2017-12-28 04:34:46 -05:00
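
For readers unfamiliar with the metric named in these commit messages, the following is a minimal Python sketch of generic Soft-TFIDF with Jaro-Winkler as the secondary similarity. It is not libpostal's C implementation. The function names (`jaro`, `jaro_winkler`, `normalized_weights`, `soft_tfidf`), the `tf * idf` weighting, the default IDF of 1.0, and the 0.9 threshold are all assumptions chosen for illustration. The fallback qualifiers mentioned above (Damerau-Levenshtein, the bucketed affine gap method, acronym alignments, phrase matches) are deliberately left out.

```python
import math
from collections import Counter


def jaro(s, t):
    """Plain Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo = max(0, i - window)
        hi = min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as half a transposition each.
    s_chars = [c for c, hit in zip(s, s_hit) if hit]
    t_chars = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_chars, t_chars)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s, t, prefix_scale=0.1):
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


def normalized_weights(tokens, idf):
    """L2-normalized tf * idf weights; unknown tokens get an assumed IDF of 1.0."""
    raw = {tok: tf * idf.get(tok, 1.0) for tok, tf in Counter(tokens).items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {tok: w / norm for tok, w in raw.items()} if norm else {}


def soft_tfidf(tokens_a, tokens_b, idf, threshold=0.9):
    """Sum weight products over tokens whose best secondary match clears the threshold."""
    wa = normalized_weights(tokens_a, idf)
    wb = normalized_weights(tokens_b, idf)
    score = 0.0
    for w, weight_w in wa.items():
        best_v, best_sim = max(
            ((v, jaro_winkler(w, v)) for v in wb),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        if best_sim >= threshold:
            score += weight_w * wb[best_v] * best_sim
    return score
```

In this sketch, `soft_tfidf(["main", "street"], ["main", "stret"], {})` scores well above what an exact-token TF-IDF cosine would give, because "street" and "stret" clear the Jaro-Winkler threshold. A token pair that falls below the threshold contributes nothing, which is the gap the fallback qualifiers described in commit b90c3dab4b are meant to cover.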