tommy/libpostal
5,270 Commits, 2 Branches, 0 Tags
fd2b864b7d2d8bc126d8d2ab0cbab0e3efa459a6
Commit Graph
2 Commits
Author
SHA1
Message
Date
Al
6dff154a99
[api] adding APIs for getting default options and using a consistent naming convention
2017-12-29 17:48:54 -05:00
Al
098babfdee
[dedupe] adding the core pairwise deduping module which ties together most of the work on this branch. Includes simple phrase-aware exact deduping methods, with per-component variations as to whether e.g. a root expansion match counts as an exact duplicate or not (in a secondary unit, "No. 2" and "Apt 2" can be considered an exact match in English whereas we wouldn't want to make that kind of assumption for street e.g. "Park Ave" and "Park Pl"). The API is fairly low-level at present, and may require a few calls. Notably, we leave the TFIDF scores or other weighting schemes to the client. Since each component gets its own dupe classification, it leaves the door open for doing more specific checks around e.g. compound house numbers/ranges in the future.
2017-12-29 04:48:00 -05:00
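The per-component rule described in the dedupe commit message can be illustrated with a small sketch. This is not libpostal's API or implementation: the phrase dictionaries, the `ROOT_MATCH_IS_EXACT` policy table, and every function name below are hypothetical, chosen only to show why "No. 2" and "Apt 2" might count as an exact duplicate for a secondary unit while "Park Ave" and "Park Pl" would not for a street.

```python
from enum import Enum


class DupeStatus(Enum):
    NON_DUPLICATE = 0
    EXACT_DUPLICATE = 1


# Hypothetical toy phrase dictionaries: abbreviation -> canonical phrase.
PHRASES = {
    "unit": {"no.": "number", "no": "number", "apt": "apartment", "apt.": "apartment"},
    "street": {"ave": "avenue", "ave.": "avenue", "pl": "place", "pl.": "place"},
}

# Hypothetical per-component policy: does a match on the root tokens alone
# (with known phrases stripped) count as an exact duplicate?
ROOT_MATCH_IS_EXACT = {"unit": True, "street": False}


def canonical_tokens(value, phrases):
    """Lowercase, split on whitespace, and expand known abbreviations."""
    return [phrases.get(tok, tok) for tok in value.lower().split()]


def root_tokens(value, phrases):
    """Keep only tokens that are not known phrases (in either form)."""
    known = set(phrases) | set(phrases.values())
    return [tok for tok in value.lower().split() if tok not in known]


def is_duplicate(component, a, b):
    """Classify two values of one address component as exact dupes or not."""
    phrases = PHRASES[component]
    # Identical after expanding abbreviations: always an exact duplicate.
    if canonical_tokens(a, phrases) == canonical_tokens(b, phrases):
        return DupeStatus.EXACT_DUPLICATE
    # Otherwise a root-only match counts only where the component allows it.
    if ROOT_MATCH_IS_EXACT[component]:
        root_a, root_b = root_tokens(a, phrases), root_tokens(b, phrases)
        if root_a and root_a == root_b:
            return DupeStatus.EXACT_DUPLICATE
    return DupeStatus.NON_DUPLICATE


if __name__ == "__main__":
    print(is_duplicate("unit", "No. 2", "Apt 2"))
    print(is_duplicate("street", "Park Ave", "Park Pl"))
    print(is_duplicate("street", "Park Ave", "Park Avenue"))
```

Because each component is classified separately, a caller could combine the per-component results with its own weighting (for example TFIDF scores), which the commit message leaves to the client.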