tommy/libpostal
5,184 Commits, 2 Branches, 0 Tags
8a917d8594eccc90d773aa180593f23edcdd538a
Commit Graph
3 Commits
Author
SHA1
Message
Date
Al
4e32565746
[dedupe] fixing toponym matching for city-equivalents, adding the LIBPOSTAL_ADDRESS_ANY component in each function call so it can be removed as needed.
2017-12-30 18:05:46 -05:00
Al
6dff154a99
[api] adding APIs for getting default options and using a consistent naming convention
2017-12-29 17:48:54 -05:00
Al
098babfdee
[dedupe] adding the core pairwise deduping module which ties together most of the work on this branch. Includes simple phrase-aware exact deduping methods, with per-component variations as to whether e.g. a root expansion match counts as an exact duplicate or not (in a secondary unit, "No. 2" and "Apt 2" can be considered an exact match in English whereas we wouldn't want to make that kind of assumption for street e.g. "Park Ave" and "Park Pl"). The API is fairly low-level at present, and may require a few calls. Notably, we leave the TFIDF scores or other weighting schemes to the client. Since each component gets its own dupe classification, it leaves the door open for doing more specific checks around e.g. compound house numbers/ranges in the future.
2017-12-29 04:48:00 -05:00
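The per-component policy described in commit 098babfdee can be illustrated with a small toy sketch. It is not libpostal's implementation or API: the phrase lists, the `classify` function, and the `DupeStatus` names below are all hypothetical, and it skips the dictionary-based expansion (e.g. "Ave" vs. "Avenue") that real phrase-aware matching would need. It only shows why a root match can count as an exact duplicate for a secondary unit but not for a street.

```python
from enum import Enum


class DupeStatus(Enum):
    # Hypothetical status names for this sketch, not libpostal's enum.
    NON_DUPLICATE = 0
    EXACT_DUPLICATE = 1


# Hypothetical per-component phrase lists (lowercase, punctuation stripped).
PHRASES = {
    "unit": {"no", "apt", "apartment", "unit"},
    "street": {"ave", "avenue", "pl", "place", "st", "street"},
}

# Per-component policy: does a match on the root (phrases removed)
# count as an exact duplicate?
ROOT_MATCH_IS_EXACT = {"unit": True, "street": False}


def tokens(value):
    """Lowercase whitespace tokens with surrounding '.', ',' and '#' removed."""
    stripped = (t.strip(".,#").lower() for t in value.split())
    return [t for t in stripped if t]


def root(value, component):
    """Tokens that remain after dropping the component's known phrases."""
    return [t for t in tokens(value) if t not in PHRASES[component]]


def classify(a, b, component):
    """Classify one component of a pair of addresses."""
    if tokens(a) == tokens(b):
        return DupeStatus.EXACT_DUPLICATE
    root_a, root_b = root(a, component), root(b, component)
    # A root-only match is trusted only where the policy allows it.
    if ROOT_MATCH_IS_EXACT[component] and root_a and root_a == root_b:
        return DupeStatus.EXACT_DUPLICATE
    return DupeStatus.NON_DUPLICATE


print(classify("No. 2", "Apt 2", "unit"))          # root "2" matches, unit policy allows it
print(classify("Park Ave", "Park Pl", "street"))   # root "park" matches, street policy rejects it
```

Because each component is classified separately, a caller could combine the per-component results with its own weighting scheme, which is the division of labor the commit message describes.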