tommy / libpostal
4,844 Commits | 2 Branches | 0 Tags
1f1dbe25e1621aff3283cf3a13370af4a47d5cd4
Commit Graph
3 Commits
Author
SHA1
Message
Date
Al
1f1dbe25e1
[test] adding a number of user-contributed test cases from Moz in #21. Almost all are working under the CRF parser trained on 10% of the data. There are a few problematic ones in the UK still that have been omitted here. We currently don't correctly format the training data for the locality + postal town pattern: both are considered "city" by libpostal, and thus one will usually get lumped in with the road or something like that. There may also be some utility in modelling comma usage (training data has commas, but they're ignored by the parser both at train and run time; it might be useful to train on them but drop them out randomly so the parser doesn't become too dependent on having them).
2017-03-21 03:08:09 -04:00
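The comma-dropout idea in the commit message above could look roughly like the following. This is only a sketch under stated assumptions: `drop_commas` is a hypothetical helper that does not exist in libpostal, the token array stands in for whatever the training pipeline actually produces, and the caller is assumed to seed the C library RNG with `srand`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libpostal): copy tokens from `in` to
 * `out`, skipping each "," token with probability `drop_prob`.
 * `out` must have room for `n` pointers. Returns the number of tokens kept.
 * Uses rand(), so the caller controls reproducibility via srand(). */
size_t drop_commas(const char **in, size_t n, const char **out, double drop_prob)
{
    assert(drop_prob >= 0.0 && drop_prob <= 1.0);

    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(in[i], ",") == 0) {
            /* u is uniform in [0, 1), so drop_prob == 1.0 always drops
             * and drop_prob == 0.0 never drops. */
            double u = (double)rand() / ((double)RAND_MAX + 1.0);
            if (u < drop_prob) {
                continue;
            }
        }
        out[kept++] = in[i];
    }
    return kept;
}
```

Applied per training example with a moderate `drop_prob`, the parser would see the same address both with and without commas, which is the dependence-reducing effect the message describes. The right probability is an open question that the commit does not settle.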
Al
b8a12e0517
[test] adding parser test cases in 22 countries. These may change, and I'm generally against putting every obscure test case in the world in here. It's better to measure accuracy in aggregate statistics instead of individual test cases (i.e. if a particular change to the parser improves overall performance but fails one test case, should we accept the improvement?) The thought here is: these represent parses that are used in documentation/examples, as well as most of those that have been brought up in GitHub issues from the initial release, and we want these specific tests to work from build to build. If a model fails one of these test cases, it shouldn't get pushed to our users.
2017-03-20 00:58:52 -04:00
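The release policy described in the commit message above (aggregate accuracy as the main measure, plus a small set of cases that must never fail) can be sketched as a gate function. Everything here is illustrative: `eval_summary_t`, `token_accuracy`, and `should_release` are hypothetical names, and the "accuracy must not regress against a baseline" rule is an assumption added for the example rather than something the commit states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical evaluation summary; field names are illustrative only. */
typedef struct {
    size_t correct_tokens;   /* tokens labelled correctly on held-out data */
    size_t total_tokens;     /* all tokens in the held-out data */
    size_t required_failed;  /* failures among the must-pass test cases */
} eval_summary_t;

/* Aggregate token-level accuracy; defined as 0.0 for an empty evaluation. */
double token_accuracy(const eval_summary_t *s)
{
    assert(s != NULL);
    if (s->total_tokens == 0) {
        return 0.0;
    }
    return (double)s->correct_tokens / (double)s->total_tokens;
}

/* A candidate model is released only if it fails none of the required
 * cases and (assumed rule) its aggregate accuracy is at least the baseline's. */
bool should_release(const eval_summary_t *candidate, const eval_summary_t *baseline)
{
    if (candidate->required_failed > 0) {
        return false;
    }
    return token_accuracy(candidate) >= token_accuracy(baseline);
}
```

Under this sketch, a model that improves overall accuracy but breaks one documented example is still blocked, which matches the closing sentence of the commit message.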
Al
37cfe8ab3b
[test] Adding automated parser tests to the C library
2016-02-17 17:19:10 -05:00