tommy/libpostal
3d3aacae67f820299ea157c601c847cf33b49b28
libpostal/scripts/geodata/addresses
Al
317d3aa9ed
[addresses] PO Box phrase generator
2016-07-21 17:04:57 -04:00
..
__init__.py
[addresses] address config class for general sampling of forms specified in the address configs (default/alternatives to choose a phrase, canonical/abbreviated/sample to choose an abbreviation or surface form for that phrase)
2016-07-21 17:04:57 -04:00
config.py
[addresses] address config class for general sampling of forms specified in the address configs (default/alternatives to choose a phrase, canonical/abbreviated/sample to choose an abbreviation or surface form for that phrase)
2016-07-21 17:04:57 -04:00
conjunctions.py
[addresses] conjunction class for building phrases like "5th and 6th" or "Units 1 & 2" across languages using the address configs
2016-07-21 17:04:57 -04:00
numbering.py
[addresses] base class for numbered components (floors, units, house numbers in some languages/countries). Can generate many variants of a number (e.g. Floor 2, 2nd Floor, Floor #2, Floor No. 2, etc.)
2016-07-21 17:04:57 -04:00
po_box.py
[addresses] PO Box phrase generator
2016-07-21 17:04:57 -04:00
sampling.py
[addresses] utilities for sampling from an arbitrary discrete distribution, building cumulative distributions, and sampling from a Zipfian distribution which seems to be a reasonable way of generating plausible apartment/floor numbers when the height/number of units is unknown. Picking a letter uniformly at random means P('Unit A') == P('Unit Z') when 'A' should be much more likely. Sampling from a Zipfian gets the desired effect in situations where address components are numbered by "counting from 0/1/A" while still allowing for a long tail
2016-07-21 17:04:57 -04:00
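
The sampling.py commit message describes building a cumulative distribution, sampling from an arbitrary discrete distribution, and using Zipfian weights so that 'Unit A' comes up far more often than 'Unit Z'. The following is a minimal sketch of that idea, not the code in sampling.py: the names `cdf`, `zipfian_weights`, and `weighted_choice` are hypothetical, and the exponent `s=1.0` and the 26-letter alphabet are assumptions made for illustration.

```python
import bisect
import random
from itertools import accumulate


def cdf(weights):
    """Normalize non-negative weights and return their cumulative sums."""
    total = float(sum(weights))
    cumulative = list(accumulate(w / total for w in weights))
    cumulative[-1] = 1.0  # guard against floating-point drift in the last bin
    return cumulative


def zipfian_weights(n, s=1.0):
    """Unnormalized Zipfian weights: rank r gets weight 1 / r**s (s is assumed)."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def weighted_choice(values, cumulative, rng=random):
    """Draw one value by locating a uniform draw in the cumulative distribution."""
    return values[bisect.bisect_right(cumulative, rng.random())]


# Hypothetical usage: unit letters where 'A' is the most likely and 'Z' the least.
letters = [chr(ord('A') + i) for i in range(26)]
unit_cdf = cdf(zipfian_weights(len(letters)))
rng = random.Random(0)  # seeded only to make the illustration repeatable
units = ['Unit ' + weighted_choice(letters, unit_cdf, rng) for _ in range(5)]
print(units)
```

With these weights the first letter takes roughly a quarter of the probability mass, while later letters stay possible with steadily smaller probability, which is the long tail the commit message refers to.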