Commit Graph

44 Commits

Author SHA1 Message Date
Al
06541f5911 [osm] Adding country_region tag to address formatter 2016-07-21 23:38:37 -04:00
Al
7d5d54bd29 [formatting] Territories use parent country's template insertion probabilities 2016-07-21 17:04:57 -04:00
Al
5075128ada [intersections] Adding places to intersection template, intersection phrase generator 2016-07-21 17:04:57 -04:00
Al
d88be7ef5d [fix] use simple language code if language_script cannot be found 2016-07-21 17:04:57 -04:00
Al
90467e9098 [fix] global formatter config 2016-07-21 17:04:57 -04:00
Al
16a91528d6 [fix] config key name 2016-07-21 17:04:57 -04:00
Al
308080f6ee [formatting] Moving language country overrides to formatter config so actual language is retained 2016-07-21 17:04:57 -04:00
Al
8c44a5d312 [fix] check for None in chain store query formatting 2016-07-21 17:04:57 -04:00
Al
448f010b22 [fix] Removing template types from AddressFormatter, just drop components as needed 2016-07-21 17:04:57 -04:00
Al
3017c22dc8 [fix] pivot keys in address formatter 2016-07-21 17:04:57 -04:00
Al
6fc6f9f591 [addresses] Adding address-level component dropout to AddressComponents (returns an ordering so the client formatter can potentially emit multiple addresses with different components dropped out). Adding PO box and category probabilities to config 2016-07-21 17:04:57 -04:00
Al
47fc88e0fc [formatting] Better postprocessing of address-formatting templates so any country can potentially include the more rare components like city_district/state_district, etc., formatted queries for category (+ place/address), chain (+ place/address), and intersection 2016-07-21 17:04:57 -04:00
Al
4bdcb98320 [fix] country_language 2016-07-21 17:04:57 -04:00
Al
b6b85afa2a [docs] Update usage 2016-07-21 17:04:57 -04:00
Al
a948e97fe6 [formatting] Adding conditional probabilities for template insertions (e.g. given that we have a floor number, increase the probability that unit number follows it) 2016-07-21 17:04:57 -04:00
Al
334f22a41c [formatting] New formatter config including random component order changes and default/per-country admin component ordering 2016-07-21 17:04:57 -04:00
Al
35deb15a84 [fix] default option for Aliases.get 2016-07-21 17:04:57 -04:00
Al
52246e0cd0 [formatting] Defining some of the new tag names in AddressFormatter as well as insert_component which reparses the address formatter template and inserts a given component, removing it from an existing block if necessary 2016-07-21 17:04:57 -04:00
Al
b22fb669b9 [aliases] Adding get method for aliases 2016-07-21 17:04:57 -04:00
Al
c84f50e227 [aliases] packaging up field aliasing 2016-07-21 17:04:57 -04:00
Al
58e53cab1c [scripts] Adding the tokenize/normalize wrappers directly into the internal geodata package so pypostal can be maintained in an independent repo 2016-01-12 13:29:31 -05:00
Al
ab0a4e622d [formatting] Switching back over to OpenCageData 2015-12-03 18:03:21 -05:00
Al
d4b6450f19 [formatting] Not applying template replacements from address formatting by default 2015-11-30 16:11:13 -05:00
Al
6aa640b5f0 [fix] Moving is_in:country to lower priority 2015-11-23 12:36:05 -05:00
Al
f1b6620369 [osm/formatting] replacing keys with the highest priority so addr:* tags take precedence over is_in:* tags 2015-11-22 22:25:44 -05:00
Al
85667997cd [formatting] Adding city_district and state_district tags to address formatting templates where it makes sense. These will not be in all addresses, tags can be added and removed from the training data with certain probabilities 2015-11-20 12:24:51 -05:00
Al
0b74039a6a [formatting] Adding city_district as a separate format tag 2015-11-17 11:38:38 -05:00
Al
1c543a5271 [osm/formatting] Adding is_in tags to the address formatter as they're common in OSM, aliasing addr:district to state_district instead of suburb 2015-10-29 12:30:56 -04:00
Al
2d4b3a6e2f [parser/formatting] Appending suburb between the road line and any subsequent lines for all bottom-up address formats. Effectively inserts neighborhoods into our version without making the OpenCage formats overly verbose. Also fixing post-format replace with group capture 2015-10-24 01:31:35 -04:00
Al
336bfe32ca [osm/formatter] Switching back to OpenCageData repo 2015-10-21 16:34:24 -04:00
Al
e584745061 [formatting] Adding STATE_DISTRICT to formatter for things like counties 2015-10-14 15:10:18 -04:00
Al
40cf247655 [formatting] Constants for field names, a few options in format_address 2015-09-29 23:03:37 -04:00
Al
f29f2f091b [fix] PEBCAK 2015-09-27 22:49:27 -04:00
Al
93b3110a49 [fix] only commas and hyphens need to be eliminated at the end of phrases in untagged address formatting 2015-09-27 19:25:34 -04:00
Al
d3bfaf6b43 [osm/formatting] Fixing formatting tagged addresses with comma separated fields 2015-09-27 03:19:23 -04:00
Al
d512201e2c [fix] removing space from tokens in address formatting 2015-09-27 02:18:34 -04:00
Al
5b829cd5a7 [fix] blank values containing punctuation in formatting 2015-09-26 21:49:28 -04:00
Al
dac0440be8 [fix] rsplit 2015-09-26 21:07:54 -04:00
Al
ae93552455 [osm/formatting] Moving back to openvenues repo pending resolution of the Turkish address issue 2015-09-26 03:56:52 -04:00
Al
0c792a2cc3 [osm/formatting] Changing the way the formatter eliminates inter-component separators, changing repo back to OpenCageData after pull request merge 2015-09-26 03:21:26 -04:00
Al
646b9f7248 [osm/formatting] Continuing to use openvenues formatter for the India fix 2015-09-25 13:36:24 -04:00
Al
9901dd2aac [fix] Switching address formatter back to OpenCageData repo 2015-09-24 18:42:17 -04:00
Al
c85ce0b11d [osm/formatting] Tagging separators as well in tagged output of the address formatter 2015-09-24 01:22:49 -04:00
Al
84cf21df88 [osm] Separating address formatter into its own module, adding some documentation of the various training sets with examples 2015-09-20 20:05:46 -04:00