Al
|
25e966411d
|
[formatting] adding the ability to invert the address template (line by line, preserving order within each line) with certain probabilities
|
2016-12-27 23:25:49 -05:00 |
|
Al
|
6eee689685
|
[fix] only applying separator tag to commas
|
2016-12-27 03:16:04 -05:00 |
|
Al
|
c3bf63bc18
|
[fix] remove reference to ftfy in the formatter
|
2016-12-26 21:25:28 -05:00 |
|
Al
|
7ec368542b
|
[formatting] giving single hyphens the separator tag
|
2016-12-26 21:00:25 -05:00 |
|
Al
|
ff32321425
|
[formatter] adding house_number_before_road method to AddressFormatter
|
2016-12-19 02:00:06 -05:00 |
|
Al
|
3617b3a10c
|
[fix] recursive merge for entries that are empty dictionaries
|
2016-11-16 02:19:07 -05:00 |
|
Al
|
f8664b0deb
|
[formatting] making regex-based tests during insert_component optional. If exact_order=True, insert the given component directly before/after the reference component; otherwise, for components that already exist in the template, only the relative position matters. Adding a method to determine if template language is important for a particular country/language pair.
|
2016-10-12 14:42:34 -04:00 |
|
Al
|
2663b81670
|
[address_formatting] caching parsed templates from pystache yields about a 2.5x speedup per call, should shave off several hours of CPU time for large training sets
|
2016-10-11 15:36:49 -04:00 |
|
Al
|
1cec0570d6
|
[formatting] only using alias country insertions if the given country has not defined its own (e.g. look at Puerto Rico first, then use the US if there's nothing defined)
|
2016-09-15 11:45:46 -04:00 |
|
Al
|
55e9ab1978
|
[places] adding world_region tag and adding the phrase West Indies with small random probability for English-speaking Caribbean nations. Ref: #113
|
2016-09-11 21:54:56 -04:00 |
|
Al
|
9cdcd7f21a
|
[fix] indentation
|
2016-09-09 08:59:50 -04:00 |
|
Al
|
a14202fc7a
|
[fix] default value
|
2016-09-09 01:46:03 -04:00 |
|
Al
|
85ad3bf0f4
|
[formatting] allowing a non-default option for components that can be inserted between road and house number
|
2016-09-09 01:38:39 -04:00 |
|
Al
|
0edbe5a593
|
[formatting] don't allow insertions between house number and road name
|
2016-09-08 15:00:36 -04:00 |
|
Al
|
1e27ad1124
|
[metro stations] Adding metro station component to address formatter
|
2016-08-06 19:13:20 -04:00 |
|
Al
|
88353b75e0
|
[fix] more helpful error message if there are errors with the formatting config
|
2016-07-27 19:14:30 -04:00 |
|
Al
|
f8d185aaff
|
[osm/formatting] Tag commas in a given labeled component with the SEP tag so e.g. concatenated districts can be counted as separate phrases
|
2016-07-27 16:13:57 -04:00 |
|
Al
|
06541f5911
|
[osm] Adding country_region tag to address formatter
|
2016-07-21 23:38:37 -04:00 |
|
Al
|
7d5d54bd29
|
[formatting] Territories use parent country's template insertion probabilities
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
5075128ada
|
[intersections] Adding places to intersection template, intersection phrase generator
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
d88be7ef5d
|
[fix] use simple language code if language_script cannot be found
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
90467e9098
|
[fix] global formatter config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
16a91528d6
|
[fix] config key name
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
308080f6ee
|
[formatting] Moving language country overrides to formatter config so actual language is retained
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
8c44a5d312
|
[fix] check for None in chain store query formatting
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
448f010b22
|
[fix] Removing template types from AddressFormatter, just drop components as needed
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
3017c22dc8
|
[fix] pivot keys in address formatter
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
6fc6f9f591
|
[addresses] Adding address-level component dropout to AddressComponents (returns an ordering so the client formatter can potentially emit multiple addresses with different components dropped out). Adding PO box and category probabilities to config
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
47fc88e0fc
|
[formatting] Better postprocessing of address-formatting templates so any country can potentially include the more rare components like city_district/state_district, etc., formatted queries for category (+ place/address), chain (+ place/address), and intersection
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
4bdcb98320
|
[fix] country_language
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
b6b85afa2a
|
[docs] Update usage
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
a948e97fe6
|
[formatting] Adding conditional probabilities for template insertions (e.g. given that we have a floor number, increase the probability that unit number follows it)
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
334f22a41c
|
[formatting] New formatter config including random component order changes and default/per-country admin component ordering
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
52246e0cd0
|
[formatting] Defining some of the new tag names in AddressFormatter as well as insert_component, which reparses the address formatter template and inserts a given component, removing it from an existing block if necessary
|
2016-07-21 17:04:57 -04:00 |
|
Al
|
58e53cab1c
|
[scripts] Adding the tokenize/normalize wrappers directly into the internal geodata package so pypostal can be maintained in an independent repo
|
2016-01-12 13:29:31 -05:00 |
|
Al
|
ab0a4e622d
|
[formatting] Switching back over to OpenCageData
|
2015-12-03 18:03:21 -05:00 |
|
Al
|
d4b6450f19
|
[formatting] Not applying template replacements from address formatting by default
|
2015-11-30 16:11:13 -05:00 |
|
Al
|
6aa640b5f0
|
[fix] Moving is_in:country to lower priority
|
2015-11-23 12:36:05 -05:00 |
|
Al
|
f1b6620369
|
[osm/formatting] replacing keys with the highest priority so addr:* tags take precedence over is_in:* tags
|
2015-11-22 22:25:44 -05:00 |
|
Al
|
85667997cd
|
[formatting] Adding city_district and state_district tags to address formatting templates where it makes sense. These will not be in all addresses, tags can be added and removed from the training data with certain probabilities
|
2015-11-20 12:24:51 -05:00 |
|
Al
|
0b74039a6a
|
[formatting] Adding city_district as a separate format tag
|
2015-11-17 11:38:38 -05:00 |
|
Al
|
1c543a5271
|
[osm/formatting] Adding is_in tags to the address formatter as they're common in OSM, aliasing addr:district to state_district instead of suburb
|
2015-10-29 12:30:56 -04:00 |
|
Al
|
2d4b3a6e2f
|
[parser/formatting] Appending suburb between the road line and any subsequent lines for all bottom-up address formats. Effectively inserts neighborhoods into our version without making the OpenCage formats overly verbose. Also fixing post-format replace with group capture
|
2015-10-24 01:31:35 -04:00 |
|
Al
|
336bfe32ca
|
[osm/formatter] Switching back to OpenCageData repo
|
2015-10-21 16:34:24 -04:00 |
|
Al
|
e584745061
|
[formatting] Adding STATE_DISTRICT to formatter for things like counties
|
2015-10-14 15:10:18 -04:00 |
|
Al
|
40cf247655
|
[formatting] Constants for field names, a few options in format_address
|
2015-09-29 23:03:37 -04:00 |
|
Al
|
f29f2f091b
|
[fix] PEBCAK
|
2015-09-27 22:49:27 -04:00 |
|
Al
|
93b3110a49
|
[fix] only commas and hyphens need to be eliminated at the end of phrases in untagged address formatting
|
2015-09-27 19:25:34 -04:00 |
|
Al
|
d3bfaf6b43
|
[osm/formatting] Fixing formatting tagged addresses with comma separated fields
|
2015-09-27 03:19:23 -04:00 |
|
Al
|
d512201e2c
|
[fix] removing space from tokens in address formatting
|
2015-09-27 02:18:34 -04:00 |
|