e21b793b03[polygons] Adding ISO3166 alpha 2/3 codes to OSM polygons index
Al
2016-04-28 14:23:38 -04:00
7e5ecb30cf[addresses] sample_alphabet (Zipfian) in PO box rather than a uniform choice
Al
2016-04-28 13:07:16 -04:00
3845c58ca3[points] Adding load method for point reverse geocoding
Al
2016-04-28 13:05:39 -04:00
c506649252[fix] languages_intialized
Al
2016-04-28 13:04:38 -04:00
1fd4fbb7a2[normalization] Adding default token options for numbers so we split alpha from numeric tokens and don't normalize digits
Al
2016-04-28 13:03:16 -04:00
3d765e9eca[addresses] Fixing direction_probability, adding ability to have phrases which only apply to numbers, adding the possibility of null phrases to non-numeric "numbers" e.g. A-Z, etc.
Al
2016-04-28 13:01:41 -04:00
ac00f294c0[requirements] Adding numpy to Python repo's requirements (only needed for building libpostal, not for using it)
Al
2016-04-28 12:58:59 -04:00
03704fff6a[intersections] Lower memory version of intersection reader
Al
2016-04-28 12:58:36 -04:00
620f0594aa[points] haversine distance in a different method
Al
2016-04-27 17:27:30 -04:00
d5dc34ec1d[gazetteers] moving PHRASE to a token type
Al
2016-04-27 15:11:38 -04:00
04a5a9e611[fix] Removing YAML inheritance as it doesn't merge nested dictionaries
Al
2016-04-27 15:10:08 -04:00
f3bbe2ee74[fix] file rename
Al
2016-04-27 02:21:15 -04:00
9f37a26a6d[points] Adding point reverse geocoding index
Al
2016-04-27 01:43:47 -04:00
9977a7a254[mv] Moving osm_admin_boundaries to just admin_boundaries
Al
2016-04-27 00:10:42 -04:00
37747709ee[addresses] Using YAML inheritance instead of baking it into the config parser
Al
2016-04-26 18:29:05 -04:00
cd10951afb[addresses] Generalizing the functions used for address configs so they can be reused for per-country OSM configs, etc.
Al
2016-04-26 18:27:12 -04:00
79368f3f02[intersections] Intersections generator for OSM
Al
2016-04-26 18:24:50 -04:00
799bbe4912[neighborhoods] Moving neighborhoods index to its own package
Al
2016-04-26 14:26:46 -04:00
2a37837412[dictionaries] preescolar
Al
2016-04-23 14:24:34 -04:00
97f2f8d160[dictionaries] parcela
Al
2016-04-23 14:24:18 -04:00
55d2f33f94[fix] most frequently occurring form for Auntie Anne's
Al
2016-04-23 13:35:44 -04:00
8aac200d74[addresses] config for phrases around postcodes like CP in Spanish
Al
2016-04-23 12:37:04 -04:00
f070697066[addresses] PO Box config
Al
2016-04-23 12:36:16 -04:00
5bbb60e241[fix] instance var
Al
2016-04-23 12:35:09 -04:00
3fd73c0bc8[fix] import
Al
2016-04-23 12:01:17 -04:00
a7fe6408c0[addresses] /po_box/po_boxes/
Al
2016-04-20 17:07:27 -04:00
1e107f09ab[addresses] Generate house number related phrases
Al
2016-04-20 17:06:30 -04:00
62748b4644[dictionaries] /house_number/house_numbers/
Al
2016-04-20 15:56:48 -04:00
90c88a3a24[fix] None handling and number dictionaries
Al
2016-04-20 14:58:57 -04:00
e13c536b03[addresses] different dictionaries for sampling cardinal/unit directions, not converting None to a string
Al
2016-04-19 17:05:10 -04:00
8688812e71[addresses] Updating English config to support new options for occasionally adding whitespace between unit numbers
Al
2016-04-19 17:03:46 -04:00
7f3667caf8[dictionaries] Removing ambiguous abbreviations for flat
Al
2016-04-19 17:01:53 -04:00
c47762b91c[addresses] Unit/apartment number generation
Al
2016-04-19 17:01:24 -04:00
ca68391ea6[addresses] sample positive floors
Al
2016-04-19 16:59:16 -04:00
9f652591ad[mv] Moving sampling to math.sampling
Al
2016-04-19 11:57:42 -04:00
93df047f8c[addresses] Adding more numeric/numeric_affix probabilities to English config
Al
2016-04-19 11:25:12 -04:00
32b6217aa8[addresses] Conjunction can be subclassed
Al
2016-04-19 11:22:13 -04:00
535453f77d[addresses] Adding ability to randomly append relative/cardinal directions
Al
2016-04-19 11:21:23 -04:00
f026e8a764[addresses] Adding base class for numeric phrases (appended to a number using numeric/numeric_affix), using probability 1.0 if only one of numeric/numeric_affix/ordinal is specified
Al
2016-04-19 11:07:25 -04:00
efc40c5698[fix] polygons
Al
2016-04-19 10:15:31 -04:00
c7ea5d9637[fix] typo
Al
2016-04-19 00:53:39 -04:00
cc17d8c15d[dictionaries] Updates to Spanish dictionaries, casa can be a numbered unit type
Al
2016-04-19 00:45:32 -04:00
5dcc7130d2[dictionaries] Updates to English dictionaries
Al
2016-04-19 00:44:33 -04:00
0a80ec7129[polygons] Adding __iter__ and __len__ to polygon index and keeping track of the number of polygons for iteration
Al
2016-04-19 00:42:50 -04:00
9328883a61[addresses] Combined unit + house number (32/4, etc.) is more common in Canada, Australia, Singapore, etc. Not as much in the US, UK
Al
2016-04-18 17:05:55 -04:00
848b7ac167[addresses] changing plurals to use the standard probability structure
Al
2016-04-18 15:12:59 -04:00
d0fb0d413d[dictionaries] Updates to Spanish dictionaries to support the new structure, new abbreviations for Colombia, etc.
Al
2016-04-15 14:21:43 -04:00
f7764b70cd[addresses] implementing null_probability (raw number, no phrase), ordinal genders, and direction_probability
Al
2016-04-15 03:25:41 -04:00
22687323c2[numbers] suffixed_number
Al
2016-04-15 02:04:58 -04:00
6d4e54cd7a[dictionaries] making entrances/postcodes plural for consistency
Al
2016-04-15 01:10:03 -04:00
410eb0006a[dictionaries] Moving intersections to cross streets
Al
2016-04-14 17:53:27 -04:00
2f9a58f37b[expansion] Add postcode dictionary to gazetteer types
Al
2016-04-14 14:33:02 -04:00
b5386eb601[addresses] generator for floor numbers as well as special aliases like basement, mezzanine, etc. using the address configs
Al
2016-04-14 14:22:08 -04:00
e1f1e34dca[expansion] Modifying the Python gazetteers to use new dictionaries API
Al
2016-04-14 14:17:09 -04:00
80089099e9[expansion] Adding number and intersections to dictionary types
Al
2016-04-14 14:15:33 -04:00
3d3aacae67[addresses] Adding abbreviations as a separate module so it can be used with multiple data sets
Al
2016-04-14 03:09:58 -04:00
317d3aa9ed[addresses] PO Box phrase generator
Al
2016-04-14 02:38:45 -04:00
21a2c067f5[addresses] PO Box fixes in the address config
Al
2016-04-14 02:38:02 -04:00
9c4348a990[addresses] conjunction class for building phrases like "5th and 6th" or "Units 1 & 2" across languages using the address configs
Al
2016-04-14 01:21:42 -04:00
d136fb7576[addresses] base class for numbered components (floors, units, house numbers in some languages/countries). Can generate many variants of a number (e.g. Floor 2, 2nd Floor, Floor #2, Floor No. 2, etc.)
Al
2016-04-14 01:17:43 -04:00
14c89e6895[addresses] utilities for sampling from an arbitrary discrete distribution, building cumulative distributions, and sampling from a Zipfian distribution which seems to be a reasonable way of generating plausible apartment/floor numbers when the height/number of units is unknown. Picking a letter uniformly at random means P('Unit A') == P('Unit Z') when 'A' should be much more likely. Sampling from a Zipfian gets the desired effect in situations where address components are numbered by "counting from 0/1/A" while still allowing for a long tail
Al
2016-04-14 01:13:39 -04:00
dcabdf7c0b[addresses] address config class for general sampling of forms specified in the address configs (default/alternatives to choose a phrase, canonical/abbreviated/sample to choose an abbreviation or surface form for that phrase)
Al
2016-04-14 01:06:51 -04:00
a8ad7c9dbf[addresses] Adding corner_of key to the English address config
Al
2016-04-14 01:04:01 -04:00
be704b7078[dictionaries] Intersections dictionary for English
Al
2016-04-14 00:57:09 -04:00
fa0076e786[addresses] Additions to the English address config
Al
2016-04-14 00:56:39 -04:00
d4e2653866[boundaries/fix] admin_level 7 in Australia should map to city, not state_district
Al
2016-04-13 18:27:29 -04:00
1d14bf6e6e[dictionaries] Making the word for "number" a separate dictionary as it can apply in several places
Al
2016-04-13 18:27:04 -04:00
da7a3b721a[addresses] Adding probabilities to the English address configs
Al
2016-04-11 23:25:16 -04:00
9a0ea19d02[polygons] Persistent polygons for neighborhoods index as well, cache size at 100k
Al
2016-04-11 01:24:45 -04:00
90142e8559[polygons] neighborhoods repo has the correct polygons for NYC, removing the pediacities version
Al
2016-04-10 20:27:38 -04:00
c570bb7aef[fix] priorities in neighborhood index
Al
2016-04-10 18:56:01 -04:00
e87c216241[fix] var name
Al
2016-04-10 14:02:51 -04:00
ab1a8d4416[fix] Fixes to Zetashapes reverse geocoder
Al
2016-04-10 14:01:43 -04:00
a93f110112[fix] moving methods
Al
2016-04-09 21:35:17 -04:00
efd167323b[polygons/neighborhoods] refactoring Zetashapes download, adding in PediaCities polygons for NYC neighborhoods
Al
2016-04-09 21:32:39 -04:00
333bd7ef45[polygons] refactoring methods for getting cached/non-cached polygons
Al
2016-04-09 19:52:48 -04:00
e4ff4a28b1[polygons] Quattroshapes neighborhoods use regular in-memory polygons
Al
2016-04-09 19:28:56 -04:00
67b3eadbd5[dictionaries] Spanish abbreviations for numero
Al
2016-04-09 15:18:46 -04:00
8456340e0c[dictionaries] adding abbreviations for Hong Kong/Kowloon/New Territories
Al
2016-04-09 15:17:37 -04:00
e6b9b78924[dictionaries] Adding a few English abbreviations/expansions
Al
2016-04-09 14:53:33 -04:00
3bd61cd3c2[numex] string keys
Al
2016-04-08 18:13:08 -04:00
9dd5d5c210[dictionaries] encapsulating reading address dictionaries so it's easy to implement sampling for the address training data
Al
2016-04-08 18:12:30 -04:00
23525df39d[numex] Nicer API for ordinal suffixes
Al
2016-04-08 17:10:10 -04:00
0f0af1f295[osm/polygons] Adding properties in building polygons
Al
2016-04-08 12:33:40 -04:00
e24306701f[numex] Moving numex files to YAML as well
Al
2016-04-07 13:26:00 -04:00
76fc337d0e[osm/polygons] add building:part to building polygons
Al
2016-04-07 13:15:42 -04:00
72ee2e00ae[osm] Moving OSM boundaries to YAML files instead of JSON for consistency
Al
2016-04-06 22:59:46 -04:00
6a03b0376c[osm/polygons] Using greater simplify tolerance
Al
2016-04-06 20:24:37 -04:00
ae62471d32[fix] simplify_polygons in building geocoder, and adding caching back to OSM admin polygons as it's faster when taking into account startup time. Also adding a few properties to buildings and landuse polygons
Al
2016-04-06 13:53:47 -04:00
1f52f8ddcc[osm/polygons] Same check for closed ways as for relations in OSM polygon readers
Al
2016-04-06 01:35:36 -04:00
26ada5cdbb[osm/polygons] From benchmarking it seems to make sense to keep OSM polygons in memory after all
Al
2016-04-05 23:25:45 -04:00
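
Note on 14c89e6895 (sampling utilities): the reasoning in that commit message, that picking a letter uniformly makes P('Unit A') == P('Unit Z') while a Zipfian draw favors 'A' and still leaves a long tail, can be illustrated with a minimal sketch. This is not the code from the repository. The function names (zipfian_weights, cumulative, sample_discrete) and the exponent s=1.0 are assumptions chosen for illustration, and only the Python standard library is used.

```python
import bisect
import random
import string


def zipfian_weights(n, s=1.0):
    """Return normalized Zipfian probabilities for ranks 1..n.

    The exponent s is a hypothetical parameter; s=1.0 is an assumed default.
    """
    raw = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def cumulative(probs):
    """Build a cumulative distribution from a list of probabilities."""
    cdf = []
    running = 0.0
    for p in probs:
        running += p
        cdf.append(running)
    cdf[-1] = 1.0  # guard against floating-point drift in the last bucket
    return cdf


def sample_discrete(values, cdf, rng=random):
    """Draw one value by locating a uniform draw in the cumulative distribution."""
    i = bisect.bisect_right(cdf, rng.random())
    return values[min(i, len(values) - 1)]


# Usage: letters ranked A=1 ... Z=26, so 'A' is the most likely unit letter.
letters = list(string.ascii_uppercase)
probs = zipfian_weights(len(letters))
cdf = cumulative(probs)
unit = "Unit " + sample_discrete(letters, cdf)
```

Building the cumulative distribution once and searching it with bisect keeps each draw at O(log n), which matters when many synthetic addresses are generated from the same distribution.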