907c8fe96d[addresses] /po_box/po_boxes/
Al
2016-04-20 17:07:27 -04:00
6ff0b25f40[addresses] Generate house number related phrases
Al
2016-04-20 17:06:30 -04:00
1eeda65cfd[dictionaries] /house_number/house_numbers/
Al
2016-04-20 15:56:48 -04:00
dba8be445d[fix] None handling and number dictionaries
Al
2016-04-20 14:58:57 -04:00
901f720368[addresses] different dictionaries for sampling cardinal/unit directions, not converting None to a string
Al
2016-04-19 17:05:10 -04:00
d91735c3c2[addresses] Updating English config to support new options for occasionally adding whitespace between unit numbers
Al
2016-04-19 17:03:46 -04:00
10320723b1[dictionaries] Removing ambiguous abbreviations for flat
Al
2016-04-19 17:01:53 -04:00
38ec82a42b[addresses] Unit/apartment number generation
Al
2016-04-19 17:01:24 -04:00
1acf0d592b[addresses] sample positive floors
Al
2016-04-19 16:59:16 -04:00
868fcb752b[mv] Moving sampling to math.sampling
Al
2016-04-19 11:57:42 -04:00
c31926f3dd[addresses] Adding more numeric/numeric_affix probabilities to English config
Al
2016-04-19 11:25:12 -04:00
ce2b2d9559[addresses] Conjunction can be subclassed
Al
2016-04-19 11:22:13 -04:00
c92af0da78[addresses] Adding ability to randomly append relative/cardinal directions
Al
2016-04-19 11:21:23 -04:00
450aee95c2[addresses] Adding base class for numeric phrases (appended to a number using numeric/numeric_affix), using probability 1.0 if only one of numeric/numeric_affix/ordinal is specified
Al
2016-04-19 11:07:25 -04:00
1b2e92dc14[fix] polygons
Al
2016-04-19 10:15:31 -04:00
9abc679f09[fix] typo
Al
2016-04-19 00:53:39 -04:00
ccbbf84e8d[dictionaries] Updates to Spanish dictionaries, casa can be a numbered unit type
Al
2016-04-19 00:45:32 -04:00
b8125a232d[dictionaries] Updates to English dictionaries
Al
2016-04-19 00:44:33 -04:00
47ffd18c8c[polygons] Adding __iter__ and __len__ to polygon index and keeping track of the number of polygons for iteration
Al
2016-04-19 00:42:50 -04:00
9271fda30e[addresses] Combined unit + house number (32/4, etc.) is more common in Canada, Australia, Singapore, etc. Not as much in the US, UK
Al
2016-04-18 17:05:55 -04:00
d88f130edf[addresses] changing plurals to use the standard probability structure
Al
2016-04-18 15:12:59 -04:00
af3fc30632[docs] Adding note about Rstats binding to the README
Al
2016-04-16 13:56:20 -04:00
7272d44575[dictionaries] Updates to Spanish dictionaries to support the new structure, new abbreviations for Colombia, etc.
Al
2016-04-15 14:21:43 -04:00
2a570481ba[addresses] implementing null_probability (raw number, no phrase), ordinal genders, and direction_probability
Al
2016-04-15 03:25:41 -04:00
430ad2e187[numbers] suffixed_number
Al
2016-04-15 02:04:58 -04:00
028dbacc87[dictionaries] making entrances/postcodes plural for consistency
Al
2016-04-15 01:10:03 -04:00
883ef2ec56[dictionaries] Moving intersections to cross streets
Al
2016-04-14 17:53:27 -04:00
5850793768[expansion] Add postcode dictionary to gazetteer types
Al
2016-04-14 14:33:02 -04:00
6babbfaf02[addresses] generator for floor numbers as well as special aliases like basement, mezzanine, etc. using the address configs
Al
2016-04-14 14:22:08 -04:00
36b3d515ad[expansion] Modifying the Python gazetteers to use new dictionaries API
Al
2016-04-14 14:17:09 -04:00
2ff4940e36[expansion] Adding number and intersections to dictionary types
Al
2016-04-14 14:15:33 -04:00
49b02796c0[addresses] Adding abbreviations as a separate module so it can be used with multiple data sets
Al
2016-04-14 03:09:58 -04:00
a6553b77d3[addresses] PO Box phrase generator
Al
2016-04-14 02:38:45 -04:00
9eb444b193[addresses] PO Box fixes in the address config
Al
2016-04-14 02:38:02 -04:00
d29ade7210[addresses] conjunction class for building phrases like "5th and 6th" or "Units 1 & 2" across languages using the address configs
Al
2016-04-14 01:21:42 -04:00
f0ac3522da[addresses] base class for numbered components (floors, units, house numbers in some languages/countries). Can generate many variants of a number (e.g. Floor 2, 2nd Floor, Floor #2, Floor No. 2, etc.)
Al
2016-04-14 01:17:43 -04:00
fe006e0d62[addresses] utilities for sampling from an arbitrary discrete distribution, building cumulative distributions, and sampling from a Zipfian distribution which seems to be a reasonable way of generating plausible apartment/floor numbers when the height/number of units is unknown. Picking a letter uniformly at random means P('Unit A') == P('Unit Z') when 'A' should be much more likely. Sampling from a Zipfian gets the desired effect in situations where address components are numbered by "counting from 0/1/A" while still allowing for a long tail
Al
2016-04-14 01:13:39 -04:00
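The reasoning in the commit above (uniform letters make 'Unit A' and 'Unit Z' equally likely, while a Zipfian favors low ranks but keeps a long tail) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `zipfian_distribution`, `cumulative`, and `sample`, and the exponent `s=1.0`, are assumptions made for the example.

```python
import bisect
import random
import string


def zipfian_distribution(n, s=1.0):
    """Normalized Zipfian probabilities for ranks 1..n (hypothetical helper)."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def cumulative(probs):
    """Build a cumulative distribution from a list of probabilities."""
    cdf = []
    running = 0.0
    for p in probs:
        running += p
        cdf.append(running)
    cdf[-1] = 1.0  # guard against floating-point drift in the last bucket
    return cdf


def sample(values, cdf, rng=random):
    """Draw one value by inverting the cumulative distribution."""
    return values[bisect.bisect_right(cdf, rng.random())]


# Unit letters: 'A' is the most likely, 'Z' is rare but still possible.
letters = string.ascii_uppercase
letter_cdf = cumulative(zipfian_distribution(len(letters)))
unit = 'Unit {}'.format(sample(letters, letter_cdf))
```

The same `cumulative`/`sample` pair works for any discrete distribution, so the Zipfian weights could be swapped for probabilities read from a config.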
58feeab714[addresses] address config class for general sampling of forms specified in the address configs (default/alternatives to choose a phrase, canonical/abbreviated/sample to choose an abbreviation or surface form for that phrase)
Al
2016-04-14 01:06:51 -04:00
518140a1b5[addresses] Adding corner_of key to the English address config
Al
2016-04-14 01:04:01 -04:00
db9d51e655[dictionaries] Intersections dictionary for English
Al
2016-04-14 00:57:09 -04:00
8fdd3e9314[addresses] Additions to the English address config
Al
2016-04-14 00:56:39 -04:00
e37431912d[boundaries/fix] admin_level 7 in Australia should map to city, not state_district
Al
2016-04-13 18:27:29 -04:00
7bb5da94bb[dictionaries] Making the word for "number" a separate dictionary as it can apply in several places
Al
2016-04-13 18:27:04 -04:00
da561fd9e3[addresses] Adding probabilities to the English address configs
Al
2016-04-11 23:25:16 -04:00
59e5fcd1b4[fix] LC_ALL=C in data download script
Al
2016-04-11 12:47:50 -04:00
7332445525[polygons] Persistent polygons for neighborhoods index as well, cache size at 100k
Al
2016-04-11 01:24:45 -04:00
e6dcf975f6[polygons] neighborhoods repo has the correct polygons for NYC, removing the pediacities version
Al
2016-04-10 20:27:38 -04:00
f739b46d6d[fix] priorities in neighborhood index
Al
2016-04-10 18:56:01 -04:00
761413e723[fix] var name
Al
2016-04-10 14:02:51 -04:00
83fcf39d49[fix] Fixes to Zetashapes reverse geocoder
Al
2016-04-10 14:01:43 -04:00
ef72ad592b[fix] moving methods
Al
2016-04-09 21:35:17 -04:00
dee143798a[polygons/neighborhoods] refactoring Zetashapes download, adding in PediaCities polygons for NYC neighborhoods
Al
2016-04-09 21:32:39 -04:00
38b39887ec[polygons] refactoring methods for getting cached/non-cached polygons
Al
2016-04-09 19:52:48 -04:00
78924fa308[polygons] Quattroshapes neighborhoods use regular in-memory polygons
Al
2016-04-09 19:28:56 -04:00
bcf87574d4[dictionaries] Spanish abbreviations for numero
Al
2016-04-09 15:18:46 -04:00
5d182e30d4[dictionaries] adding abbreviations for Hong Kong/Kowloon/New Territories
Al
2016-04-09 15:17:37 -04:00
2d0a0f1c83[dictionaries] Adding a few English abbreviations/expansions
Al
2016-04-09 14:53:33 -04:00
26581aeb4d[numex] string keys
Al
2016-04-08 18:13:08 -04:00
d38de71854[dictionaries] encapsulating reading address dictionaries so it's easy to implement sampling for the address training data
Al
2016-04-08 18:12:30 -04:00
02e82e5342[numex] Nicer API for ordinal suffixes
Al
2016-04-08 17:10:10 -04:00
737b5d06ed[osm/polygons] Adding properties in building polygons
Al
2016-04-08 12:33:40 -04:00
3bc85db41e[numex] Moving numex files to YAML as well
Al
2016-04-07 13:26:00 -04:00
5fce5e8000[osm/polygons] add building:part to building polygons
Al
2016-04-07 13:15:42 -04:00
778fba2451[osm] Moving OSM boundaries to YAML files instead of JSON for consistency
Al
2016-04-06 22:59:46 -04:00
f2f131661a[osm/polygons] Using greater simplify tolerance
Al
2016-04-06 20:24:37 -04:00
69ef201cf1[fix] simplify_polygons in building geocoder, and adding caching back to OSM admin polygons as it's faster when taking into account startup time. Also adding a few properties to buildings and landuse polygons
Al
2016-04-06 13:53:47 -04:00
502c61d9db[osm/polygons] Same check for closed ways as for relations in OSM polygon readers
Al
2016-04-06 01:35:36 -04:00
984cdc0650[osm/polygons] From benchmarking it seems to make sense to keep OSM polygons in memory after all
Al
2016-04-05 23:25:45 -04:00
fbebcc11d0[fix] properties/polygon key split
Al
2016-04-05 22:47:48 -04:00
ee160c715b[osm/polygons] Trying persistent polygons again on OSM/Quattroshapes to test the new settings
Al
2016-04-05 19:46:45 -04:00
b8ccb8bfa1[osm/polygons] Storing polygon JSON under a different key so it doesn't have to be read from disk after a successful cache matched point-in-polygon test just to retrieve the properties
Al
2016-04-05 19:45:44 -04:00
a8ea5f47c3[fix] var name
Al
2016-04-05 19:23:08 -04:00
65e0067ed0[fix] classmethod for loading polygons
Al
2016-04-05 19:20:12 -04:00
a8b0114871[osm/polygons] Keep OSM/Quattroshapes admin polygons in memory as there are fewer of them and they are large
Al
2016-04-05 19:05:26 -04:00
b693fe11dd[fix] double prep
Al
2016-04-05 18:49:52 -04:00
136700fa7f[fix] return_all in polygon index
Al
2016-04-05 18:42:20 -04:00
e242868fd9[osm/polygons] Keep stats on cache hits/misses for testing cache sizes
Al
2016-04-05 16:46:14 -04:00
ec29c36cbc[build] Adding lru-dict, a fast C LRU cache, to requirements.txt for geodata package
Al
2016-04-05 14:55:35 -04:00
004165d184[osm/polygons] Using an LRU cache for prepped polygons in the various PolygonIndex subclasses. That way we can store less-simplified polygons but keep frequently accessed ones (like countries) in memory
Al
2016-04-05 14:53:07 -04:00
01567d2672[osm/boundaries] admin_level 10 in Spain = suburb
Al
2016-04-05 01:24:26 -04:00
1af5b88922[fix] name
Al
2016-04-05 00:51:01 -04:00
49498ccf81[fix] import
Al
2016-04-04 23:38:30 -04:00
6bb6ddb06a[fix] arg name
Al
2016-04-04 22:41:20 -04:00