Al
|
d4de170c94
|
[openaddresses] adding city of Monroe, MI
|
2017-02-24 13:57:57 -05:00 |
|
Al
|
d0679294bf
|
[openaddresses] adding positional args so OpenAddresses ingestion can be run only for specific countries, subdirs, or individual files.
|
2017-02-24 03:40:09 -05:00 |
|
Al
|
e39d4d2f00
|
[parser] check for non-null prev/prev2 before creating tag-based features
|
2017-02-24 02:57:16 -05:00 |
|
Al
|
182d60b623
|
[fix] removing include
|
2017-02-23 22:45:03 -05:00 |
|
Al
|
6097eacfef
|
[fix] ignore fields in Kauai containing \n
|
2017-02-23 16:34:34 -05:00 |
|
Al
|
033e8dbb58
|
[openaddresses] adding Kauai and some component additions for Maui
|
2017-02-23 16:26:50 -05:00 |
|
Al
|
fa7446deb6
|
[fix] district field for Wuhan data set
|
2017-02-23 02:15:55 -05:00 |
|
Al
|
f006bba345
|
[openaddresses] adding city of Medellín, Colombia
|
2017-02-22 19:01:26 -08:00 |
|
Al
|
2d59450a51
|
[openaddresses] adding new Oregon counties
|
2017-02-22 09:59:20 -08:00 |
|
Al
|
79c2429bba
|
[addresses] strip phrases like "# 123" off of English street names if they follow a thoroughfare/post-directional phrase whose expansion does not contain highway/route
|
2017-02-22 09:51:43 -08:00 |
|
Al
|
de05292b66
|
[openaddresses] Del Norte County, CA
|
2017-02-21 01:19:46 -08:00 |
|
Al
|
93768b7ba5
|
[openaddresses] Eaton County and Tecumseh, MI
|
2017-02-21 01:17:54 -08:00 |
|
Al
|
08c6831729
|
[openaddresses] LBC
|
2017-02-21 01:12:50 -08:00 |
|
Al
|
a2fcac4909
|
[openaddresses] city of Flower Mound, TX
|
2017-02-21 01:09:06 -08:00 |
|
Al
|
1d705e80da
|
[openaddresses] adding new BC district data sets
|
2017-02-21 01:07:47 -08:00 |
|
Al
|
6a079e86b3
|
[fix] using size_t instead of int in address_parser/address_parser_train
|
2017-02-20 19:22:13 -08:00 |
|
Al
|
8ea5405c20
|
[parser] using separate arrays for features requiring tag history and making the tagger responsible for those features so the feature function does not require passing in prev and prev2 explicitly (i.e. don't need to run the feature function multiple times if using global best-sequence prediction)
|
2017-02-19 14:21:58 -08:00 |
|
Al
|
ae85e3c0a0
|
[openaddresses] adding Warren County, OH
|
2017-02-19 14:03:24 -08:00 |
|
Al
|
715520f681
|
[parser] using new zeros API in averaged_perceptron.c
|
2017-02-19 14:02:54 -08:00 |
|
Al
|
5444b722cb
|
[addresses] do not exclude # from sampling in Spanish
|
2017-02-18 12:04:09 -08:00 |
|
Al
|
f76faafd8c
|
[openaddresses] adding a few house number phrases as well in Colombia
|
2017-02-18 12:03:02 -08:00 |
|
Al
|
adfdc06d14
|
[addresses] using the number dictionary for abbreviations in house number phrases as well
|
2017-02-18 12:00:27 -08:00 |
|
Al
|
7cab675809
|
[openaddresses] adding random formatting to Colombian house numbers that match the {calle}-{building number} format
|
2017-02-18 11:28:47 -08:00 |
|
Al
|
146412f4f8
|
[openaddresses] adding country-specific validators and doing no validation on house numbers in Colombia
|
2017-02-18 11:04:02 -08:00 |
|
Al
|
0e10aa6f46
|
[openaddresses] adding OSM boundaries for Stearns County, MN
|
2017-02-18 10:18:09 -08:00 |
|
Al
|
5a31513092
|
[openaddresses] Adding city of Sioux Falls, SD
|
2017-02-18 10:13:56 -08:00 |
|
Al
|
64e62cac32
|
[openaddresses] adding Bogotá, Colombia
|
2017-02-18 10:13:31 -08:00 |
|
Al
|
4f128579d6
|
[openaddresses] adding Commerce City, CO and creating an alias for the simple unit regex for reuse
|
2017-02-17 14:07:00 -05:00 |
|
Al
|
b88487f633
|
[utils] string_replace_char does single byte/character replacement, new string_replace to do full string replacement, again using char_array for safety, string_replace_with_array function for memory reuse
|
2017-02-17 13:58:51 -05:00 |
|
Al
|
da856ea5c3
|
[parser] adding phrase features for category, unit, level, entrance, staircase, and po_box phrases from the libpostal dictionaries, excluding phrases which match the toponyms dictionary (e.g. US states that can also be found in street/venue names, useful for expansion but not here), if the current token is part of both an address dictionary phrase and a component phrase derived from the training data, use the longer of the two, or both if they are the same length
|
2017-02-17 03:00:48 -05:00 |
|
Al
|
5b616dfb57
|
[addresses] allowing neighborhood components to be passed in
|
2017-02-17 02:11:56 -05:00 |
|
Al
|
e7d8577ad7
|
[openaddresses] add city of San Luis Obispo
|
2017-02-16 16:00:23 -05:00 |
|
Al
|
d6281648dc
|
[openaddresses] add Cumberland County, NC
|
2017-02-16 14:49:00 -05:00 |
|
Al
|
1631c25ad0
|
[openaddresses] add city of O'Fallon, IL
|
2017-02-16 14:48:40 -05:00 |
|
Al
|
4c4147f465
|
[openaddresses] add city of Scottsdale, AZ
|
2017-02-16 14:48:17 -05:00 |
|
Al
|
df76cde1e7
|
[openaddresses] adding Pickens County, SC
|
2017-02-16 03:34:49 -05:00 |
|
Al
|
c380b3e91b
|
[parser] phrase search with address dictionaries should not use the language given at training time since it's not currently available at runtime (without pulling in the language classifier, which may be warranted at some point, especially if the model can be made smaller/sparser)
|
2017-02-15 22:32:30 -05:00 |
|
Al
|
a3e51db32d
|
[api] include some of the new components in default address_components for the libpostal expansion API
|
2017-02-15 22:29:22 -05:00 |
|
Al
|
32fb483e96
|
[gazetteers] adding ADDRESS_PO_BOX component
|
2017-02-15 22:23:28 -05:00 |
|
Al
|
ba0ccc82a3
|
[fix] var name in address_parser_train
|
2017-02-15 22:22:33 -05:00 |
|
Al
|
0196fe8736
|
[utils] fixing key_type in hash_get, adding int64_double map
|
2017-02-15 22:20:36 -05:00 |
|
Al
|
be6f48f109
|
[fix] that didn't work, set log level to CRITICAL
|
2017-02-15 14:06:57 -05:00 |
|
Al
|
26bf617a06
|
[fix] prevent Shapely from logging to console
|
2017-02-15 14:00:51 -05:00 |
|
Al
|
a0b508caf6
|
[transliteration] adding no-args option for transliteration_rules script
|
2017-02-15 13:22:33 -05:00 |
|
Al
|
8abfa766fd
|
[fix] paren
|
2017-02-15 02:26:18 -05:00 |
|
Al
|
06003dfbb0
|
[fix] lower probability of name:prefix
|
2017-02-14 18:57:31 -05:00 |
|
Al
|
92b34f6af4
|
[fix] var name
|
2017-02-14 18:53:53 -05:00 |
|
Al
|
ca79342636
|
[fix] config
|
2017-02-14 18:50:51 -05:00 |
|
Al
|
8eafc5730b
|
[parser] adding long-context features which help classify the first token in the string by finding the relative positions of a) the first numeric token and b) the first street-level phrase like "Ave" or "Calle"
|
2017-02-14 18:42:51 -05:00 |
|
Al
|
08976c772e
|
[neighborhoods] base parser config changes for new prefix/first_match options
|
2017-02-14 18:19:15 -05:00 |
|