Matthew Honnibal
|
b8d34531c4
|
* Add support for units to English.__init__, by loading and applying regular expressions
|
2015-04-07 04:02:32 +02:00 |
|
Matthew Honnibal
|
2fee67cfa3
|
* Add regular expressions for English multi-word expressions
|
2015-04-07 03:45:18 +02:00 |
|
Matthew Honnibal
|
567388e38d
|
* Use values encoded by StringStore in POS tagging, rather than indices into a list of tags
|
2015-03-26 16:44:45 +01:00 |
|
Matthew Honnibal
|
801bf14f4f
|
* Clean up handling of dep_strings and ent_strings, using StringStore to encode the label names.
|
2015-03-26 16:44:45 +01:00 |
|
Matthew Honnibal
|
f21ab2d7fb
|
* Fix bug in ugly ent_strings hack on English class
|
2015-03-26 16:44:45 +01:00 |
|
Matthew Honnibal
|
8057a95f20
|
* NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only uses the current word and 2 words of context. Needs refactoring.
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
220ce8bfed
|
* Prepare English class for NER
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
179b7eb0a7
|
* Specify parser transition system in language
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
8cc3524dc9
|
* Ws
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
2e8d0e5d45
|
* Upd download script
|
2015-03-03 05:47:16 -05:00 |
|
Matthew Honnibal
|
caf046b220
|
* Hastily add method to apply tags from a list of strings, instead of predicting the tags.
|
2015-02-23 15:40:17 -05:00 |
|
Matthew Honnibal
|
64645a1c2f
|
* Improve docstring on English
|
2015-02-11 15:13:20 -05:00 |
|
Matthew Honnibal
|
594e50bd45
|
* Add option to download speech-parsing data set.
|
2015-02-11 14:20:29 -05:00 |
|
Matthew Honnibal
|
0b7e769211
|
* Add POS tags to support SWBD tag set
|
2015-02-11 14:08:28 -05:00 |
|
Matthew Honnibal
|
312b3a45f3
|
* Fix issue #19: Allow parsing/pos tagging of empty strings
|
2015-02-10 10:15:58 -05:00 |
|
Matthew Honnibal
|
2a0615104b
|
* Upd download script
|
2015-02-09 10:22:59 -05:00 |
|
Matthew Honnibal
|
5c3513583d
|
* Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens.
|
2015-02-09 03:57:10 -05:00 |
|
Matthew Honnibal
|
be5536d239
|
* Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON.
|
2015-02-08 18:36:18 -05:00 |
|
Matthew Honnibal
|
44c7eafe44
|
* Fix download.py
|
2015-02-07 12:00:36 -05:00 |
|
Matthew Honnibal
|
6ca7f2eedc
|
* Upd download script
|
2015-02-07 11:32:33 -05:00 |
|
Matthew Honnibal
|
56c2ef2982
|
* Tweak POS features for web text
|
2015-02-02 11:59:36 +11:00 |
|
Matthew Honnibal
|
a20fdbd8ee
|
* Upd download script
|
2015-02-01 13:22:23 +11:00 |
|
Matthew Honnibal
|
63abdf154c
|
* Hastily hack download file
|
2015-01-31 22:48:32 +11:00 |
|
Matthew Honnibal
|
a1ed574b7b
|
* Fix default model path for English
|
2015-01-31 16:38:27 +11:00 |
|
Matthew Honnibal
|
e013555b25
|
* Add option to download script
|
2015-01-31 13:51:56 +11:00 |
|
Matthew Honnibal
|
024cfd485c
|
* Pass tag_strings as a tuple, to support new Tokens API
|
2015-01-31 13:43:37 +11:00 |
|
Matthew Honnibal
|
83a4df5a1a
|
* Fix download script
|
2015-01-30 20:40:42 +11:00 |
|
Matthew Honnibal
|
6f9ebc2f34
|
* Fix download script
|
2015-01-30 20:33:19 +11:00 |
|
Matthew Honnibal
|
8b85d0bb8a
|
* Only download small data if no data dir exists
|
2015-01-30 20:27:14 +11:00 |
|
Matthew Honnibal
|
cb95ef6934
|
* Fix download script
|
2015-01-30 19:28:43 +11:00 |
|
Matthew Honnibal
|
e578bd37bd
|
* Fix download script
|
2015-01-30 18:59:31 +11:00 |
|
Matthew Honnibal
|
df52014d12
|
* Fix download script
|
2015-01-30 18:36:24 +11:00 |
|
Matthew Honnibal
|
998b607f65
|
* Upd download script, having it download all data if there's no data/ directory, allowing easier compilation from source
|
2015-01-30 18:04:01 +11:00 |
|
Matthew Honnibal
|
67d6e53a69
|
* Ensure parser and tagger function correctly when training from missing values, indicated by -1
|
2015-01-30 14:08:56 +11:00 |
|
Matthew Honnibal
|
c38c62d4a3
|
* Add docstring to English class
|
2015-01-27 02:45:21 +11:00 |
|
Matthew Honnibal
|
7f87716cf7
|
* Fix download script
|
2015-01-25 23:01:10 +11:00 |
|
Matthew Honnibal
|
12b034e3ef
|
* Move POS tag definitions to parts_of_speech.pxd
|
2015-01-25 16:31:07 +11:00 |
|
Matthew Honnibal
|
7431c133d8
|
* Add error if trying to access head when not is_parsed
|
2015-01-25 15:33:54 +11:00 |
|
Matthew Honnibal
|
951d06c824
|
* Silently don't parse if data is not present
|
2015-01-25 14:47:38 +11:00 |
|
Matthew Honnibal
|
4e857ab7a6
|
* Fix bug in POS tagger feature
|
2015-01-25 02:20:15 +11:00 |
|
Matthew Honnibal
|
dd56e298e2
|
* Ensure tagging is applied if parse=True
|
2015-01-25 02:19:44 +11:00 |
|
Matthew Honnibal
|
94750819cd
|
* Set parse=True by default --- i.e. parse unless told not to.
|
2015-01-25 01:28:28 +11:00 |
|
Matthew Honnibal
|
a97bed9359
|
* Fix POS and dependency label tag names. Add parse and string navigation functions.
|
2015-01-24 17:29:04 +11:00 |
|
Matthew Honnibal
|
fda94271af
|
* Rename NORM1 and NORM2 attrs to lower and norm
|
2015-01-24 06:17:03 +11:00 |
|
Matthew Honnibal
|
5ed8b2b98f
|
* Rename sic to orth
|
2015-01-23 02:08:25 +11:00 |
|
Matthew Honnibal
|
f2a229136c
|
* Fix data_dir=None argument to English class
|
2015-01-21 18:27:31 +11:00 |
|
Matthew Honnibal
|
ef49b8c179
|
* Add stop-word flag
|
2015-01-21 18:22:31 +11:00 |
|
Matthew Honnibal
|
6646bfc5df
|
* Add LOWER attr
|
2015-01-21 18:19:08 +11:00 |
|
Matthew Honnibal
|
6c7e44140b
|
* Work on word vectors, and other stuff
|
2015-01-17 16:21:17 +11:00 |
|
Matthew Honnibal
|
7d3c40de7d
|
* Tests passing after refactor. API has obvious warts, particularly in Token and Lexeme
|
2015-01-15 00:33:16 +11:00 |
|