Matthew Honnibal
|
6dd3b94fa6
|
Filter out deprecated attributes when reading special-case tokenization rules.
|
2016-11-25 09:57:18 -06:00 |
|
Matthew Honnibal
|
a335c6dcc2
|
Exclude morphs from deprecated token attributes for now
|
2016-11-25 16:17:32 +01:00 |
|
Matthew Honnibal
|
846e80f2f4
|
Exclude morphs from deprecated token attributes for now
|
2016-11-25 16:14:54 +01:00 |
|
Matthew Honnibal
|
53d8ca8f51
|
Add spacy.attrs.intify_attrs function, to normalize strings in token attribute dictionaries.
|
2016-11-25 11:34:30 +01:00 |
|
Wolfgang Seeker
|
03fb498dbe
|
introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
|
2016-03-10 13:01:34 +01:00 |
|
Matthew Honnibal
|
c4017a06d9
|
* Add placeholders for the new flags in attrs and symbols
|
2016-02-04 15:49:45 +01:00 |
|
Matthew Honnibal
|
22bd0095f5
|
* Map empty string to NULL_ATTR in attrs
|
2015-10-10 22:10:19 +11:00 |
|
Matthew Honnibal
|
94bafc1417
|
* Rename ATTR_IDS to attrs.IDS. Rename ATTR_NAMES to attrs.NAMES. Rename UNIV_POS_IDS to parts_of_speech.IDS
|
2015-10-10 17:57:29 +11:00 |
|
Matthew Honnibal
|
064bd69ad0
|
* Refactor symbols, so that frequency rank can be derived from the orth id of a word.
|
2015-10-10 16:03:48 +11:00 |
|
Matthew Honnibal
|
44f39a876f
|
* Add a blank attrs.pyx
|
2015-07-17 16:40:42 +02:00 |
|