Matthew Honnibal
a6a2159969
Add slot for text categories to Doc
2017-07-22 00:34:15 +02:00
Matthew Honnibal
6782eedf9b
Tmp GPU code
2017-05-07 11:04:24 -05:00
Matthew Honnibal
5d5742b773
Add sentiment field to doc, rename getters_for_tokens and getters_for_spans, add user_hooks field to Doc.
2016-10-19 20:54:22 +02:00
Matthew Honnibal
fbb7f3f15c
Add user_data attribute to Doc object.
2016-10-17 11:43:22 +02:00
Matthew Honnibal
ae11ea8240
Add getters_for_tokens and getters_for_spans attributes to Doc object.
2016-10-17 02:42:05 +02:00
Matthew Honnibal
f3be9d0a9a
Add tensor field to Lexeme, Token, Doc and Span, so that users have a place to hang neural network outputs
2016-10-14 03:24:13 +02:00
Matthew Honnibal
276fbe9996
* Fix assignment of iterator on Doc object
2016-05-02 15:26:24 +02:00
Wolfgang Seeker
5e2e8e951a
add baseclass DocIterator for iterators over documents
...
add classes for English and German noun chunks
the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00
Matthew Honnibal
6bb007d16e
* Make set_parse nogil
2016-01-30 20:27:52 +01:00
Matthew Honnibal
56499d89ef
* Rework the Span-merge patch, to avoid extending the interface of Doc, and avoid virtualizing the Span.start and Span.end indices, to keep Span usage efficient
2015-11-07 08:55:34 +11:00
Matthew Honnibal
68f479e821
* Rename Doc.data to Doc.c
2015-11-04 00:15:14 +11:00
Matthew Honnibal
77856c4fcd
* Try giving Doc and Span objects vector and vector_norm attributes, and .similarity functions. Turns out to be bad idea.
2015-09-17 11:50:11 +10:00
Matthew Honnibal
c2307fa9ee
* More work on language-generic parsing
2015-08-28 02:02:33 +02:00
Matthew Honnibal
9c1724ecae
* Gazetteer stuff working, now need to wire up to API
2015-08-06 00:35:40 +02:00
Matthew Honnibal
6609fcf4b2
* Make mem and vocab python-visible in Doc
2015-07-28 20:46:59 +02:00
Matthew Honnibal
8214b74eec
* Restore _py_tokens cache, to handle orphan tokens.
2015-07-13 22:28:10 +02:00
Matthew Honnibal
67641f3b58
* Refactor tokenizer, to set the 'spacy' field on TokenC instead of passing a string
2015-07-13 21:46:02 +02:00
Matthew Honnibal
6eef0bf9ab
* Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx
2015-07-13 20:20:58 +02:00
Matthew Honnibal
3ea8756c24
* Add spacy/tokens/doc.pyx, for Doc class in its own file
2015-07-13 19:58:26 +02:00