Mirror of https://github.com/explosion/spaCy.git
* Update docs for v0.80
parent 3faaad0271
commit 5ce51ce8d6
@ -5,29 +5,32 @@ API

.. autoclass:: spacy.en.English

+-----------+----------------------------------------+-------------+--------------------------+
| Attribute | Type                                   | Attr API    | Notes                    |
+===========+========================================+=============+==========================+
| strings   | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
+-----------+----------------------------------------+-------------+--------------------------+
| vocab     | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
+-----------+----------------------------------------+-------------+--------------------------+
| tokenizer | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
+-----------+----------------------------------------+-------------+--------------------------+
| tagger    | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
+-----------+----------------------------------------+-------------+--------------------------+
| parser    | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
+-----------+----------------------------------------+-------------+--------------------------+
+------------+----------------------------------------+-------------+--------------------------+
| Attribute  | Type                                   | Attr API    | Notes                    |
+============+========================================+=============+==========================+
| strings    | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
+------------+----------------------------------------+-------------+--------------------------+
| vocab      | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
+------------+----------------------------------------+-------------+--------------------------+
| tokenizer  | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
+------------+----------------------------------------+-------------+--------------------------+
| tagger     | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
+------------+----------------------------------------+-------------+--------------------------+
| parser     | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
+------------+----------------------------------------+-------------+--------------------------+
| entity     | :py:class:`syntax.parser.GreedyParser` | __call__    | Set entities on Tokens   |
+------------+----------------------------------------+-------------+--------------------------+
| mwe_merger | :py:class:`multi_words.RegexMerger`    | __call__    | Apply regex for units    |
+------------+----------------------------------------+-------------+--------------------------+


.. automethod:: spacy.en.English.__call__


.. autoclass:: spacy.tokens.Tokens
  :members:

+---------------+-------------+-------------+
| Attribute     | Type        | Useful      |
| Attribute     | Type        | Attr API    |
+===============+=============+=============+
| vocab         | Vocab       | __getitem__ |
+---------------+-------------+-------------+
@ -47,8 +50,6 @@ API
access is required, and you need slightly better performance. However, this
is both slower and has a worse API than Cython access.

.. Once a Token object has been created, it is persisted internally in Tokens._py_tokens.


.. autoclass:: spacy.tokens.Token

@ -192,6 +193,15 @@ API
An iterator for the part of the sentence syntactically governed by the
word, including the word itself.


**Named Entities**

ent_type
  If the token is part of an entity, its entity type

ent_iob
  The IOB (inside, outside, begin) entity recognition tag for the token
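
To illustrate how per-token IOB tags relate to entity spans, here is a minimal sketch in plain Python. It does not use spaCy: the helper name ``iob_to_spans``, the ``(iob, ent_type)`` input pairs, and the labels ``PERSON`` and ``GPE`` are assumptions made for this example only. It assumes ``B`` begins an entity, ``I`` continues one, and ``O`` marks a token outside any entity.

```python
def iob_to_spans(tags):
    """Group per-token (iob, ent_type) pairs into (start, end, label) spans.

    Hypothetical helper for illustration only; not part of spaCy.
    `end` is exclusive, so tokens[start:end] covers the entity.
    """
    spans = []
    start = None   # index where the currently open entity began
    label = None   # entity type of the currently open entity
    for i, (iob, ent_type) in enumerate(tags):
        # "B" always opens an entity. A stray "I" (nothing open, or the
        # type changed) is treated as an opener as well.
        opens = iob == "B" or (
            iob == "I" and (start is None or ent_type != label)
        )
        # Close the open entity when a new one starts or we step outside.
        if (opens or iob == "O") and start is not None:
            spans.append((start, i, label))
            start, label = None, None
        if opens:
            start, label = i, ent_type
    # Close an entity that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


tokens = ["Ms", "Ada", "Lovelace", "visited", "London", "."]
tags = [("O", ""), ("B", "PERSON"), ("I", "PERSON"),
        ("O", ""), ("B", "GPE"), ("O", "")]
spans = iob_to_spans(tags)
for start, end, label in spans:
    print(label, tokens[start:end])
```

Each resulting tuple carries a start, an end and a label, the same three pieces of information the text attributes to a span.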

.. py:class:: vocab.Vocab(self, data_dir=None, lex_props_getter=None)

  .. py:method:: __len__(self) --> int

@ -68,7 +68,7 @@ a convenient API:

spaCy maps all strings to sequential integer IDs --- a common trick in NLP.
If an attribute `Token.foo` is an integer ID, then `Token.foo_` is the string,
e.g. `pizza.orth_` and `pizza.orth` provide the integer ID and the string of
e.g. `pizza.orth` and `pizza.orth_` provide the integer ID and the string of
the original orthographic form of the word.
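
The ``foo`` / ``foo_`` convention can be sketched with a toy two-way table. This is only an illustration under stated assumptions: ``ToyStringStore`` and ``ToyToken`` are hypothetical stand-ins, not spaCy's ``StringStore`` or ``Token``, and starting the IDs at 0 is an arbitrary choice of the sketch.

```python
class ToyStringStore:
    """Toy string <-> int table (illustration only, not spaCy's StringStore)."""

    def __init__(self):
        self._id_of = {}     # string -> integer ID
        self._strings = []   # integer ID -> string

    def __getitem__(self, key):
        # An integer key looks up the string; a string key looks up
        # (or assigns) the next sequential integer ID.
        if isinstance(key, int):
            return self._strings[key]
        if key not in self._id_of:
            self._id_of[key] = len(self._strings)
            self._strings.append(key)
        return self._id_of[key]


class ToyToken:
    """Toy token storing only an integer ID for its orthographic form."""

    def __init__(self, store, text):
        self._store = store
        self.orth = store[text]           # integer ID

    @property
    def orth_(self):
        return self._store[self.orth]     # string, decoded on demand


store = ToyStringStore()
pizza = ToyToken(store, "pizza")
pasta = ToyToken(store, "pasta")
print(pizza.orth, pizza.orth_)
print(pasta.orth, pasta.orth_)
```

Keeping only the integer on the token and decoding the string on demand is what makes the paired attribute names useful: comparisons can work on integers, while the underscore form is there for display.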

.. note:: en.English.__call__ is stateful --- it has an important **side-effect**.
@ -88,7 +88,7 @@ the original orthographic form of the word.

.. py:class:: spacy.en.English(self, data_dir=join(dirname(__file__), 'data'))

  .. py:method:: __call__(self, text: unicode, tag=True, parse=False) --> Tokens
  .. py:method:: __call__(self, text: unicode, tag=True, parse=True, entity=True, merge_mwes=False) --> Tokens

+-----------------+--------------+--------------+
| Attribute       | Type         | Its API      |
@ -103,6 +103,8 @@ the original orthographic form of the word.
+-----------------+--------------+--------------+
| parser          | GreedyParser | __call__     |
+-----------------+--------------+--------------+
| entity          | GreedyParser | __call__     |
+-----------------+--------------+--------------+

**Get dict or numpy array:**

@ -116,6 +118,16 @@ the original orthographic form of the word.

.. py:method:: tokens.Tokens.__iter__(self) --> Iterator[Token]

**Get sentence or named entity spans**

.. py:attribute:: tokens.Tokens.sents --> Iterator[Span]

.. py:attribute:: tokens.Tokens.ents --> Iterator[Span]

You can iterate over a Span to access individual Tokens, or access its
start, end or label.


**Embedded word representations**

.. py:attribute:: tokens.Token.repvec
@ -150,7 +162,6 @@ the original orthographic form of the word.
Starting offset of word in the original string.



Features
--------
