mirror of https://github.com/explosion/spaCy.git (synced 2025-06-05 21:53:05 +03:00)
* Update docs for v0.80
parent 3faaad0271
commit 5ce51ce8d6
@@ -5,29 +5,32 @@ API
 
 .. autoclass:: spacy.en.English
 
-+-----------+----------------------------------------+-------------+--------------------------+
-| Attribute | Type                                   | Attr API    | NoteS                    |
-+===========+========================================+=============+==========================+
-| strings   | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
-+-----------+----------------------------------------+-------------+--------------------------+
-| vocab     | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
-+-----------+----------------------------------------+-------------+--------------------------+
-| tokenizer | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
-+-----------+----------------------------------------+-------------+--------------------------+
-| tagger    | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
-+-----------+----------------------------------------+-------------+--------------------------+
-| parser    | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
-+-----------+----------------------------------------+-------------+--------------------------+
++------------+----------------------------------------+-------------+--------------------------+
+| Attribute  | Type                                   | Attr API    | Notes                    |
++============+========================================+=============+==========================+
+| strings    | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
++------------+----------------------------------------+-------------+--------------------------+
+| vocab      | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
++------------+----------------------------------------+-------------+--------------------------+
+| tokenizer  | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
++------------+----------------------------------------+-------------+--------------------------+
+| tagger     | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
++------------+----------------------------------------+-------------+--------------------------+
+| parser     | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
++------------+----------------------------------------+-------------+--------------------------+
+| entity     | :py:class:`syntax.parser.GreedyParser` | __call__    | Set entities on Tokens   |
++------------+----------------------------------------+-------------+--------------------------+
+| mwe_merger | :py:class:`multi_words.RegexMerger`    | __call__    | Apply regex for units    |
++------------+----------------------------------------+-------------+--------------------------+
 
 
 .. automethod:: spacy.en.English.__call__
 
 
 .. autoclass:: spacy.tokens.Tokens
-:members:
 
 +---------------+-------------+-------------+
-| Attribute     | Type        | Useful      |
+| Attribute     | Type        | Attr API    |
 +===============+=============+=============+
 | vocab         | Vocab       | __getitem__ |
 +---------------+-------------+-------------+
@@ -47,8 +50,6 @@ API
 access is required, and you need slightly better performance. However, this
 is both slower and has a worse API than Cython access.
 
-.. Once a Token object has been created, it is persisted internally in Tokens._py_tokens.
-
 
 .. autoclass:: spacy.tokens.Token
 
@@ -192,6 +193,15 @@ API
 An iterator for the part of the sentence syntactically governed by the
 word, including the word itself.
 
+
+**Named Entities**
+
+ent_type
+If the token is part of an entity, its entity type
+
+ent_iob
+The IOB (inside, outside, begin) entity recognition tag for the token
+
 .. py:class:: vocab.Vocab(self, data_dir=None, lex_props_getter=None)
 
 .. py:method:: __len__(self) --> int
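The ``ent_iob`` and ``ent_type`` attributes added in the hunk above describe entities one token at a time. As a rough illustration of how such per-token tags can be grouped into entity spans, here is a minimal sketch. It assumes the tags are plain strings ("I", "O", "B") paired with a type string; the helper name ``iob_to_spans`` and the input format are hypothetical and not part of spaCy, whose own attributes may be encoded differently.

```python
from typing import List, Tuple


def iob_to_spans(tags: List[Tuple[str, str]]) -> List[Tuple[int, int, str]]:
    """Group per-token (iob, ent_type) pairs into (start, end, label) spans.

    `end` is exclusive. This is an illustrative helper, not a spaCy function.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    label = ""
    for i, (iob, ent_type) in enumerate(tags):
        # Close the open span when we step outside it, when a new span
        # begins, or when the entity type changes mid-span.
        if start is not None and (iob in ("O", "B") or ent_type != label):
            spans.append((start, i, label))
            start = None
        # Open a span on "B", or on a stray "I" with no span open.
        if iob == "B" or (iob == "I" and start is None):
            start, label = i, ent_type
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


# Hypothetical tags for five tokens: a two-token PERSON, then two adjacent GPEs.
tags = [("B", "PERSON"), ("I", "PERSON"), ("O", ""), ("B", "GPE"), ("B", "GPE")]
spans = iob_to_spans(tags)
```

The "B" tag is what lets two adjacent entities of the same type stay separate, which a plain inside/outside scheme could not express.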
@@ -68,7 +68,7 @@ a convenient API:
 
 spaCy maps all strings to sequential integer IDs --- a common trick in NLP.
 If an attribute `Token.foo` is an integer ID, then `Token.foo_` is the string,
-e.g. `pizza.orth_` and `pizza.orth` provide the integer ID and the string of
+e.g. `pizza.orth` and `pizza.orth_` provide the integer ID and the string of
 the original orthographic form of the word.
 
 .. note:: en.English.__call__ is stateful --- it has an important **side-effect**.
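The hunk above documents the convention that ``Token.foo`` is an integer ID and ``Token.foo_`` is the corresponding string. The underlying idea, a two-way mapping between strings and sequential integers, can be sketched in a few lines. The class below is a hypothetical stand-in written for illustration only; it is not spaCy's ``StringStore``, which may assign IDs differently (for example by reserving some values).

```python
class StringIds:
    """Toy two-way string <-> int mapping (illustrative, not spaCy's StringStore)."""

    def __init__(self):
        self._ids = {}       # string -> integer ID
        self._strings = []   # integer ID -> string

    def __getitem__(self, key):
        # A string key returns its ID, assigning the next sequential ID
        # the first time the string is seen.
        if isinstance(key, str):
            if key not in self._ids:
                self._ids[key] = len(self._strings)
                self._strings.append(key)
            return self._ids[key]
        # An integer key returns the string it was assigned to.
        return self._strings[key]

    def __len__(self):
        return len(self._strings)


store = StringIds()
pizza_orth = store["pizza"]       # integer ID, analogous to `pizza.orth`
pizza_orth_ = store[pizza_orth]   # string form, analogous to `pizza.orth_`
```

Comparing integer IDs is cheaper than comparing strings, which is one reason this trick is common in NLP code.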
@@ -88,7 +88,7 @@ the original orthographic form of the word.
 
 .. py:class:: spacy.en.English(self, data_dir=join(dirname(__file__), 'data'))
 
-.. py:method:: __call__(self, text: unicode, tag=True, parse=False) --> Tokens
+.. py:method:: __call__(self, text: unicode, tag=True, parse=True, entity=True, merge_mwes=False) --> Tokens
 
 +-----------------+--------------+--------------+
 | Attribute       | Type         | Its API      |
@@ -103,6 +103,8 @@ the original orthographic form of the word.
 +-----------------+--------------+--------------+
 | parser          | GreedyParser | __call__     |
 +-----------------+--------------+--------------+
+| entity          | GreedyParser | __call__     |
++-----------------+--------------+--------------+
 
 **Get dict or numpy array:**
 
@@ -116,6 +118,16 @@ the original orthographic form of the word.
 
 .. py:method:: tokens.Tokens.__iter__(self) --> Iterator[Token]
 
+**Get sentence or named entity spans**
+
+.. py:attribute:: tokens.Tokens.sents --> Iterator[Span]
+
+.. py:attribute:: tokens.Tokens.ents --> Iterator[Span]
+
+You can iterate over a Span to access individual Tokens, or access its
+start, end or label.
+
+
 **Embedded word representenations**
 
 .. py:attribute:: tokens.Token.repvec
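The hunk above says a Span can be iterated to reach individual tokens and exposes a start, end and label. A minimal sketch of an object with that shape follows. ``SpanSketch`` is a hypothetical name, plain strings stand in for Token objects, and ``end`` is assumed to be exclusive; none of this is taken from spaCy's actual Span implementation.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class SpanSketch:
    """Illustrative view over a slice of a token list (not spaCy's Span)."""

    tokens: List[str]  # the full token sequence the span points into
    start: int         # index of the first token in the span
    end: int           # index one past the last token (assumed exclusive)
    label: str = ""    # e.g. an entity type; empty for sentence spans

    def __iter__(self) -> Iterator[str]:
        # Iterating the span yields only the tokens it covers.
        return iter(self.tokens[self.start:self.end])


words = ["London", "is", "big", "."]
ent = SpanSketch(words, 0, 1, "GPE")
ent_words = list(ent)
```

Storing offsets into the shared token list, rather than copying tokens, keeps a span cheap to create and makes its position in the document recoverable.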
@@ -150,7 +162,6 @@ the original orthographic form of the word.
 Starting offset of word in the original string.
 
 
-
 Features
 --------
 