* Edit API docs

This commit is contained in:
Matthew Honnibal 2015-01-24 20:49:44 +11:00
parent 71b95202eb
commit 83a7e91f3c


Documentation
===============
Quick Ref
---------
.. py:currentmodule:: spacy

.. class:: en.English(self, data_dir=join(dirname(__file__), 'data'))
   :noindex:

.. method:: __call__(self, unicode text, tag=True, parse=False) --> Tokens

+-----------+----------------------------------------+-------------+--------------------------+
| Attribute | Type                                   | Attr API    | Notes                    |
+===========+========================================+=============+==========================+
| strings   | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
+-----------+----------------------------------------+-------------+--------------------------+
| vocab     | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
+-----------+----------------------------------------+-------------+--------------------------+
| tokenizer | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
+-----------+----------------------------------------+-------------+--------------------------+
| tagger    | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
+-----------+----------------------------------------+-------------+--------------------------+
| parser    | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
+-----------+----------------------------------------+-------------+--------------------------+

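The pipeline pattern the table describes can be sketched in plain Python. This is a hypothetical stand-in, not spaCy's actual implementation: a tokenizer produces tokens, then optional tagger and parser stages annotate them in place, mirroring the ``__call__(text, tag=True, parse=False)`` signature above.

```python
# Hypothetical stand-in for the spacy.en.English pipeline (illustrative
# only): tokenize, then optionally tag and parse the resulting tokens.
class MiniPipeline:
    def __init__(self, tokenizer, tagger, parser):
        self.tokenizer = tokenizer
        self.tagger = tagger
        self.parser = parser

    def __call__(self, text, tag=True, parse=False):
        tokens = self.tokenizer(text)
        if tag:
            self.tagger(tokens)   # annotates tokens in place
        if parse:
            self.parser(tokens)   # annotates tokens in place
        return tokens

# Toy stages: tokens are dicts; tagger and parser add keys in place.
nlp = MiniPipeline(
    tokenizer=lambda text: [{"orth_": word} for word in text.split()],
    tagger=lambda tokens: [tok.update(pos_="X") for tok in tokens],
    parser=lambda tokens: [tok.update(dep_="dep") for tok in tokens],
)
doc = nlp("Hello world")  # tag=True, parse=False by default
```

Because the stages are plain callables held as attributes, swapping in a different tagger or language is a matter of constructing the pipeline with different components.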
.. py:class:: spacy.tokens.Tokens(self, vocab: Vocab, string_length=0)
.. py:attribute:: head: Token
.. py:method:: check_flag(self, attr_id: int) --> bool

+-----------+------+-----------+---------+-----------+------------------------------------+
| Attribute | Type | Attribute | Type    | Attribute | Type                               |
+===========+======+===========+=========+===========+====================================+
| orth      | int  | orth\_    | unicode | idx       | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| lemma     | int  | lemma\_   | unicode | cluster   | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| lower     | int  | lower\_   | unicode | length    | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| norm      | int  | norm\_    | unicode | prob      | float                              |
+-----------+------+-----------+---------+-----------+------------------------------------+
| shape     | int  | shape\_   | unicode | repvec    | ndarray(shape=(300,), dtype=float) |
+-----------+------+-----------+---------+-----------+------------------------------------+
| prefix    | int  | prefix\_  | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| suffix    | int  | suffix\_  | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| pos       | int  | pos\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| tag       | int  | tag\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| dep       | int  | dep\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+

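The table follows a consistent convention: each integer-ID attribute (``orth``, ``lemma``) is paired with a trailing-underscore variant (``orth_``, ``lemma_``) that gives the unicode string the ID encodes. A hypothetical stand-in (not spaCy's implementation) shows the idea for a single pair:

```python
# Illustrative sketch of the int / unicode attribute pairing: `orth` is
# an integer ID into a shared string table; `orth_` is its string view.
class MiniToken:
    def __init__(self, strings, orth):
        self._strings = strings  # shared id -> string table
        self.orth = orth         # integer-valued attribute

    @property
    def orth_(self):
        # String view of the same underlying attribute.
        return self._strings[self.orth]

strings = ["hello", "world"]
token = MiniToken(strings, orth=1)
```

Keeping the canonical value as an integer makes comparisons and feature extraction cheap; the string form is computed on demand.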
.. py:class:: spacy.vocab.Vocab(self, data_dir=None, lex_props_getter=None)
.. py:method:: __getitem__(self, string: unicode) --> int
.. py:method:: __setitem__(self, py_str: unicode, props: Dict[str, int|float]) --> None
.. py:method:: dump(self, loc: unicode) --> None
.. py:method:: __getitem__(self, id: int) --> unicode

.. py:method:: __getitem__(self, string: bytes) --> id

.. py:method:: __getitem__(self, string: unicode) --> id
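The overloaded ``__getitem__`` signatures above describe a bidirectional mapping: indexing with an integer returns the string, while indexing with a string (bytes or unicode) returns its integer ID. A minimal sketch of that behaviour, assuming strings are interned on first lookup (this is an illustration, not spaCy's actual implementation):

```python
# Minimal StringStore-like sketch: store[int] -> unicode string, and
# store[str or bytes] -> int ID, interning unseen strings on first use.
class MiniStringStore:
    def __init__(self):
        self._strings = []  # id -> string
        self._ids = {}      # string -> id

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._strings[key]
        if isinstance(key, bytes):
            key = key.decode("utf8")
        if key not in self._ids:
            # Intern: assign the next free ID to a new string.
            self._ids[key] = len(self._strings)
            self._strings.append(key)
        return self._ids[key]

store = MiniStringStore()
apple_id = store["apple"]  # interned on first sight
```

A round trip through the store is the invariant to remember: ``store[store[s]] == s`` for any interned string ``s``.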
.. py:method:: __call__(self, tokens: spacy.tokens.Tokens) --> None

.. py:method:: train(self, tokens: spacy.tokens.Tokens) --> None
spaCy is designed to extend easily to multiple languages, although at present
only English components are implemented. The components are organised into
a pipeline by the spacy.en.English class.

Usually you will want to create a single spacy.en.English object and pass it
around your application. It manages the string-to-integer mapping, and you
will usually want a single mapping shared across all strings.
English Pipeline
----------------
The spacy.en package exports a single class, English, and several constants
under spacy.en.defs.
.. autoclass:: spacy.en.English
:members:
.. automodule:: spacy.en.pos
:members:
.. automodule:: spacy.en.attrs
:members:
:undoc-members:
Tokens
------
.. autoclass:: spacy.tokens.Tokens
:members:
.. autoclass:: spacy.tokens.Token
:members:
.. autoclass:: spacy.lexeme.Lexeme
:members:
Lexicon
-------
.. automodule:: spacy.vocab
:members:
.. automodule:: spacy.strings
:members:
Tokenizer
---------
.. automodule:: spacy.tokenizer
:members:
Parser
------
.. automodule:: spacy.syntax.parser
:members:
Utility Functions
-----------------
.. automodule:: spacy.orth
:members: