* Update docs

2025-07-15 18:52:29 +03:00 · 2014-12-30 21:20:34 +11:00 · 2014-12-30 21:20:34 +11:00 · cdc1a27104
commit cdc1a27104
parent bb0b00f819
9 changed files with 6 additions and 188 deletions
--- a/docs/source/api.rst
+++ b/docs/source/api.rst
@@ -23,6 +23,10 @@ under spacy.en.defs.
 .. autommodule:: spacy.en.pos
   :members:

+.. automodule:: spacy.en.attrs
+   :members:
+   :undoc-members:
+
 The Tokens Class
 ----------------

--- a/docs/source/how/api/index.rst
+++ b/docs/source/how/api/index.rst
@@ -1,8 +0,0 @@
-API
-===
-
-.. toctree::
-    :maxdepth: 2
-
-    tokenizers/index.rst
-    lexicon.rst
--- a/docs/source/how/api/lexicon.rst
+++ b/docs/source/how/api/lexicon.rst
@@ -1,6 +0,0 @@
-spacy.word.Lexeme
-=================
-
-
-.. autoclass:: spacy.word.Lexeme
-    :members:
--- a/docs/source/how/api/tokenizers/en.rst
+++ b/docs/source/how/api/tokenizers/en.rst
@@ -1,94 +0,0 @@
-spacy.en.EN
-============
-
-.. automodule:: spacy.en
-
-Tokenizer API
-------------
-
-.. automethod:: spacy.en.EN.tokenize
-    :noindex:
-
-.. automethod:: spacy.en.EN.lookup
-    :noindex:
-
-Lexeme Features Flag IDs
------------------------
-
-A number of boolean features are computed for English Lexemes. To access a feature,
-pass its ID to the :py:meth:`spacy.word.Lexeme.check_flag` function.
-
-Orthographic Features
---------------------
-
-These features describe the `orthographic` (lettering) type of the word. The
-function used to compute the value is listed along with the flag.
-
-.. data:: IS_ALPHA
-
-    :py:func:`spacy.orth.is_alpha`
-
-.. data:: IS_DIGIT
-    
-    :py:func:`spacy.orth.is_digit`
-    
-.. data:: IS_UPPER
-
-    :py:func:`spacy.orth.is_upper`
-
-.. data:: IS_PUNCT
-
-    :py:func:`spacy.orth.is_punct`
-
-.. data:: IS_SPACE
-
-    :py:func:`spacy.orth.is_space`
-
-.. data:: IS_ASCII
-
-    :py:func:`spacy.orth.is_ascii`
-
-.. data:: IS_TITLE
-
-    :py:func:`spacy.orth.is_title`
-
-.. data:: IS_LOWER
-
-    :py:func:`spacy.orth.is_lower`
-
-.. data:: IS_UPPER
-
-    :py:func:`spacy.orth.is_upper`
-
-Distributional Orthographic Features
------------------------------------
-
-These features describe how often the lower-cased form of the word appears
-in various case-styles in a large sample of English text. See :py:func:`spacy.orth.oft_case`
-
-.. data:: OFT_UPPER
-.. data:: OFT_LOWER
-.. data:: OFT_TITLE
-
-
-Tag Dictionary Features
-----------------------
-
-These features describe whether the word commonly occurs with a given
-part-of-speech, in a large text corpus, using a part-of-speech tagger designed
-to reduce the tag-dictionary bias of its training corpus. See
-:py:func:`spacy.orth.can_tag`.
-
-.. data:: CAN_PUNCT
-.. data:: CAN_CONJ
-.. data:: CAN_NUM
-.. data:: CAN_DET
-.. data:: CAN_ADP
-.. data:: CAN_ADJ
-.. data:: CAN_ADV
-.. data:: CAN_VERB
-.. data:: CAN_NOUN
-.. data:: CAN_PDT
-.. data:: CAN_POS
-.. data:: CAN_PRON
-.. data:: CAN_PRT
--- a/docs/source/how/api/tokenizers/index.rst
+++ b/docs/source/how/api/tokenizers/index.rst
@@ -1,8 +0,0 @@
-Tokenizers
-===================================
-
-Each module listed here implements a different tokenization scheme, usually
-intended for a specific language.
-
-.. toctree::
-    en.rst
--- a/docs/source/how/index.rst
+++ b/docs/source/how/index.rst
@@ -1,13 +0,0 @@
-How
-===
-
-Tutorial
--------
-
-Installation
------------
-
-API
---
-
-
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -79,9 +79,11 @@ you'll find NLTK etc much more expensive, because what you save on license
 cost, you'll lose many times over in lost productivity. $5000 does not buy you
 much developer time.

+
 .. toctree::
    :hidden:
    :maxdepth: 3

    features.rst
    license_stories.rst 
+    api.rst
--- a/docs/source/what/index.rst
+++ b/docs/source/what/index.rst
@@ -1,31 +0,0 @@
-What
-====
-
-Overview
--------
-
-Feature List
------------
-
-License (for the code)
-------
-
-+------------------+------+
-| Non-commercial   | $0   |
-+------------------+------+
-| Trial commercial | $0   |
-+------------------+------+
-| Full commercial  | $500 |
-+------------------+------+
-
-spaCy is non-free software. Its source is published, but the copyright is
-retained by the author (Matthew Honnibal).  Licenses are currently under preparation.
-
-There is currently a gap between the output of academic NLP researchers, and
-the needs of a small software companiess. I left academia to try to correct this.
-My idea is that non-commercial and trial commercial use should "feel" just like
-free software. But, if you do use the code in a commercial product, a small
-fixed license-fee will apply, in order to fund development. 
-
-Pricing (for the data)
----------------------
--- a/docs/source/why/index.rst
+++ b/docs/source/why/index.rst
@@ -1,28 +0,0 @@
-Why
-===
-
-Benchmarks
----------
-
-Efficiency
----------
-
-+--------+-------+--------------+--------------+
-| System | Time	 | Words/second | Speed Factor |
-+--------+-------+--------------+--------------+
-| NLTK	 | 6m4s  | 89,000       | 1.00         |
-+--------+-------+--------------+--------------+
-| spaCy	 | 9.5s	 | 3,093,000	| 38.30        |
-+--------+-------+--------------+--------------+
-
-
-Accuracy
--------
-
-The comparison refers to 30 million words from the English Gigaword, on
-a Maxbook Air.  For context, calling string.split() on the data completes in
-about 5s.
-
-Pros and Cons
-------------
-