From cdc1a2710421cd053b17fcfb0dfb17bd66a1e1af Mon Sep 17 00:00:00 2001
From: Matthew Honnibal <honnibal@gmail.com>
Date: Tue, 30 Dec 2014 21:20:34 +1100
Subject: [PATCH] * Update docs

---
 docs/source/api.rst                      |  4 +
 docs/source/how/api/index.rst            |  8 --
 docs/source/how/api/lexicon.rst          |  6 --
 docs/source/how/api/tokenizers/en.rst    | 94 ------------------------
 docs/source/how/api/tokenizers/index.rst |  8 --
 docs/source/how/index.rst                | 13 ----
 docs/source/index.rst                    |  2 +
 docs/source/what/index.rst               | 31 --------
 docs/source/why/index.rst                | 28 -------
 9 files changed, 6 insertions(+), 188 deletions(-)
 delete mode 100644 docs/source/how/api/index.rst
 delete mode 100644 docs/source/how/api/lexicon.rst
 delete mode 100644 docs/source/how/api/tokenizers/en.rst
 delete mode 100644 docs/source/how/api/tokenizers/index.rst
 delete mode 100644 docs/source/how/index.rst
 delete mode 100644 docs/source/what/index.rst
 delete mode 100644 docs/source/why/index.rst

diff --git a/docs/source/api.rst b/docs/source/api.rst
index 1a576c211..7e2b519c2 100644
--- a/docs/source/api.rst
+++ b/docs/source/api.rst
@@ -23,6 +23,10 @@ under spacy.en.defs.
 .. autommodule:: spacy.en.pos
    :members:
 
+.. automodule:: spacy.en.attrs
+   :members:
+   :undoc-members:
+
 The Tokens Class
 ----------------
 
diff --git a/docs/source/how/api/index.rst b/docs/source/how/api/index.rst
deleted file mode 100644
index 0bc6afeae..000000000
--- a/docs/source/how/api/index.rst
+++ /dev/null
@@ -1,8 +0,0 @@
-API
-===
-
-.. toctree::
-    :maxdepth: 2
-
-    tokenizers/index.rst
-    lexicon.rst
diff --git a/docs/source/how/api/lexicon.rst b/docs/source/how/api/lexicon.rst
deleted file mode 100644
index d1aeed990..000000000
--- a/docs/source/how/api/lexicon.rst
+++ /dev/null
@@ -1,6 +0,0 @@
-spacy.word.Lexeme
-=================
-
-
-.. autoclass:: spacy.word.Lexeme
-    :members:
diff --git a/docs/source/how/api/tokenizers/en.rst b/docs/source/how/api/tokenizers/en.rst
deleted file mode 100644
index 3c278c151..000000000
--- a/docs/source/how/api/tokenizers/en.rst
+++ /dev/null
@@ -1,94 +0,0 @@
-spacy.en.EN
-============
-
-.. automodule:: spacy.en
-
-Tokenizer API
--------------
-
-.. automethod:: spacy.en.EN.tokenize
-    :noindex:
-
-.. automethod:: spacy.en.EN.lookup
-    :noindex:
-
-Lexeme Features Flag IDs
-------------------------
-
-A number of boolean features are computed for English Lexemes. To access a feature,
-pass its ID to the :py:meth:`spacy.word.Lexeme.check_flag` function.
-
-Orthographic Features
----------------------
-
-These features describe the `orthographic` (lettering) type of the word. The
-function used to compute the value is listed along with the flag.
-
-.. data:: IS_ALPHA
-
-    :py:func:`spacy.orth.is_alpha`
-
-.. data:: IS_DIGIT
-    
-    :py:func:`spacy.orth.is_digit`
-    
-.. data:: IS_UPPER
-
-    :py:func:`spacy.orth.is_upper`
-
-.. data:: IS_PUNCT
-
-    :py:func:`spacy.orth.is_punct`
-
-.. data:: IS_SPACE
-
-    :py:func:`spacy.orth.is_space`
-
-.. data:: IS_ASCII
-
-    :py:func:`spacy.orth.is_ascii`
-
-.. data:: IS_TITLE
-
-    :py:func:`spacy.orth.is_title`
-
-.. data:: IS_LOWER
-
-    :py:func:`spacy.orth.is_lower`
-
-.. data:: IS_UPPER
-
-    :py:func:`spacy.orth.is_upper`
-
-Distributional Orthographic Features
-------------------------------------
-
-These features describe how often the lower-cased form of the word appears
-in various case-styles in a large sample of English text. See :py:func:`spacy.orth.oft_case`.
-
-.. data:: OFT_UPPER
-.. data:: OFT_LOWER
-.. data:: OFT_TITLE
-
-
-Tag Dictionary Features
------------------------
-
-These features describe whether the word commonly occurs with a given
-part-of-speech, in a large text corpus, using a part-of-speech tagger designed
-to reduce the tag-dictionary bias of its training corpus. See
-:py:func:`spacy.orth.can_tag`.
-
-.. data:: CAN_PUNCT
-.. data:: CAN_CONJ
-.. data:: CAN_NUM
-.. data:: CAN_DET
-.. data:: CAN_ADP
-.. data:: CAN_ADJ
-.. data:: CAN_ADV
-.. data:: CAN_VERB
-.. data:: CAN_NOUN
-.. data:: CAN_PDT
-.. data:: CAN_POS
-.. data:: CAN_PRON
-.. data:: CAN_PRT
diff --git a/docs/source/how/api/tokenizers/index.rst b/docs/source/how/api/tokenizers/index.rst
deleted file mode 100644
index b19ca207d..000000000
--- a/docs/source/how/api/tokenizers/index.rst
+++ /dev/null
@@ -1,8 +0,0 @@
-Tokenizers
-===================================
-
-Each module listed here implements a different tokenization scheme, usually
-intended for a specific language.
-
-.. toctree::
-    en.rst
diff --git a/docs/source/how/index.rst b/docs/source/how/index.rst
deleted file mode 100644
index bd995f8c8..000000000
--- a/docs/source/how/index.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-How
-===
-
-Tutorial
---------
-
-Installation
-------------
-
-API
----
-
-
diff --git a/docs/source/index.rst b/docs/source/index.rst
index ecfb9af37..e409fd88b 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -79,9 +79,11 @@ you'll find NLTK etc much more expensive, because what you save on license
 cost, you'll lose many times over in lost productivity. $5000 does not buy you
 much developer time.
 
+
 .. toctree::
     :hidden:
     :maxdepth: 3
 
     features.rst
     license_stories.rst 
+    api.rst
diff --git a/docs/source/what/index.rst b/docs/source/what/index.rst
deleted file mode 100644
index 8f263fe5f..000000000
--- a/docs/source/what/index.rst
+++ /dev/null
@@ -1,31 +0,0 @@
-What
-====
-
-Overview
---------
-
-Feature List
-------------
-
-License (for the code)
-----------------------
-
-+------------------+------+
-| Non-commercial   | $0   |
-+------------------+------+
-| Trial commercial | $0   |
-+------------------+------+
-| Full commercial  | $500 |
-+------------------+------+
-
-spaCy is non-free software. Its source is published, but the copyright is
-retained by the author (Matthew Honnibal).  Licenses are currently under preparation.
-
-There is currently a gap between the output of academic NLP researchers, and
-the needs of small software companies. I left academia to try to correct this.
-My idea is that non-commercial and trial commercial use should "feel" just like
-free software. But, if you do use the code in a commercial product, a small
-fixed license-fee will apply, in order to fund development. 
-
-Pricing (for the data)
-----------------------
diff --git a/docs/source/why/index.rst b/docs/source/why/index.rst
deleted file mode 100644
index 8f1f78272..000000000
--- a/docs/source/why/index.rst
+++ /dev/null
@@ -1,28 +0,0 @@
-Why
-===
-
-Benchmarks
-----------
-
-Efficiency
-----------
-
-+--------+-------+--------------+--------------+
-| System | Time  | Words/second | Speed Factor |
-+--------+-------+--------------+--------------+
-| NLTK   | 6m4s  | 89,000       | 1.00         |
-+--------+-------+--------------+--------------+
-| spaCy  | 9.5s  | 3,093,000    | 38.30        |
-+--------+-------+--------------+--------------+
-
-
-Accuracy
---------
-
-The comparison refers to 30 million words from the English Gigaword, on
-a MacBook Air.  For context, calling string.split() on the data completes in
-about 5s.
-
-Pros and Cons
--------------
-