* Add loading.rst reference

2025-11-22 10:45:45 +03:00 · 2015-07-08 15:13:47 +02:00 · 1a95b490a8
commit 1a95b490a8
parent 79abe2860a
1 changed files with 62 additions and 0 deletions
--- a/docs/source/reference/loading.rst
+++ b/docs/source/reference/loading.rst
@@ -0,0 +1,62 @@
+=================
+Loading Resources
+=================
+
+99% of the time, you will load spaCy's resources using a language pipeline class,
+e.g. :code:`spacy.en.English`. The pipeline class reads the data from a
+specified directory on disk. By default, spaCy installs data into each language's
+package directory, and loads it from there.
+
+Usually, this is all you will need:
+
+    >>> from spacy.en import English
+    >>> nlp = English()
+
+If you need to replace some of the components, you may want to just make your
+own pipeline class. The English class itself does almost no work; it just
+applies the modules in order. You can also provide a function or class that
+produces a tokenizer, tagger, parser or entity recognizer to :code:`English.__init__`,
+to customize the pipeline:
+
+    >>> from spacy.en import English
+    >>> from my_module import MyTagger
+    >>> nlp = English(Tagger=MyTagger)
+
+In more detail:
+
+.. code::
+
+  class English(object):
+      def __init__(self,
+        data_dir=path.join(path.dirname(__file__), 'data'),
+        Tokenizer=Tokenizer.from_dir,
+        Tagger=EnPosTagger,
+        Parser=CreateParser(ArcEager),
+        Entity=CreateParser(BiluoNER),
+        load_vectors=True
+      ):
+
+:code:`data_dir`
+  :code:`unicode path`
+
+  The data directory.  May be None, to disable any data loading (including
+  the vocabulary).
+
+:code:`Tokenizer`
+  :code:`(Vocab vocab, unicode data_dir)(unicode) --> Tokens`
+  
+  A class/function that creates the tokenizer.
+
+:code:`Tagger` / :code:`Parser` / :code:`Entity`
+  :code:`(Vocab vocab, unicode data_dir)(Tokens) --> None`
+  
+  A class/function that creates the part-of-speech tagger /
+  syntactic dependency parser / named entity recognizer.
+  May be None or False, to disable that component.
+
+:code:`load_vectors`
+  :code:`bool`
+
+  A boolean value to control whether the word vectors are loaded.
+
+
+
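The factory convention documented in the diff above (a callable taking the vocabulary and data directory, returning a component that annotates the tokens in place) can be pictured with a toy pipeline. This is only a sketch under stated assumptions: `ToyVocab`, `ToyTokens`, `whitespace_tokenizer`, `UppercaseTagger` and `ToyPipeline` are hypothetical stand-ins written for illustration, not spaCy classes, and the real `English.__init__` may differ in its details.

```python
# Toy stand-ins only: none of these names come from spaCy itself.

class ToyVocab:
    """Hypothetical placeholder for the shared vocabulary."""
    def __init__(self, data_dir):
        self.data_dir = data_dir


class ToyTokens(list):
    """Hypothetical token container; annotations are stored in place."""
    def __init__(self, words):
        super().__init__(words)
        self.tags = None


def whitespace_tokenizer(vocab, data_dir):
    """Factory with the shape (vocab, data_dir) -> callable(text) -> tokens."""
    def tokenize(text):
        return ToyTokens(text.split())
    return tokenize


class UppercaseTagger:
    """A class used as a factory: constructed with (vocab, data_dir),
    then called on the tokens, which it annotates and returns None."""
    def __init__(self, vocab, data_dir):
        self.vocab = vocab

    def __call__(self, tokens):
        tokens.tags = ["UPPER" if w.isupper() else "OTHER" for w in tokens]


class ToyPipeline:
    """Mirrors the documented constructor shape: each keyword is a factory,
    and None or False switches that component off."""
    def __init__(self, data_dir=None, Tokenizer=whitespace_tokenizer,
                 Tagger=UppercaseTagger, Parser=None, Entity=None):
        self.vocab = ToyVocab(data_dir)
        self.tokenizer = Tokenizer(self.vocab, data_dir)
        factories = (Tagger, Parser, Entity)
        # Skip disabled components; keep the rest in a fixed order.
        self.components = [f(self.vocab, data_dir) for f in factories if f]

    def __call__(self, text):
        tokens = self.tokenizer(text)
        for component in self.components:
            component(tokens)  # each component mutates the tokens in place
        return tokens


nlp = ToyPipeline()
doc = nlp("spaCy loads DATA from disk")

# Passing None for a factory leaves that annotation layer unset.
untagged = ToyPipeline(Tagger=None)("spaCy loads DATA from disk")
```

Because the pipeline class only wires factories together and calls the resulting components in order, swapping one component amounts to passing a different callable with the same two-step shape.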