Update v2-2 docs

parent fa9a283128
commit f537cbeacc
@@ -98,9 +98,10 @@ on disk**.
 
 > #### Example
 >
-> ```python
-> scorer = nlp.evaluate(dev_data)
-> print(scorer.textcat_scores, scorer.textcats_per_cat)
-> ```
+> ```bash
+> spacy train en /path/to/output /path/to/train /path/to/dev \
+> --pipeline textcat \
+> --textcat-arch simple_cnn --textcat-multilabel
+> ```
 
 When training your models using the `spacy train` command, you can now also
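As context for the new example: a minimal sketch, not part of the diff, of how the textcat model produced by that command might be loaded and applied. The `model-final` subdirectory name is an assumption based on spaCy v2's usual `spacy train` output layout.

```python
import spacy

# Load the model that `spacy train` wrote to the output directory.
# "model-final" is an assumed subdirectory name, per spaCy v2 conventions.
nlp = spacy.load("/path/to/output/model-final")

doc = nlp("This is a text to classify.")
# With --textcat-multilabel, each label gets an independent score in doc.cats.
print(doc.cats)
```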
@@ -117,6 +118,34 @@ classification.
 
 </Infobox>
 
+### New DocPallet class to efficiently serialize Doc collections
+
+> #### Example
+>
+> ```python
+> from spacy.tokens import DocPallet
+> pallet = DocPallet(attrs=["LEMMA", "ENT_IOB", "ENT_TYPE"], store_user_data=False)
+> for doc in nlp.pipe(texts):
+>     pallet.add(doc)
+> byte_data = pallet.to_bytes()
+> # Deserialize later, e.g. in a new process
+> nlp = spacy.blank("en")
+> pallet = DocPallet().from_bytes(byte_data)
+> docs = list(pallet.get_docs(nlp.vocab))
+> ```
+
+If you're working with lots of data, you'll probably need to pass analyses
+between machines, either to use something like Dask or Spark, or even just to
+save out work to disk. Often it's sufficient to use the `doc.to_array()`
+functionality for this and just serialize the numpy arrays, but other times
+you want a more general way to save and restore `Doc` objects.
+
+The new `DocPallet` class makes it easy to serialize and deserialize a
+collection of `Doc` objects together, and is much more efficient than calling
+`doc.to_bytes()` on each individual `Doc` object. You can also control what
+data gets saved, and you can merge pallets together for easy map/reduce-style
+processing.
+
 ### CLI command to debug and validate training data {#debug-data}
 
 > #### Example
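The added text offers `doc.to_array()` as the lightweight alternative for passing analyses around. Here is a minimal sketch of that round trip, assuming the `en_core_web_sm` model is installed; it also shows why the approach is less general than `DocPallet`, since the token texts are not stored in the array:

```python
import numpy
import spacy
from spacy.tokens import Doc

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup.")

# Serialize selected attributes to a plain numpy array.
attrs = ["LEMMA", "ENT_IOB", "ENT_TYPE"]
arr = doc.to_array(attrs)
numpy.save("doc_attrs.npy", arr)

# Restoring requires a Doc with the same tokens: the words themselves are
# not part of the array and have to be carried separately.
words = [t.text for t in doc]
new_doc = Doc(nlp.vocab, words=words).from_array(attrs, numpy.load("doc_attrs.npy"))
```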
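The diff mentions merging pallets for map/reduce-style processing but doesn't show the API for it, so the sketch below combines pallets using only the methods from the example above (`from_bytes`, `get_docs`, `add`); `reduce_pallets` is a hypothetical helper, not spaCy API:

```python
import spacy
from spacy.tokens import DocPallet

def reduce_pallets(byte_payloads, vocab):
    """Hypothetical reduce step: fold serialized pallets collected from
    several workers into one pallet, using only the methods shown above."""
    combined = DocPallet(attrs=["LEMMA", "ENT_IOB", "ENT_TYPE"])
    for payload in byte_payloads:
        pallet = DocPallet().from_bytes(payload)
        for doc in pallet.get_docs(vocab):
            combined.add(doc)
    return combined

nlp = spacy.blank("en")
# byte_payloads would be the pallet.to_bytes() results from each worker:
# combined = reduce_pallets(byte_payloads, nlp.vocab)
```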