mirror of https://github.com/explosion/spaCy.git
synced 2025-11-04 01:48:04 +03:00

Update v3 docs WIP [ci skip]

parent fa261d09e8
commit a35236e5f0

@@ -106,7 +106,7 @@ systems, or to pre-process text for **deep learning**.
 
 - **spaCy is not a company**. It's an open-source library. Our company
   publishing spaCy and other software is called
-  [Explosion AI](https://explosion.ai).
+  [Explosion](https://explosion.ai).
 
 ## Features {#features}
 

@@ -36,22 +36,19 @@ ready-to-use spaCy models.
 The recommended way to train your spaCy models is via the
 [`spacy train`](/api/cli#train) command on the command line.
 
-1. The **training data** in spaCy's
-   [binary format](/api/data-formats#binary-training) created using
+1. The **training and evaluation data** in spaCy's
+   [binary `.spacy` format](/api/data-formats#binary-training) created using
    [`spacy convert`](/api/cli#convert).
-2. A `config.cfg` **configuration file** with all settings and hyperparameters.
+2. A [`config.cfg`](#config) **configuration file** with all settings and
+   hyperparameters.
 3. An optional **Python file** to register
    [custom models and architectures](#custom-models).
 
 <!-- TODO: decide how we want to present the "getting started" workflow here, get a default config etc. -->
 
-<Project id="some_example_project">
-
-Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus interdum
-sodales lectus, ut sodales orci ullamcorper id. Sed condimentum neque ut erat
-mattis pretium.
-
-</Project>
+```bash
+$ python -m spacy train train.spacy dev.spacy config.cfg --output ./output
+```
 
 > #### Tip: Debug your data
 >
@@ -60,9 +57,17 @@ mattis pretium.
 > invalid entity annotations, cyclic dependencies, low data labels and more.
 >
 > ```bash
-> $ python -m spacy debug-data en train.json dev.json --verbose
+> $ python -m spacy debug-data en train.spacy dev.spacy --verbose
 > ```
 
+<Project id="some_example_project">
+
+The easiest way to get started with an end-to-end training process is to clone a
+[project](/usage/projects) template. Projects let you manage multi-step
+workflows, from data preprocessing to training and packaging your model.
+
+</Project>
+
 <Accordion title="Understanding the training output">
 
 When you train a model using the [`spacy train`](/api/cli#train) command, you'll
@@ -94,7 +99,28 @@ still look good.
 
 ---
 
-### Training config files {#cli}
+### Training config files {#config}
 
+> #### Migration from spaCy v2.x
+>
+> TODO: ...
+
+Training config files include all **settings and hyperparameters** for training
+your model. Instead of providing lots of arguments on the command line, you only
+need to pass your `config.cfg` file to [`spacy train`](/api/cli#train).
+
+To read more about how the config system works under the hood, check out the
+[Thinc documentation](https://thinc.ai/docs/usage-config).
+
+- **Structured sections.**
+- **References to registered functions.** Sections can refer to registered
+  functions like [model architectures](/api/architectures),
+  [optimizers](https://thinc.ai/docs/api-optimizers) or
+  [schedules](https://thinc.ai/docs/api-schedules) and define arguments that are
+  passed into them. You can also register your own functions to define
+  [custom architectures](#custom-models), reference them in your config,
+- **Interpolation.** If you have hyperparameters used by multiple components,
+  define them once and reference them as variables.
+
 <!-- TODO: we need to come up with a good way to present the sections and their expected values visually? -->
 
@@ -174,6 +200,7 @@ mattis pretium.
 ### Training with custom code
 
 <!-- TODO: document usage of spacy train with --code -->
+<!-- TODO: link to type annotations and maybe show example: https://thinc.ai/docs/usage-config#advanced-types -->
 
 ## Transfer learning {#transfer-learning}
 

@@ -6,7 +6,7 @@
     "siteUrlNightly": "https://nightly.spacy.io",
     "nightlyBranches": ["nightly.spacy.io"],
     "email": "contact@explosion.ai",
-    "company": "Explosion AI",
+    "company": "Explosion",
     "companyUrl": "https://explosion.ai",
     "repo": "explosion/spaCy",
     "modelsRepo": "explosion/spacy-models",

@@ -3,7 +3,7 @@
     "private": true,
     "description": "spaCy website",
     "version": "3.0.0",
-    "author": "Explosion AI <contact@explosion.ai>",
+    "author": "Explosion <contact@explosion.ai>",
     "license": "MIT",
     "dependencies": {
         "@jupyterlab/outputarea": "^0.19.1",
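
The config bullets added to the training docs in this commit describe sections that reference registered functions and hyperparameters that are defined once and reused through interpolation. The following is a minimal sketch of both ideas using only Python's standard-library `configparser` as a stand-in; it is not spaCy's or Thinc's actual config loader. The `REGISTRY` dict, the `register` decorator, the `constant_schedule.v1` name, the `@function` key and the `${vars:learn_rate}` syntax are all assumptions made for illustration, and the real config syntax may differ.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Hypothetical registry: maps a string name to a Python function.
REGISTRY = {}


def register(name):
    """Decorator that stores a function in REGISTRY under a string name."""
    def decorator(func):
        REGISTRY[name] = func
        return func
    return decorator


@register("constant_schedule.v1")  # hypothetical name, for illustration only
def constant_schedule(rate, steps):
    """Return the same learning rate for every step."""
    return [rate] * steps


# The [schedule] section names a registered function and its arguments.
# The rate is defined once in [vars] and pulled in through interpolation.
CONFIG_TEXT = """
[vars]
learn_rate = 0.001

[schedule]
@function = constant_schedule.v1
rate = ${vars:learn_rate}
steps = 3
"""

cfg = ConfigParser(interpolation=ExtendedInterpolation())
cfg.read_string(CONFIG_TEXT)

# Interpolated values are resolved when the section is read.
section = dict(cfg["schedule"])
# Look up the function by name, then pass the remaining keys as arguments.
func = REGISTRY[section.pop("@function")]
result = func(rate=float(section["rate"]), steps=int(section["steps"]))
print(result)
```

Keeping the function name and its arguments together in one section means the config file alone determines what gets built, which is the property the docs attribute to the training config.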