mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-12 10:16:27 +03:00
Update v3 docs WIP [ci skip]
parent
fa261d09e8
commit
a35236e5f0
|
@ -106,7 +106,7 @@ systems, or to pre-process text for **deep learning**.
|
||||||
|
|
||||||
- **spaCy is not a company**. It's an open-source library. Our company
|
- **spaCy is not a company**. It's an open-source library. Our company
|
||||||
publishing spaCy and other software is called
|
publishing spaCy and other software is called
|
||||||
[Explosion AI](https://explosion.ai).
|
[Explosion](https://explosion.ai).
|
||||||
|
|
||||||
## Features {#features}
|
## Features {#features}
|
||||||
|
|
||||||
|
|
|
@ -36,22 +36,19 @@ ready-to-use spaCy models.
|
||||||
The recommended way to train your spaCy models is via the
|
The recommended way to train your spaCy models is via the
|
||||||
[`spacy train`](/api/cli#train) command on the command line.
|
[`spacy train`](/api/cli#train) command on the command line.
|
||||||
|
|
||||||
1. The **training data** in spaCy's
|
1. The **training and evaluation data** in spaCy's
|
||||||
[binary format](/api/data-formats#binary-training) created using
|
[binary `.spacy` format](/api/data-formats#binary-training) created using
|
||||||
[`spacy convert`](/api/cli#convert).
|
[`spacy convert`](/api/cli#convert).
|
||||||
2. A `config.cfg` **configuration file** with all settings and hyperparameters.
|
2. A [`config.cfg`](#config) **configuration file** with all settings and
|
||||||
|
hyperparameters.
|
||||||
3. An optional **Python file** to register
|
3. An optional **Python file** to register
|
||||||
[custom models and architectures](#custom-models).
|
[custom models and architectures](#custom-models).
|
||||||
|
|
||||||
<!-- TODO: decide how we want to present the "getting started" workflow here, get a default config etc. -->
|
<!-- TODO: decide how we want to present the "getting started" workflow here, get a default config etc. -->
|
||||||
|
|
||||||
<Project id="some_example_project">
|
```bash
|
||||||
|
$ python -m spacy train train.spacy dev.spacy config.cfg --output ./output
|
||||||
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus interdum
|
```
|
||||||
sodales lectus, ut sodales orci ullamcorper id. Sed condimentum neque ut erat
|
|
||||||
mattis pretium.
|
|
||||||
|
|
||||||
</Project>
|
|
||||||
|
|
||||||
> #### Tip: Debug your data
|
> #### Tip: Debug your data
|
||||||
>
|
>
|
||||||
|
@ -60,9 +57,17 @@ mattis pretium.
|
||||||
> invalid entity annotations, cyclic dependencies, low data labels and more.
|
> invalid entity annotations, cyclic dependencies, low data labels and more.
|
||||||
>
|
>
|
||||||
> ```bash
|
> ```bash
|
||||||
> $ python -m spacy debug-data en train.json dev.json --verbose
|
> $ python -m spacy debug-data en train.spacy dev.spacy --verbose
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
|
<Project id="some_example_project">
|
||||||
|
|
||||||
|
The easiest way to get started with an end-to-end training process is to clone a
|
||||||
|
[project](/usage/projects) template. Projects let you manage multi-step
|
||||||
|
workflows, from data preprocessing to training and packaging your model.
|
||||||
|
|
||||||
|
</Project>
|
||||||
|
|
||||||
<Accordion title="Understanding the training output">
|
<Accordion title="Understanding the training output">
|
||||||
|
|
||||||
When you train a model using the [`spacy train`](/api/cli#train) command, you'll
|
When you train a model using the [`spacy train`](/api/cli#train) command, you'll
|
||||||
|
@ -94,7 +99,28 @@ still look good.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Training config files {#cli}
|
### Training config files {#config}
|
||||||
|
|
||||||
|
> #### Migration from spaCy v2.x
|
||||||
|
>
|
||||||
|
> TODO: ...
|
||||||
|
|
||||||
|
Training config files include all **settings and hyperparameters** for training
|
||||||
|
your model. Instead of providing lots of arguments on the command line, you only
|
||||||
|
need to pass your `config.cfg` file to [`spacy train`](/api/cli#train).
|
||||||
|
|
||||||
|
To read more about how the config system works under the hood, check out the
|
||||||
|
[Thinc documentation](https://thinc.ai/docs/usage-config).
|
||||||
|
|
||||||
|
- **Structured sections.**
|
||||||
|
- **References to registered functions.** Sections can refer to registered
|
||||||
|
functions like [model architectures](/api/architectures),
|
||||||
|
[optimizers](https://thinc.ai/docs/api-optimizers) or
|
||||||
|
[schedules](https://thinc.ai/docs/api-schedules) and define arguments that are
|
||||||
|
passed into them. You can also register your own functions to define
|
||||||
|
[custom architectures](#custom-models) and reference them in your config.
|
||||||
|
- **Interpolation.** If you have hyperparameters used by multiple components,
|
||||||
|
define them once and reference them as variables.
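
As a rough illustration of the last two points, the sketch below uses only Python's standard-library `configparser`, not spaCy's or Thinc's actual config loader. The `REGISTRY` dict, the `register` decorator, the `resolve` helper and the name `constant_schedule.v1` are all hypothetical and exist only for this example. It shows a hyperparameter defined once and interpolated into another section, and a section that names a registered function whose remaining keys are passed in as arguments.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Hypothetical registry mapping string names to functions (illustration only,
# not spaCy's or Thinc's real registry).
REGISTRY = {}


def register(name):
    """Decorator that stores a function in REGISTRY under a string name."""
    def decorator(func):
        REGISTRY[name] = func
        return func
    return decorator


@register("constant_schedule.v1")  # hypothetical name
def constant_schedule(rate):
    return {"kind": "constant", "rate": rate}


# The learning rate is defined once and referenced as a variable below.
CONFIG_TEXT = """
[hyperparams]
learning_rate = 0.001

[schedule]
@schedules = constant_schedule.v1
rate = ${hyperparams:learning_rate}
"""


def resolve(section):
    """Look up the function a section refers to and call it with the other keys."""
    args = dict(section)  # values are interpolated when read
    func_name = args.pop("@schedules")
    return REGISTRY[func_name](**{key: float(value) for key, value in args.items()})


parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(CONFIG_TEXT)
schedule = resolve(parser["schedule"])
print(schedule)
```

The real config system adds typed values, nested sections and validation on top of this idea; see the Thinc documentation linked above for the actual behavior.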
|
||||||
|
|
||||||
<!-- TODO: we need to come up with a good way to present the sections and their expected values visually? -->
|
<!-- TODO: we need to come up with a good way to present the sections and their expected values visually? -->
|
||||||
|
|
||||||
|
@ -174,6 +200,7 @@ mattis pretium.
|
||||||
### Training with custom code
|
### Training with custom code
|
||||||
|
|
||||||
<!-- TODO: document usage of spacy train with --code -->
|
<!-- TODO: document usage of spacy train with --code -->
|
||||||
|
<!-- TODO: link to type annotations and maybe show example: https://thinc.ai/docs/usage-config#advanced-types -->
|
||||||
|
|
||||||
## Transfer learning {#transfer-learning}
|
## Transfer learning {#transfer-learning}
|
||||||
|
|
||||||
|
|
|
@ -6,7 +6,7 @@
|
||||||
"siteUrlNightly": "https://nightly.spacy.io",
|
"siteUrlNightly": "https://nightly.spacy.io",
|
||||||
"nightlyBranches": ["nightly.spacy.io"],
|
"nightlyBranches": ["nightly.spacy.io"],
|
||||||
"email": "contact@explosion.ai",
|
"email": "contact@explosion.ai",
|
||||||
"company": "Explosion AI",
|
"company": "Explosion",
|
||||||
"companyUrl": "https://explosion.ai",
|
"companyUrl": "https://explosion.ai",
|
||||||
"repo": "explosion/spaCy",
|
"repo": "explosion/spaCy",
|
||||||
"modelsRepo": "explosion/spacy-models",
|
"modelsRepo": "explosion/spacy-models",
|
||||||
|
|
|
@ -3,7 +3,7 @@
|
||||||
"private": true,
|
"private": true,
|
||||||
"description": "spaCy website",
|
"description": "spaCy website",
|
||||||
"version": "3.0.0",
|
"version": "3.0.0",
|
||||||
"author": "Explosion AI <contact@explosion.ai>",
|
"author": "Explosion <contact@explosion.ai>",
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@jupyterlab/outputarea": "^0.19.1",
|
"@jupyterlab/outputarea": "^0.19.1",
|
||||||
|
|