Update docs [ci skip]

Ines Montani 2020-08-19 00:28:37 +02:00
parent c0f6e77a41
commit 13291e97ba
18 changed files with 295 additions and 183 deletions

View File

@ -39,8 +39,8 @@ the model name to be specified with its version (e.g. `en_core_web_sm-2.2.0`).
> to a local PyPI installation and fetching it straight from there. This will
> also allow you to add it as a versioned package dependency to your project.
```bash
$ python -m spacy download [model] [--direct] [pip args]
```cli
$ python -m spacy download [model] [--direct] [pip_args]
```
| Name | Description |
@ -57,11 +57,11 @@ Print information about your spaCy installation, models and local setup, and
generate [Markdown](https://en.wikipedia.org/wiki/Markdown)-formatted markup to
copy-paste into [GitHub issues](https://github.com/explosion/spaCy/issues).
```bash
```cli
$ python -m spacy info [--markdown] [--silent]
```
```bash
```cli
$ python -m spacy info [model] [--markdown] [--silent]
```
@ -88,7 +88,7 @@ and command for updating are shown.
> suite, to ensure all models are up to date before proceeding. If incompatible
> models are found, it will return `1`.
```bash
```cli
$ python -m spacy validate
```
@ -111,14 +111,14 @@ config. The settings you specify will impact the suggested model architectures
and pipeline setup, as well as the hyperparameters. You can also adjust and
customize those settings in your config file later.
> ```bash
> ### Example {wrap="true"}
> #### Example
>
> ```cli
> $ python -m spacy init config config.cfg --lang en --pipeline ner,textcat --optimize accuracy
> ```
```bash
$ python -m spacy init config [output_file] [--lang] [--pipeline]
[--optimize] [--cpu]
```cli
$ python -m spacy init config [output_file] [--lang] [--pipeline] [--optimize] [--cpu]
```
| Name | Description |
@ -143,12 +143,13 @@ be created, and their signatures are used to find the defaults. If your config
contains a problem that can't be resolved automatically, spaCy will show you a
validation error with more details.
> ```bash
> ### Example {wrap="true"}
> #### Example
>
> ```cli
> $ python -m spacy init fill-config base.cfg config.cfg
> ```
```bash
```cli
$ python -m spacy init fill-config [base_path] [output_file] [--diff]
```
@ -175,9 +176,8 @@ The `init-model` command is now available as a subcommand of `spacy init`.
</Infobox>
```bash
$ python -m spacy init model [lang] [output_dir] [--jsonl-loc] [--vectors-loc]
[--prune-vectors]
```cli
$ python -m spacy init model [lang] [output_dir] [--jsonl-loc] [--vectors-loc] [--prune-vectors]
```
| Name | Description |
@ -200,10 +200,8 @@ Convert files into spaCy's
management functions. The converter can be specified on the command line, or
chosen based on the file extension of the input file.
```bash
$ python -m spacy convert [input_file] [output_dir] [--converter]
[--file-type] [--n-sents] [--seg-sents] [--model] [--morphology]
[--merge-subtokens] [--ner-map] [--lang]
```cli
$ python -m spacy convert [input_file] [output_dir] [--converter] [--file-type] [--n-sents] [--seg-sents] [--model] [--morphology] [--merge-subtokens] [--ner-map] [--lang]
```
| Name | Description |
@ -246,13 +244,13 @@ errors at once and some issues are only shown once previous errors have been
fixed. To auto-fill a partial config and save the result, you can use the
[`init fill-config`](/api/cli#init-fill-config) command.
```bash
```cli
$ python -m spacy debug config [config_path] [--code_path] [overrides]
```
> #### Example
>
> ```bash
> ```cli
> $ python -m spacy debug config ./config.cfg
> ```
@ -298,14 +296,13 @@ takes the same arguments as `train` and reads settings off the
</Infobox>
```bash
$ python -m spacy debug data [config_path] [--code] [--ignore-warnings]
[--verbose] [--no-format] [overrides]
```cli
$ python -m spacy debug data [config_path] [--code] [--ignore-warnings] [--verbose] [--no-format] [overrides]
```
> #### Example
>
> ```bash
> ```cli
> $ python -m spacy debug data ./config.cfg
> ```
@ -473,7 +470,7 @@ The `profile` command is now available as a subcommand of `spacy debug`.
</Infobox>
```bash
```cli
$ python -m spacy debug profile [model] [inputs] [--n-texts]
```
@ -490,9 +487,8 @@ $ python -m spacy debug profile [model] [inputs] [--n-texts]
Debug a Thinc [`Model`](https://thinc.ai/docs/api-model) by running it on a
sample text and checking how it updates its internal weights and parameters.
```bash
$ python -m spacy debug model [config_path] [component] [--layers] [-DIM]
[-PAR] [-GRAD] [-ATTR] [-P0] [-P1] [-P2] [P3] [--gpu-id]
```cli
$ python -m spacy debug model [config_path] [component] [--layers] [-DIM] [-PAR] [-GRAD] [-ATTR] [-P0] [-P1] [-P2] [P3] [--gpu-id]
```
<Accordion title="Example outputs" spaced>
@ -502,7 +498,7 @@ model ("Step 0"), which helps us to understand the internal structure of the
Neural Network, and to focus on specific layers that we want to inspect further
(see next example).
```bash
```cli
$ python -m spacy debug model ./config.cfg tagger -P0
```
@ -548,7 +544,7 @@ an all-zero matrix determined by the `nO` and `nI` dimensions. After a first
training step (Step 2), this matrix has clearly updated its values through the
training feedback loop.
```bash
```cli
$ python -m spacy debug model ./config.cfg tagger -l "5,15" -DIM -PAR -P0 -P1 -P2
```
@ -632,7 +628,7 @@ in the section `[paths]`.
</Infobox>
```bash
```cli
$ python -m spacy train [config_path] [--output] [--code] [--verbose] [overrides]
```
@ -669,9 +665,8 @@ the [data format](/api/data-formats#config) for details.
</Infobox>
```bash
$ python -m spacy pretrain [texts_loc] [output_dir] [config_path]
[--code] [--resume-path] [--epoch-resume] [overrides]
```cli
$ python -m spacy pretrain [texts_loc] [output_dir] [config_path] [--code] [--resume-path] [--epoch-resume] [overrides]
```
| Name | Description |
@ -698,9 +693,8 @@ skew. To render a sample of dependency parses in a HTML file using the
[displaCy visualizations](/usage/visualizers), set the output directory via the
`--displacy-path` argument.
```bash
$ python -m spacy evaluate [model] [data_path] [--output] [--gold-preproc]
[--gpu-id] [--displacy-path] [--displacy-limit]
```cli
$ python -m spacy evaluate [model] [data_path] [--output] [--gold-preproc] [--gpu-id] [--displacy-path] [--displacy-limit]
```
| Name | Description |
@ -733,17 +727,16 @@ this, you can set the `--no-sdist` flag.
</Infobox>
```bash
$ python -m spacy package [input_dir] [output_dir] [--meta-path] [--create-meta]
[--no-sdist] [--version] [--force]
```cli
$ python -m spacy package [input_dir] [output_dir] [--meta-path] [--create-meta] [--no-sdist] [--version] [--force]
```
> #### Example
>
> ```bash
> python -m spacy package /input /output
> cd /output/en_model-0.0.0
> pip install dist/en_model-0.0.0.tar.gz
> ```cli
> $ python -m spacy package /input /output
> $ cd /output/en_model-0.0.0
> $ pip install dist/en_model-0.0.0.tar.gz
> ```
| Name | Description |
@ -775,19 +768,19 @@ can provide any other repo (public or private) that you have access to using the
<!-- TODO: update example once we've decided on repo structure -->
```bash
```cli
$ python -m spacy project clone [name] [dest] [--repo]
```
> #### Example
>
> ```bash
> ```cli
> $ python -m spacy project clone some_example
> ```
>
> Clone from custom repo:
>
> ```bash
> ```cli
> $ python -m spacy project clone template --repo https://github.com/your_org/your_repo
> ```
@ -810,13 +803,13 @@ considered "private" and you have to take care of putting them into the
destination directory yourself. If a local path is provided, the asset is copied
into the current project.
```bash
```cli
$ python -m spacy project assets [project_dir]
```
> #### Example
>
> ```bash
> ```cli
> $ python -m spacy project assets
> ```
@ -835,13 +828,13 @@ all commands in the workflow are run, in order. If commands define
re-run if state has changed. For example, if the input dataset changes, a
preprocessing command that depends on those files will be re-run.
```bash
```cli
$ python -m spacy project run [subcommand] [project_dir] [--force] [--dry]
```
> #### Example
>
> ```bash
> ```cli
> $ python -m spacy project run train
> ```
@ -874,16 +867,16 @@ You'll also need to add the assets you want to track with
</Infobox>
```bash
```cli
$ python -m spacy project dvc [project_dir] [workflow] [--force] [--verbose]
```
> #### Example
>
> ```bash
> git init
> dvc init
> python -m spacy project dvc all
> ```cli
> $ git init
> $ dvc init
> $ python -m spacy project dvc all
> ```
| Name | Description |

View File

@ -118,8 +118,8 @@ need paths, you can define them here. All config values can also be
[`spacy train`](/api/cli#train), which is especially relevant for data paths
that you don't want to hard-code in your config file.
```bash
$ python -m spacy train ./config.cfg --paths.train ./corpus/train.spacy
```cli
$ python -m spacy train config.cfg --paths.train ./corpus/train.spacy
```
### training {#config-training tag="section"}
@ -209,8 +209,8 @@ objects to JSON, you can now serialize them directly using the
[`spacy convert`](/api/cli) lets you convert your JSON data to the new `.spacy`
format:
```bash
$ python -m spacy convert ./data.json ./output
```cli
$ python -m spacy convert ./data.json ./output.spacy
```
</Infobox>

View File

@ -110,9 +110,9 @@ in `/opt/nvidia/cuda`, you would run:
```bash
### Installation with CUDA
export CUDA_PATH="/opt/nvidia/cuda"
pip install cupy-cuda102
pip install spacy-transformers
$ export CUDA_PATH="/opt/nvidia/cuda"
$ pip install cupy-cuda102
$ pip install spacy-transformers
```
### Runtime usage {#transformers-runtime}
@ -130,7 +130,7 @@ The `Transformer` component sets the
[`Doc._.trf_data`](/api/transformer#custom_attributes) extension attribute,
which lets you access the transformers outputs at runtime.
```bash
```cli
$ python -m spacy download en_core_trf_lg
```
@ -292,8 +292,8 @@ function. You can make it available via the `--code` argument that can point to
a Python file. For more details on training with custom code, see the
[training documentation](/usage/training#custom-code).
```bash
$ python -m spacy train ./config.cfg --code ./code.py
```cli
python -m spacy train ./config.cfg --code ./code.py
```
### Customizing the model implementations {#training-custom-model}

View File

@ -40,7 +40,7 @@ $ pip install -U spacy
> After installation you need to download a language model. For more info and
> available models, see the [docs on models](/models).
>
> ```bash
> ```cli
> $ python -m spacy download en_core_web_sm
>
> >>> import spacy
@ -62,9 +62,9 @@ When using pip it is generally recommended to install packages in a virtual
environment to avoid modifying system state:
```bash
python -m venv .env
source .env/bin/activate
pip install spacy
$ python -m venv .env
$ source .env/bin/activate
$ pip install spacy
```
### conda {#conda}
@ -106,9 +106,9 @@ links created in different virtual environments. It's recommended to run the
command with `python -m` to make sure you're executing the correct version of
spaCy.
```bash
pip install -U spacy
python -m spacy validate
```cli
$ pip install -U spacy
$ python -m spacy validate
```
### Run spaCy with GPU {#gpu new="2.0.14"}
@ -156,15 +156,15 @@ system. See notes on [Ubuntu](#source-ubuntu), [macOS / OS X](#source-osx) and
[Windows](#source-windows) for details.
```bash
python -m pip install -U pip # update pip
git clone https://github.com/explosion/spaCy # clone spaCy
cd spaCy # navigate into directory
$ python -m pip install -U pip # update pip
$ git clone https://github.com/explosion/spaCy # clone spaCy
$ cd spaCy # navigate into dir
python -m venv .env # create environment in .env
source .env/bin/activate # activate virtual environment
\export PYTHONPATH=`pwd` # set Python path to spaCy directory
pip install -r requirements.txt # install all requirements
python setup.py build_ext --inplace # compile spaCy
$ python -m venv .env # create environment in .env
$ source .env/bin/activate # activate virtual env
$ export PYTHONPATH=`pwd` # set Python path to spaCy dir
$ pip install -r requirements.txt # install all requirements
$ python setup.py build_ext --inplace # compile spaCy
```
Compared to regular install via pip, the
@ -209,20 +209,18 @@ that directory. Don't forget to also install the test utilities via spaCy's
[`requirements.txt`](https://github.com/explosion/spaCy/tree/master/requirements.txt):
```bash
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest [spacy directory]
$ python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
$ pip install -r path/to/requirements.txt
$ python -m pytest [spacy directory]
```
Calling `pytest` on the spaCy directory will run only the basic tests. The flag
`--slow` is optional and enables additional tests that take longer.
```bash
# make sure you are using recent pytest version
python -m pip install -U pytest
python -m pytest [spacy directory] # basic tests
python -m pytest [spacy directory] --slow # basic and slow tests
$ python -m pip install -U pytest # update pytest
$ python -m pytest [spacy directory] # basic tests
$ python -m pytest [spacy directory] --slow # basic and slow tests
```
## Troubleshooting guide {#troubleshooting}
@ -283,7 +281,7 @@ only 65535 in a narrow unicode build. You can check this by running the
following command:
```bash
python -c "import sys; print(sys.maxunicode)"
$ python -c "import sys; print(sys.maxunicode)"
```
If you're running a narrow unicode build, reinstall Python and use a wide
@ -305,8 +303,8 @@ run `source ~/.bash_profile` or `source ~/.zshrc`. Make sure to add **both
lines** for `LC_ALL` and `LANG`.
```bash
\export LC_ALL=en_US.UTF-8
\export LANG=en_US.UTF-8
$ export LC_ALL=en_US.UTF-8
$ export LANG=en_US.UTF-8
```
</Accordion>

View File

@ -1588,9 +1588,9 @@ some nice Latin vectors. You can then pass the directory path to
> doc1.similarity(doc2)
> ```
```bash
wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/word-vectors-v2/cc.la.300.vec.gz
python -m spacy init model en /tmp/la_vectors_wiki_lg --vectors-loc cc.la.300.vec.gz
```cli
$ wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/word-vectors-v2/cc.la.300.vec.gz
$ python -m spacy init model en /tmp/la_vectors_wiki_lg --vectors-loc cc.la.300.vec.gz
```
<Accordion title="How to optimize vector coverage" id="custom-vectors-coverage" spaced>
@ -1649,8 +1649,8 @@ the vector of "leaving", which is identical. If you're using the
option to easily reduce the size of the vectors as you add them to a spaCy
model:
```bash
$ python -m spacy init model /tmp/la_vectors_web_md --vectors-loc la.300d.vec.tgz --prune-vectors 10000
```cli
$ python -m spacy init model en /tmp/la_vectors_web_md --vectors-loc la.300d.vec.tgz --prune-vectors 10000
```
This will create a spaCy model with vectors for the first 10,000 words in the
@ -1741,9 +1741,8 @@ language name, and even train models with it and refer to it in your
> needs to be available during training. You can load a Python file containing
> the code using the `--code` argument:
>
> ```bash
> ### {wrap="true"}
> $ python -m spacy train config.cfg --code code.py
> ```cli
> python -m spacy train config.cfg --code code.py
> ```
```python

View File

@ -116,15 +116,10 @@ The Chinese language class supports three word segmentation options:
<Infobox variant="warning">
In spaCy v3, the default Chinese word segmenter has switched from Jieba to
character segmentation.
</Infobox>
<Infobox variant="warning">
Note that [`pkuseg`](https://github.com/lancopku/pkuseg-python) doesn't yet ship
with pre-compiled wheels for Python 3.8. If you're running Python 3.8, you can
In spaCy v3.0, the default Chinese word segmenter has switched from Jieba to
character segmentation. Also note that
[`pkuseg`](https://github.com/lancopku/pkuseg-python) doesn't yet ship with
pre-compiled wheels for Python 3.8. If you're running Python 3.8, you can
install it from our fork and compile it locally:
```bash
@ -174,7 +169,7 @@ nlp.tokenizer.pkuseg_update_user_dict([], reset=True)
</Accordion>
<Accordion title="Details on pretrained and custom Chinese models">
<Accordion title="Details on pretrained and custom Chinese models" spaced>
The [Chinese models](/models/zh) provided by spaCy include a custom `pkuseg`
model trained only on
@ -247,20 +242,20 @@ best-matching model compatible with your spaCy installation.
> + nlp = spacy.load("en_core_web_sm")
> ```
```bash
# Download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm
```cli
# Download best-matching version of a model for your spaCy installation
$ python -m spacy download en_core_web_sm
# Download exact model version
python -m spacy download en_core_web_sm-2.2.0 --direct
$ python -m spacy download en_core_web_sm-3.0.0 --direct
```
The download command will [install the model](/usage/models#download-pip) via
pip and place the package in your `site-packages` directory.
```bash
pip install spacy
python -m spacy download en_core_web_sm
```cli
$ pip install -U spacy
$ python -m spacy download en_core_web_sm
```
```python
@ -279,10 +274,10 @@ click on the archive link and copy it to your clipboard.
```bash
# With external URL
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz
$ pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz
# With local file
pip install /Users/you/en_core_web_sm-3.0.0.tar.gz
$ pip install /Users/you/en_core_web_sm-3.0.0.tar.gz
```
By default, this will install the model into your `site-packages` directory. You
@ -305,7 +300,7 @@ archive consists of a model directory that contains another directory with the
model data.
```yaml
### Directory structure {highlight="7"}
### Directory structure {highlight="6"}
└── en_core_web_md-3.0.0.tar.gz # downloaded archive
├── setup.py # setup file for pip installation
├── meta.json # copy of model meta

View File

@ -67,8 +67,8 @@ project template and copies the files to a local directory. You can then run the
project, e.g. to train a model and edit the commands and scripts to build fully
custom workflows.
```bash
$ python -m spacy clone some_example_project
```cli
python -m spacy project clone some_example_project
```
By default, the project will be cloned into the current working directory. You
@ -95,9 +95,9 @@ to download and where to put them. The
[`spacy project assets`](/api/cli#project-assets) will fetch the project assets
for you:
```bash
cd some_example_project
python -m spacy project assets
```
$ cd some_example_project
$ python -m spacy project assets
```
### 3. Run a command {#run}
@ -123,7 +123,7 @@ Commands consist of one or more steps and can be run with
[`spacy project run`](/api/cli#project-run). The following will run the command
`preprocess` defined in the `project.yml`:
```bash
```cli
$ python -m spacy project run preprocess
```
@ -156,7 +156,7 @@ to turn the best model artifact into an installable Python package. The
following command runs the workflow named `all` defined in the `project.yml`, and
executes the commands it specifies, in order:
```bash
```cli
$ python -m spacy project run all
```
@ -379,8 +379,8 @@ The [`spacy project clone`](/api/cli#project-clone) command lets you customize
the repo to clone from using the `--repo` option. It calls into `git`, so you'll
be able to clone from any repo that you have access to, including private repos.
```bash
$ python -m spacy project your_project --repo https://github.com/you/repo
```cli
python -m spacy project clone your_project --repo https://github.com/you/repo
```
At a minimum, a valid project template needs to contain a
@ -445,9 +445,9 @@ to include support for remote storage like Google Cloud Storage, S3, Azure, SSH
and more.
```bash
pip install dvc # Install DVC
git init # Initialize a Git repo
dvc init # Initialize a DVC project
$ pip install dvc # Install DVC
$ git init # Initialize a Git repo
$ dvc init # Initialize a DVC project
```
<Infobox title="Important note on privacy" variant="warning">
@ -466,8 +466,8 @@ can then manage your spaCy project like any other DVC project, run
and [`dvc repro`](https://dvc.org/doc/command-reference/repro) to reproduce the
workflow or individual commands.
```bash
$ python -m spacy project dvc [workflow name]
```cli
$ python -m spacy project dvc [workflow_name]
```
<Infobox title="Important note for multiple workflows" variant="warning">
@ -508,7 +508,7 @@ and evaluation set.
> #### Example usage
>
> ```bash
> ```cli
> $ python -m spacy project run annotate
> ```
@ -595,7 +595,7 @@ spacy_streamlit.visualize(MODELS, DEFAULT_TEXT, visualizers=["ner"])
> #### Example usage
>
> ```bash
> ```cli
> $ python -m spacy project run visualize
> ```
@ -636,8 +636,8 @@ API.
> #### Example usage
>
> ```bash
> $ python -m spacy project run visualize
> ```cli
> $ python -m spacy project run serve
> ```
<!-- prettier-ignore -->

View File

@ -562,11 +562,11 @@ import DisplaCyEntSnekHtml from 'images/displacy-ent-snek.html'
## Saving, loading and distributing models {#models}
After training your model, you'll usually want to save its state, and load it
back later. You can do this with the
[`Language.to_disk()`](/api/language#to_disk) method:
back later. You can do this with the [`Language.to_disk`](/api/language#to_disk)
method:
```python
nlp.to_disk('/home/me/data/en_example_model')
nlp.to_disk("./en_example_model")
```
The directory will be created if it doesn't exist, and the whole pipeline data,
@ -629,8 +629,8 @@ docs.
> }
> ```
```bash
$ python -m spacy package /home/me/data/en_example_model /home/me/my_models
```cli
$ python -m spacy package ./en_example_model ./my_models
```
This command will create a model package directory and will run

View File

@ -160,7 +160,7 @@ the website or company in a specific context.
> #### Loading models
>
> ```bash
> ```cli
> $ python -m spacy download en_core_web_sm
>
> >>> import spacy

View File

@ -66,7 +66,7 @@ the [`init fill-config`](/api/cli#init-fill-config) command to fill in the
remaining defaults. Training configs should always be **complete and without
hidden defaults**, to keep your experiments reproducible.
```bash
```cli
$ python -m spacy init fill-config base_config.cfg config.cfg
```
@ -76,8 +76,8 @@ $ python -m spacy init fill-config base_config.cfg config.cfg
> your training and development data, get useful stats, and find problems like
> invalid entity annotations, cyclic dependencies, low data labels and more.
>
> ```bash
> $ python -m spacy debug data config.cfg --verbose
> ```cli
> $ python -m spacy debug data config.cfg
> ```
Instead of exporting your starter config from the quickstart widget and
@ -88,7 +88,7 @@ add your data and run [`train`](/api/cli#train) with your config. See the
spaCy's binary `.spacy` format. You can either include the data paths in the
`[paths]` section of your config, or pass them in via the command line.
```bash
```cli
$ python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy
```
@ -186,9 +186,8 @@ For cases like this, you can set additional command-line options starting with
`--paths.train ./corpus/train.spacy` sets the `train` value in the `[paths]`
block.
```bash
$ python -m spacy train config.cfg --paths.train ./corpus/train.spacy
--paths.dev ./corpus/dev.spacy --training.batch_size 128
```cli
$ python -m spacy train config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy --training.batch_size 128
```
Only existing sections and values in the config can be overwritten. At the end
@ -486,8 +485,9 @@ still look good.
### Training with custom code {#custom-code}
> ```bash
> ### Example {wrap="true"}
> #### Example
>
> ```cli
> $ python -m spacy train config.cfg --code functions.py
> ```
@ -605,9 +605,8 @@ you can now run [`spacy train`](/api/cli#train) and point the argument `--code`
to your Python file. Before loading the config, spaCy will import the
`functions.py` module and your custom functions will be registered.
```bash
### Training with custom code {wrap="true"}
python -m spacy train config.cfg --output ./output --code ./functions.py
```cli
$ python -m spacy train config.cfg --output ./output --code ./functions.py
```
#### Example: Custom batch size schedule {#custom-code-schedule}

View File

@ -212,14 +212,15 @@ Note that spaCy v3.0 now requires **Python 3.6+**.
### Removed or renamed API {#incompat-removed}
| Removed | Replacement |
| -------------------------------------------------------- | ----------------------------------------------------- |
| `Language.disable_pipes` | [`Language.select_pipes`](/api/language#select_pipes) |
| `GoldParse` | [`Example`](/api/example) |
| `GoldCorpus` | [`Corpus`](/api/corpus) |
| `spacy debug-data` | [`spacy debug data`](/api/cli#debug-data) |
| `spacy profile` | [`spacy debug profile`](/api/cli#debug-profile) |
| `spacy link`, `util.set_data_path`, `util.get_data_path` | not needed, model symlinks are deprecated |
| Removed | Replacement |
| ------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
| `Language.disable_pipes` | [`Language.select_pipes`](/api/language#select_pipes) |
| `GoldParse` | [`Example`](/api/example) |
| `GoldCorpus` | [`Corpus`](/api/corpus) |
| `KnowledgeBase.load_bulk` `KnowledgeBase.dump` | [`KnowledgeBase.from_disk`](/api/kb#from_disk) [`KnowledgeBase.to_disk`](/api/kb#to_disk) |
| `spacy debug-data` | [`spacy debug data`](/api/cli#debug-data) |
| `spacy profile` | [`spacy debug profile`](/api/cli#debug-profile) |
| `spacy link` `util.set_data_path` `util.get_data_path` | not needed, model symlinks are deprecated |
The following deprecated methods, attributes and arguments were removed in v3.0.
Most of them have been **deprecated for a while** and many would previously
@ -412,12 +413,11 @@ spaCy v3.0 uses a new
serializing a [`DocBin`](/api/docbin), which represents a collection of `Doc`
objects. This means that you can train spaCy models using the same format it
outputs: annotated `Doc` objects. The binary format is extremely **efficient in
storage**, especially when packing multiple documents together.
storage**, especially when packing multiple documents together. You can convert
your existing JSON-formatted data using the [`spacy convert`](/api/cli#convert)
command, which outputs `.spacy` files:
You can convert your existing JSON-formatted data using the
[`spacy convert`](/api/cli#convert) command, which outputs `.spacy` files:
```bash
```cli
$ python -m spacy convert ./training.json ./output
```
@ -429,7 +429,7 @@ The easiest way to get started with a training config is to use the
requirements, and it will auto-generate a starter config with the best-matching
default settings.
```bash
```cli
$ python -m spacy init config ./config.cfg --lang en --pipeline tagger,parser
```

View File

@ -8,7 +8,7 @@ import { window } from 'browser-monads'
import CUSTOM_TYPES from '../../meta/type-annotations.json'
import { isString, htmlToReact } from './util'
import Link from './link'
import Link, { OptionalLink } from './link'
import GitHubCode from './github'
import classes from '../styles/code.module.sass'
@ -89,6 +89,91 @@ export const TypeAnnotation = ({ lang = 'python', link = true, children }) => {
)
}
function replacePrompt(line, prompt, isFirst = false) {
    let result = line
    const hasPrompt = result.startsWith(`${prompt} `)
    const showPrompt = hasPrompt || isFirst
    if (hasPrompt) result = result.slice(2)
    return result && showPrompt ? `<span data-prompt="${prompt}">${result}</span>` : result
}

function parseArgs(raw) {
    const commandGroups = ['init', 'debug', 'project']
    let args = raw.split(' ').filter(arg => arg)
    const result = {}
    while (args.length) {
        let opt = args.shift()
        if (opt.length > 1 && opt.startsWith('-')) {
            const isFlag = !args.length || (args[0].length > 1 && args[0].startsWith('-'))
            result[opt] = isFlag ? true : args.shift()
        } else {
            const key = commandGroups.includes(opt) ? `${opt} ${args.shift()}` : opt
            result[key] = null
        }
    }
    return result
}

function formatCode(html, lang, prompt) {
    if (lang === 'cli') {
        const cliRegex = /^(\$ )?python -m spacy/
        const lines = html
            .trim()
            .split('\n')
            .map((line, i) => {
                if (cliRegex.test(line)) {
                    const text = line.replace(cliRegex, '')
                    const args = parseArgs(text)
                    const cmd = Object.keys(args).map((key, i) => {
                        const value = args[key]
                        return value === null || value === true || i === 0 ? key : `${key} ${value}`
                    })
                    return (
                        <Fragment key={i}>
                            <span data-prompt="$" className={classes.cliArgSubtle}>
                                python -m
                            </span>{' '}
                            <span>spacy</span>{' '}
                            {cmd.map((item, j) => {
                                const isCmd = j === 0
                                const url = isCmd ? `/api/cli#${item.replace(' ', '-')}` : null
                                const isAbstract = isString(item) && /^\[(.+)\]$/.test(item)
                                const itemClassNames = classNames(classes.cliArg, {
                                    [classes.cliArgHighlight]: isCmd,
                                    [classes.cliArgEmphasis]: isAbstract,
                                })
                                const text = isAbstract ? item.slice(1, -1) : item
                                return (
                                    <Fragment key={j}>
                                        {j !== 0 && ' '}
                                        <span className={itemClassNames}>
                                            <OptionalLink hidden hideIcon to={url}>
                                                {text}
                                            </OptionalLink>
                                        </span>
                                    </Fragment>
                                )
                            })}
                        </Fragment>
                    )
                }
                const htmlLine = replacePrompt(highlightCode('bash', line), '$')
                return htmlToReact(htmlLine)
            })
        return lines.map((line, i) => (
            <Fragment key={i}>
                {i !== 0 && <br />}
                {line}
            </Fragment>
        ))
    }
    const result = html
        .split('\n')
        .map((line, i) => (prompt ? replacePrompt(line, prompt, i === 0) : line))
        .join('\n')
    return htmlToReact(result)
}
export class Code extends React.Component {
state = { Juniper: null }
@ -136,7 +221,8 @@ export class Code extends React.Component {
children,
} = this.props
const codeClassNames = classNames(classes.code, className, `language-${lang}`, {
[classes.wrap]: !!highlight || !!wrap,
[classes.wrap]: !!highlight || !!wrap || lang === 'cli',
[classes.cli]: lang === 'cli',
})
const ghClassNames = classNames(codeClassNames, classes.maxHeight)
const { Juniper } = this.state
@ -154,14 +240,14 @@ export class Code extends React.Component {
const codeText = Array.isArray(children) ? children.join('') : children || ''
const highlightRange = highlight ? rangeParser.parse(highlight).filter(n => n > 0) : []
const html = lang === 'none' ? codeText : highlightCode(lang, codeText, highlightRange)
const rawHtml = ['none', 'cli'].includes(lang)
? codeText
: highlightCode(lang, codeText, highlightRange)
const html = formatCode(rawHtml, lang, prompt)
return (
<>
{title && <h4 className={classes.title}>{title}</h4>}
<code className={codeClassNames} data-prompt={prompt}>
{htmlToReact(html)}
</code>
<code className={codeClassNames}>{html}</code>
</>
)
}
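
Reviewer note: a minimal standalone sketch of how the `parseArgs` helper added in this file tokenizes a `cli` line. The function body is copied from the diff; the sample command line and the `line`, `parsed`, `cmd` and `url` names are illustrative assumptions. The two mapping steps mirror what `formatCode` does before rendering, with the React parts left out.

```javascript
// Copied from the diff so the sketch runs on its own (e.g. with Node).
function parseArgs(raw) {
    const commandGroups = ['init', 'debug', 'project']
    const args = raw.split(' ').filter(arg => arg)
    const result = {}
    while (args.length) {
        const opt = args.shift()
        if (opt.length > 1 && opt.startsWith('-')) {
            // An option is a flag if nothing follows it or the next token is another option
            const isFlag = !args.length || (args[0].length > 1 && args[0].startsWith('-'))
            result[opt] = isFlag ? true : args.shift()
        } else {
            // Command groups such as "init" absorb the following subcommand
            const key = commandGroups.includes(opt) ? `${opt} ${args.shift()}` : opt
            result[key] = null
        }
    }
    return result
}

// Hypothetical input line, chosen only for illustration
const cliRegex = /^(\$ )?python -m spacy/
const line = '$ python -m spacy init config config.cfg --lang en --pipeline ner,textcat --cpu'

const parsed = parseArgs(line.replace(cliRegex, ''))
console.log(parsed)
// { 'init config': null, 'config.cfg': null, '--lang': 'en',
//   '--pipeline': 'ner,textcat', '--cpu': true }

// Same mapping as in formatCode: options are rejoined with their values
const cmd = Object.keys(parsed).map((key, i) => {
    const value = parsed[key]
    return value === null || value === true || i === 0 ? key : `${key} ${value}`
})
console.log(cmd)
// [ 'init config', 'config.cfg', '--lang en', '--pipeline ner,textcat', '--cpu' ]

// The first item is treated as the command and linked to its API anchor
const url = `/api/cli#${cmd[0].replace(' ', '-')}`
console.log(url) // /api/cli#init-config
```

Bracketed placeholders such as `[--lang]` start with `[` rather than `-`, so this parser stores them as standalone entries with a `null` value. That appears to be why each placeholder in a usage line ends up as its own item.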

View File

@ -117,7 +117,7 @@ const Quickstart = ({
{help && (
<span data-tooltip={help} className={classes.help}>
{' '}
<Icon name="help" width={16} spaced />
<Icon name="help" width={16} />
</span>
)}
</div>
@ -201,7 +201,7 @@ const Quickstart = ({
className={classes.help}
>
{' '}
<Icon name="help" width={16} spaced />
<Icon name="help" width={16} />
</span>
)}
</label>

Binary file not shown.

Binary file not shown.

View File

@ -28,7 +28,7 @@ $border-radius: 6px
margin-top: 0 !important
code
padding: 0
padding: 0 !important
margin: 0
h4

View File

@ -27,7 +27,7 @@
padding: 1.75em 1.5em
.code
&[data-prompt]:before,
&[data-prompt]:before, span[data-prompt]:before
content: attr(data-prompt)
margin-right: 0.65em
display: inline-block
@ -163,3 +163,31 @@
font-weight: normal
padding-top: 0.1rem
color: var(--color-subtle-dark)
.cli
padding-top: calc(var(--spacing-sm) - 6px)
padding-bottom: calc(var(--spacing-sm) - 12px)
[data-prompt]:before
color: var(--color-subtle)
.cli-arg
border: 1px solid var(--color-dark)
padding: 1px 6px
margin-bottom: 5px
border-radius: 0.5em
display: inline-block
a
color: inherit !important
.cli-arg-highlight
background: var(--color-theme)
border-color: var(--color-theme)
color: var(--color-back) !important
.cli-arg-subtle
color: var(--syntax-comment)
.cli-arg-emphasis
font-style: italic

View File

@ -157,6 +157,14 @@
font-display: fallback
src: url("../fonts/jetbrainsmono-regular.woff") format("woff"), url("../fonts/jetbrainsmono-regular.woff2") format("woff2")
@font-face
font-family: "JetBrains Mono"
font-style: italic
font-weight: 500
font-display: fallback
src: url("../fonts/jetbrainsmono-italic.woff") format("woff"), url("../fonts/jetbrainsmono-italic.woff2") format("woff2")
/* Reset */
*, *:before, *:after
@ -366,6 +374,12 @@ body [id]:target
&.operator
color: var(--syntax-comment)
[class*="language-bash"] .token
&.function
color: var(--color-subtle)
&.operator, &.variable
color: var(--syntax-comment)
// Settings for ini syntax (config files)
[class*="language-ini"]