Mirror of https://github.com/explosion/spaCy.git, synced 2024-11-11 04:08:09 +03:00
Update docs [ci skip]

Commit dc8c9d912f (parent 37c3bb35e2)
@@ -611,28 +611,6 @@ detecting the IPython kernel. Mainly used for the
 | ----------- | ---- | ------------------------------------- |
 | **RETURNS** | bool | `True` if in Jupyter, `False` if not. |
 
-### util.update_exc {#util.update_exc tag="function"}
-
-Update, validate and overwrite
-[tokenizer exceptions](/usage/adding-languages#tokenizer-exceptions). Used to
-combine global exceptions with custom, language-specific exceptions. Will raise
-an error if key doesn't match `ORTH` values.
-
-> #### Example
->
-> ```python
-> BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
-> NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}
-> exceptions = util.update_exc(BASE, NEW)
-> # {"a.": [{ORTH: "a.", NORM: "all"}], ":)": [{ORTH: ":)"}]}
-> ```
-
-| Name              | Type  | Description                                                     |
-| ----------------- | ----- | --------------------------------------------------------------- |
-| `base_exceptions` | dict  | Base tokenizer exceptions.                                      |
-| `*addition_dicts` | dicts | Exception dictionaries to add to the base exceptions, in order. |
-| **RETURNS**       | dict  | Combined tokenizer exceptions.                                  |
-
 ### util.compile_prefix_regex {#util.compile_prefix_regex tag="function"}
 
 Compile a sequence of prefix rules into a regex object.
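For reference, the example in the removed `util.update_exc` block can be run as a self-contained script. This is a minimal sketch, assuming the helper is importable as `spacy.util.update_exc` and that the `ORTH`/`NORM` attribute IDs come from `spacy.symbols`:

```python
# Sketch of the removed util.update_exc example as a runnable script.
# Assumes spacy.util.update_exc and spacy.symbols.{ORTH, NORM} are available.
from spacy.symbols import NORM, ORTH
from spacy.util import update_exc

BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}

# Later dicts override earlier entries with the same key; update_exc raises
# an error if an entry's ORTH values don't concatenate to its key.
exceptions = update_exc(BASE, NEW)
print(exceptions)
# Equivalent to {"a.": [{ORTH: "a.", NORM: "all"}], ":)": [{ORTH: ":)"}]},
# except that ORTH and NORM print as their integer attribute IDs.
```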
@@ -29,8 +29,7 @@ import QuickstartInstall from 'widgets/quickstart-install.js'
 
 ### pip {#pip}
 
-Using pip, spaCy releases are available as source packages and binary wheels (as
-of v2.0.13).
+Using pip, spaCy releases are available as source packages and binary wheels.
 
 ```bash
 $ pip install -U spacy
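Not part of the diff itself, but a quick sanity check after the install above is to import the package and print its version; this sketch assumes the `pip install` succeeded in the active environment:

```python
# Post-install sanity check (illustrative, not from the docs being edited):
# import spaCy and print the installed version.
import spacy

print(spacy.__version__)
```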
@@ -50,8 +49,8 @@ $ pip install -U spacy
 
 <Infobox variant="warning">
 
-To install additional data tables for lemmatization in **spaCy v2.2+** you can
-run `pip install spacy[lookups]` or install
+To install additional data tables for lemmatization you can run
+`pip install spacy[lookups]` or install
 [`spacy-lookups-data`](https://github.com/explosion/spacy-lookups-data)
 separately. The lookups package is needed to create blank models with
 lemmatization data, and to lemmatize in languages that don't yet come with
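To illustrate the note above: with the lookup tables installed, a blank pipeline can lemmatize via table lookup. This is a sketch under the assumption of a spaCy v2.x environment with `spacy[lookups]` installed, not text from the docs:

```python
# Sketch only: assumes spaCy v2.x with the optional lookup tables installed,
# e.g. via `pip install spacy[lookups]`.
import spacy

nlp = spacy.blank("en")  # blank English pipeline, no trained components
doc = nlp("The cats were running")
# With spacy-lookups-data available, lemmas come from the lookup tables.
print([(token.text, token.lemma_) for token in doc])
```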
@@ -1353,6 +1353,8 @@ print("After:", [(token.text, token._.is_musician) for token in doc])
 
 ## Sentence Segmentation {#sbd}
 
+<!-- TODO: include senter -->
+
 A [`Doc`](/api/doc) object's sentences are available via the `Doc.sents`
 property. Unlike other libraries, spaCy uses the dependency parse to determine
 sentence boundaries. This is usually more accurate than a rule-based approach,
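The context lines above describe parser-based sentence boundaries; a minimal sketch of iterating over them through `Doc.sents`, assuming the `en_core_web_sm` pipeline has been downloaded:

```python
# Minimal sketch: assumes en_core_web_sm is installed
# (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence. This is another sentence.")
# The dependency parser sets sentence boundaries, exposed as Span
# objects via the Doc.sents generator.
for sent in doc.sents:
    print(sent.text)
```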