mirror of
https://github.com/explosion/spaCy.git
synced 2025-08-04 12:20:20 +03:00
Website updates for v3-5 draft
parent
283067ef35
commit
3a300b0962
@@ -6,13 +6,13 @@ menu:
 - ['Upgrading Notes', 'upgrading']
 ---

-## New features {#features hidden="true"}
+## New features {id="features",hidden="true"}

 spaCy v3.5 introduces three new CLI commands, `apply`, `benchmark` and
 `find-threshold`, provides improvements and extensions to our entity linking
 functionality, XXX

-### New CLI commands {#cli}
+### New CLI commands {id="cli"}

 TODO `apply`

@@ -20,16 +20,16 @@ TODO `benchmark`

 TODO `find-threshold`

-### Fuzzy matching {#fuzzy}
+### Fuzzy matching {id="fuzzy"}

 TODO

-### Entity linking generalization {#el}
+### Entity linking generalization {id="el"}

 The knowledge base used for entity linking is now easier to customize and has a
 new default implementation [`InMemoryLookupKB`](/api/kb_in_memory).

-### Additional features and improvements {#additional-features-and-improvements}
+### Additional features and improvements {id="additional-features-and-improvements"}

 - Language updates:
 - Extended support for Slovenian.
@@ -61,7 +61,7 @@ new default implementation [`InMemoryLookupKB`](/api/kb_in_memory).
 `vectors`.
 - Correctly handle missing annotations in the edit tree lemmatizer.

-### Trained pipeline updates {#pipelines}
+### Trained pipeline updates {id="pipelines"}

 - The CNN pipelines add `IS_SPACE` as a `tok2vec` feature for `tagger` and
 `morphologizer` components to improve tagging of non-whitespace vs. whitespace
@@ -74,15 +74,15 @@ new default implementation [`InMemoryLookupKB`](/api/kb_in_memory).
 in the
 [v1.2.0 release notes](https://github.com/explosion/spacy-transformers/releases/tag/v1.2.0).

-## Notes about upgrading from v3.4 {#upgrading}
+## Notes about upgrading from v3.4 {id="upgrading"}

-### Validation of textcat values {#textcat-validation}
+### Validation of textcat values {id="textcat-validation"}

 An error is now raised when unsupported values are given as input to train a
 `textcat` or `textcat_multilabel` model - ensure that values are `0.0` or `1.0`
 as explained in the [docs](/api/textcategorizer#assigned-attributes).

-### Updated default scores for tokenization and textcat {#scores}
+### Updated default scores for tokenization and textcat {id="scores"}

 We fixed a bug that inflated the `token_acc` scores in v3.0-v3.4. The reported
 `token_acc` will drop from v3.4 to v3.5, but if `token_p/r/f` stay the same,
@@ -97,7 +97,7 @@ For new `textcat` or `textcat_multilabel` configs, the new default `v2` scorers:
 - custom scorers can be used to score multiple `textcat` and
 `textcat_multilabel` components with the built-in `Scorer.score_cats` scorer

-### Pipeline package version compatibility {#version-compat}
+### Pipeline package version compatibility {id="version-compat"}

 > #### Using legacy implementations
 >
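Every changed line in this diff follows the same mechanical pattern: a heading attribute block written as `{#some-id key="value"}` becomes `{id="some-id",key="value"}`. The sketch below illustrates that rewrite on a single line. It is not the script used for this commit (the commit does not say how the change was made); the function name `convert_heading_attrs` and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical pattern for the old syntax: "{#" + id + optional key="value" pairs + "}".
OLD_ATTRS = re.compile(r'\{#(?P<id>[\w-]+)(?P<rest>(?:\s+[\w-]+="[^"]*")*)\s*\}')


def convert_heading_attrs(line: str) -> str:
    """Rewrite old-style heading attributes on one line to the new style."""

    def repl(match: re.Match) -> str:
        # The "#name" shorthand becomes an explicit id="name" attribute.
        attrs = [f'id="{match.group("id")}"']
        # Any remaining key="value" pairs are kept and joined with commas.
        attrs.extend(re.findall(r'[\w-]+="[^"]*"', match.group("rest")))
        return "{" + ",".join(attrs) + "}"

    return OLD_ATTRS.sub(repl, line)


if __name__ == "__main__":
    print(convert_heading_attrs('## New features {#features hidden="true"}'))
    print(convert_heading_attrs("### New CLI commands {#cli}"))
```

Lines without an old-style attribute block pass through unchanged, so the function could be applied to every line of a file.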