mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-26 01:46:28 +03:00
Update docs [ci skip]
parent 4fa869e6f7
commit 7f05ccc170
@@ -6,7 +6,7 @@ import { Help } from 'components/typography'; import Link from 'components/link'
| Pipeline | Parser | Tagger | NER | WPS<br />CPU <Help>words per second on CPU, higher is better</Help> | WPS<br/>GPU <Help>words per second on GPU, higher is better</Help> |
| ---------------------------------------------------------- | -----: | -----: | ---: | ------------------------------------------------------------------: | -----------------------------------------------------------------: |
| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) | 95.5 | 98.3 | 89.7 | 1k | 8k |
| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) | 95.5 | 98.3 | 89.4 | 1k | 8k |
| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3) | 92.2 | 97.4 | 85.4 | 7k | |
| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | 85.7 | 10k | |

@@ -77,6 +77,26 @@ import Benchmarks from 'usage/\_benchmarks-models.md'
<Benchmarks />

#### New trained transformer-based pipelines {#features-transformers-pipelines}

> #### Notes on model capabilities
>
> The models are each trained with a **single transformer** shared across the
> pipeline, which requires it to be trained on a single corpus. For
> [English](/models/en) and [Chinese](/models/zh), we used the OntoNotes 5
> corpus, which has annotations across several tasks. For [French](/models/fr),
> [Spanish](/models/es) and [German](/models/de), we didn't have a suitable
> corpus that had both syntactic and entity annotations, so the transformer
> models for those languages do not include NER.

| Package | Language | Transformer | Tagger | Parser | NER |
| ------------------------------------------------ | -------- | --------------------------------------------------------------------------------------------- | -----: | -----: | ---: |
| [`en_core_web_trf`](/models/en#en_core_web_trf) | English | [`roberta-base`](https://huggingface.co/roberta-base) | 97.8 | 95.0 | 89.4 |
| [`de_dep_news_trf`](/models/de#de_dep_news_trf) | German | [`bert-base-german-cased`](https://huggingface.co/bert-base-german-cased) | 99.0 | 95.8 | - |
| [`es_dep_news_trf`](/models/es#es_dep_news_trf) | Spanish | [`bert-base-spanish-wwm-cased`](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) | 98.2 | 94.6 | - |
| [`fr_dep_news_trf`](/models/fr#fr_dep_news_trf) | French | [`camembert-base`](https://huggingface.co/camembert-base) | 95.7 | 94.9 | - |
| [`zh_core_web_trf`](/models/zh#zh_core_web_trf) | Chinese | [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 92.5 | 77.2 | 75.6 |
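
As the note and table above indicate, only some of the transformer pipelines ship with NER. The following is a minimal sketch of how one might check this before relying on entity output. The dictionary `TRF_PIPELINE_NER` and the helpers `packages_with_ner` and `has_ner` are hypothetical names introduced for illustration and are not part of spaCy. The scores are copied from the table, and `has_ner` assumes the NER component is registered under the name `"ner"`.

```python
# NER scores copied from the table above; None marks pipelines that,
# per the table, do not include an NER component.
TRF_PIPELINE_NER = {
    "en_core_web_trf": 89.4,
    "de_dep_news_trf": None,
    "es_dep_news_trf": None,
    "fr_dep_news_trf": None,
    "zh_core_web_trf": 75.6,
}


def packages_with_ner(table):
    """Return the package names whose table row lists an NER score."""
    return sorted(name for name, score in table.items() if score is not None)


def has_ner(nlp):
    """Check a loaded pipeline for a component named "ner".

    Assumes `nlp` exposes `pipe_names`, as spaCy's `Language` object does.
    """
    return "ner" in nlp.pipe_names


print(packages_with_ner(TRF_PIPELINE_NER))
```

With spaCy and one of the packages installed, `has_ner(spacy.load("de_dep_news_trf"))` would be expected to return `False`, in line with the note above, while the English and Chinese pipelines would be expected to return `True`.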

<Infobox title="Details & Documentation" emoji="📖" list>

- **Usage:** [Embeddings & Transformers](/usage/embeddings-transformers),
@@ -88,11 +108,6 @@ import Benchmarks from 'usage/\_benchmarks-models.md'
- **Architectures:** [TransformerModel](/api/architectures#TransformerModel),
  [TransformerListener](/api/architectures#TransformerListener),
  [Tok2VecTransformer](/api/architectures#Tok2VecTransformer)
- **Trained Pipelines:** [`en_core_web_trf`](/models/en#en_core_web_trf),
  [`de_dep_news_trf`](/models/de#de_dep_news_trf),
  [`es_dep_news_trf`](/models/es#es_dep_news_trf),
  [`fr_dep_news_trf`](/models/fr#fr_dep_news_trf),
  [`zh_core_web_trf`](/models/zh#zh_core_web_trf)
- **Implementation:**
  [`spacy-transformers`](https://github.com/explosion/spacy-transformers)
