import { Help } from 'components/typography'; import Link from 'components/link'
<!-- TODO: update speed and v2 NER numbers -->
<figure>

| Pipeline | Parser | Tagger | NER | WPS<br />CPU <Help>words per second on CPU, higher is better</Help> | WPS<br />GPU <Help>words per second on GPU, higher is better</Help> |
| ---------------------------------------------------------- | -----: | -----: | ---: | ------------------------------------------------------------------: | -----------------------------------------------------------------: |
| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) | 95.5 | 98.3 | 89.4 | 1k | 8k |
| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3) | 92.2 | 97.4 | 85.4 | 7k | |
| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | 85.7 | 10k | |

<figcaption class="caption">

**Full pipeline accuracy and speed** on the
[OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) corpus (reported on
the development set).

</figcaption>

</figure>
<figure>

| Named Entity Recognition System | OntoNotes | CoNLL '03 |
| ------------------------------------------------------------------------------ | --------: | --------: |
| spaCy RoBERTa (2020) | 89.7 | 91.6 |
| spaCy CNN (2020) | 84.5 | 87.4 |
| [Stanza](https://stanfordnlp.github.io/stanza/) (StanfordNLP)<sup>1</sup> | 88.8 | 92.1 |
| <Link to="https://github.com/flairNLP/flair" hideIcon>Flair</Link><sup>2</sup> | 89.7 | 93.1 |
| BERT Base<sup>3</sup> | - | 92.4 |

<figcaption class="caption">

**Named entity recognition accuracy** on the
[OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) and
[CoNLL-2003](https://www.aclweb.org/anthology/W03-0419.pdf) corpora. See
[NLP-progress](http://nlpprogress.com/english/named_entity_recognition.html) for
more results. Project template:
[`benchmarks/ner_conll03`](%%GITHUB_PROJECTS/benchmarks/ner_conll03). **1.**
[Qi et al. (2020)](https://arxiv.org/pdf/2003.07082.pdf). **2.**
[Akbik et al. (2018)](https://www.aclweb.org/anthology/C18-1139/). **3.**
[Devlin et al. (2018)](https://arxiv.org/abs/1810.04805).

</figcaption>

</figure>