spaCy/website/docs/usage/_benchmarks-models.md
2020-10-15 11:16:06 +02:00


import { Help } from 'components/typography'; import Link from 'components/link'

| Pipeline | Parser | Tagger | NER | WPS CPU <Help>words per second on CPU, higher is better</Help> | WPS GPU <Help>words per second on GPU, higher is better</Help> |
| --- | --- | --- | --- | --- | --- |
| `en_core_web_trf` (spaCy v3) | 95.5 | 98.3 | 89.7 | 1k | 8k |
| `en_core_web_lg` (spaCy v3) | 92.2 | 97.4 | 85.4 | 7k | |
| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | 85.7 | 10k | |

Full pipeline accuracy and speed on the OntoNotes 5.0 corpus (reported on the development set).
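The "words per second" figures above come from running the full trained pipelines over the corpus. As a rough illustration of how such a throughput number can be measured, here is a minimal sketch using spaCy's public API. It uses `spacy.blank("en")` (tokenizer only, no model download) purely so the snippet is self-contained; the benchmarks above time full trained pipelines such as `en_core_web_lg`, so absolute numbers will differ greatly.

```python
import time
import spacy

# Blank English pipeline: tokenizer only, no trained components.
# (The published benchmarks use full pipelines like en_core_web_lg.)
nlp = spacy.blank("en")

# A toy corpus; each text tokenizes to 10 tokens (9 words + period).
texts = ["The quick brown fox jumps over the lazy dog."] * 1000

start = time.perf_counter()
# nlp.pipe streams texts through the pipeline in batches.
n_words = sum(len(doc) for doc in nlp.pipe(texts))
elapsed = time.perf_counter() - start

wps = n_words / elapsed
print(f"{n_words} tokens in {elapsed:.3f}s = {wps:,.0f} words per second")
```

Timing `nlp.pipe` over a batch, rather than calling `nlp(text)` per document, matches how throughput is reported, since batching is how spaCy pipelines are run efficiently in practice.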

**Named Entity Recognition**

| System | OntoNotes | CoNLL '03 |
| --- | --- | --- |
| spaCy RoBERTa (2020) | 89.7 | 91.6 |
| spaCy CNN (2020) | 84.5 | 87.4 |
| Stanza (StanfordNLP)<sup>1</sup> | 88.8 | 92.1 |
| Flair<sup>2</sup> | 89.7 | 93.1 |
| BERT Base<sup>3</sup> | - | 92.4 |

Named entity recognition accuracy on the OntoNotes 5.0 and CoNLL-2003 corpora. See NLP-progress for more results. Project template: `benchmarks/ner_conll03`.

**1.** Qi et al. (2020). **2.** Akbik et al. (2018). **3.** Devlin et al. (2018).