spaCy/website/docs/usage/_benchmarks-models.md
2020-10-15 11:16:06 +02:00


import { Help } from 'components/typography'; import Link from 'components/link'

| Pipeline | Parser | Tagger | NER | WPS CPU <Help>words per second on CPU, higher is better</Help> | WPS GPU <Help>words per second on GPU, higher is better</Help> |
| --- | --- | --- | --- | --- | --- |
| `en_core_web_trf` (spaCy v3) | 95.5 | 98.3 | 89.7 | 1k | 8k |
| `en_core_web_lg` (spaCy v3) | 92.2 | 97.4 | 85.4 | 7k | |
| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | 85.7 | 10k | |

Full pipeline accuracy and speed on the OntoNotes 5.0 corpus (reported on the development set).
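The "words per second" figures above come from running the full trained pipelines over the corpus. As a rough illustration of how such a throughput number can be measured, here is a minimal sketch using spaCy's public API. It uses `spacy.blank("en")` (tokenizer only, no model download) purely so the snippet is self-contained; the benchmarks above time full trained pipelines such as `en_core_web_lg`, so absolute numbers will differ greatly.

```python
import time
import spacy

# Blank English pipeline: tokenizer only, no trained components.
# (The published benchmarks use full pipelines like en_core_web_lg.)
nlp = spacy.blank("en")

# A toy corpus; each text tokenizes to 10 tokens (9 words + period).
texts = ["The quick brown fox jumps over the lazy dog."] * 1000

start = time.perf_counter()
# nlp.pipe streams texts through the pipeline in batches.
n_words = sum(len(doc) for doc in nlp.pipe(texts))
elapsed = time.perf_counter() - start

wps = n_words / elapsed
print(f"{n_words} tokens in {elapsed:.3f}s = {wps:,.0f} words per second")
```

Timing `nlp.pipe` over a batch, rather than calling `nlp(text)` per document, matches how throughput is reported, since batching is how spaCy pipelines are run efficiently in practice.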

**Named Entity Recognition**

| System | OntoNotes | CoNLL '03 |
| --- | --- | --- |
| spaCy RoBERTa (2020) | 89.7 | 91.6 |
| spaCy CNN (2020) | 84.5 | 87.4 |
| Stanza (StanfordNLP)<sup>1</sup> | 88.8 | 92.1 |
| Flair<sup>2</sup> | 89.7 | 93.1 |
| BERT Base<sup>3</sup> | - | 92.4 |

Named entity recognition accuracy on the OntoNotes 5.0 and CoNLL-2003 corpora. See NLP-progress for more results. Project template: `benchmarks/ner_conll03`.

**1.** Qi et al. (2020). **2.** Akbik et al. (2018). **3.** Devlin et al. (2018).