spaCy/spacy/pipeline
Daniël de Kok 72f7f4e68a
morphologizer: avoid recreating label tuple for each token (#9764)
* morphologizer: avoid recreating label tuple for each token

The `labels` property converts the dictionary key set to a tuple. This
property was used for every annotated token, recreating the tuple over
and over again.

Construct the tuple once in the `set_annotations` function and reuse it.

On a Finnish pipeline that I was experimenting with, this results in a
speedup of ~15% (~13000 -> ~15000 WPS).
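
The pattern described above can be sketched in plain Python. This is a minimal illustration, not the actual spaCy code: `ToyTagger`, `annotate_per_token`, `annotate_hoisted`, and the `tuple_builds` counter are hypothetical names introduced only to show how often the tuple is rebuilt in each variant.

```python
class ToyTagger:
    """Hypothetical stand-in for a pipeline component (not spaCy's class)."""

    def __init__(self, labels):
        # Labels are stored as dictionary keys, as the commit message describes.
        self._labels = {label: i for i, label in enumerate(labels)}
        # Illustration only: counts how many times the tuple is built.
        self.tuple_builds = 0

    @property
    def labels(self):
        # Every access converts the key set to a new tuple.
        self.tuple_builds += 1
        return tuple(self._labels.keys())

    def annotate_per_token(self, tag_ids):
        # Before: the property is evaluated once per token.
        return [self.labels[tag_id] for tag_id in tag_ids]

    def annotate_hoisted(self, tag_ids):
        # After: build the tuple once and reuse it for all tokens.
        labels = self.labels
        return [labels[tag_id] for tag_id in tag_ids]


tag_ids = [0, 2, 1, 2, 0]
before = ToyTagger(["NOUN", "VERB", "ADJ"])
after = ToyTagger(["NOUN", "VERB", "ADJ"])
before_result = before.annotate_per_token(tag_ids)
after_result = after.annotate_hoisted(tag_ids)
```

Both variants produce the same annotations. The difference is that the first builds one tuple per token, while the second builds a single tuple per call.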

* tagger: avoid recreating label tuple for each token
2021-11-30 11:58:59 +01:00
_parser_internals Use reference parse to initialize parser moves (#9722) 2021-11-23 14:55:55 +01:00
__init__.py Add SpanCategorizer component (#6747) 2021-06-24 12:35:27 +02:00
attributeruler.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
dep_parser.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
entity_linker.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
entityruler.py EntityRuler improve disk load error message (#9658) 2021-11-23 16:26:05 +01:00
functions.py Add doc_cleaner component (#9659) 2021-11-23 15:33:33 +01:00
lemmatizer.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
morphologizer.pyx morphologizer: avoid recreating label tuple for each token (#9764) 2021-11-30 11:58:59 +01:00
multitask.pyx Replace negative rows with 0 in StaticVectors (#7674) 2021-04-22 18:04:15 +10:00
ner.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
pipe.pxd TrainablePipe (#6213) 2020-10-08 21:33:49 +02:00
pipe.pyi Auto-format code with black (#9474) 2021-10-15 11:36:49 +02:00
pipe.pyx Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
sentencizer.pyx Add overwrite settings for more components (#9050) 2021-09-30 15:35:55 +02:00
senter.pyx Add overwrite settings for more components (#9050) 2021-09-30 15:35:55 +02:00
spancat.py Fix spancat for empty docs and zero suggestions (#9654) 2021-11-15 12:40:55 +01:00
tagger.pyx morphologizer: avoid recreating label tuple for each token (#9764) 2021-11-30 11:58:59 +01:00
textcat_multilabel.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
textcat.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
tok2vec.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
trainable_pipe.pxd Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
trainable_pipe.pyx Pass excludes when serializing vocab (#8824) 2021-08-03 14:42:44 +02:00
transition_parser.pxd TrainablePipe (#6213) 2020-10-08 21:33:49 +02:00
transition_parser.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00