spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-14 18:22:27 +03:00

Matthew Honnibal 78d79d94ce Guess set_annotations=True in nlp.update		2020-05-22 15:55:45 +02:00

During `nlp.update`, components can be passed a boolean `set_annotations` to indicate whether they should assign annotations to the `Doc`. This needs to be set if downstream components expect to use the annotations during training, e.g. if we wanted to use tagger features in the parser.

Components can specify their assignments and requirements, so we can figure out which components have these inter-dependencies. After figuring this out, we can guess whether to pass `set_annotations=True`.

We could also pass `set_annotations=True` always, or even just have this as the only behaviour. The downside of this is that it would require the `Doc` objects to be created afresh to avoid problematic modifications.

One approach would be to make a fresh copy of the `Doc` objects within `nlp.update()`, so that we can write to the objects without any problems. If we do that, we can drop this logic and also drop the `set_annotations` mechanism. I would be fine with that approach, although it runs the risk of introducing some performance overhead, and we'll have to take care to copy all extension attributes etc.
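The guessing step described in the commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this commit: the `Component` class, the `guess_set_annotations` function, and the attribute names such as `token.tag` are hypothetical stand-ins, and the sketch assumes each component simply lists what it assigns and what it requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """Hypothetical stand-in for a pipeline component (not spaCy's own class)."""
    name: str
    assigns: List[str] = field(default_factory=list)
    requires: List[str] = field(default_factory=list)


def guess_set_annotations(pipeline: List[Component]) -> Dict[str, bool]:
    """Map each component name to True if any later component requires
    at least one attribute that this component assigns."""
    guesses = {}
    for i, component in enumerate(pipeline):
        # Collect everything the downstream components say they need.
        needed_downstream = set()
        for later in pipeline[i + 1:]:
            needed_downstream.update(later.requires)
        guesses[component.name] = bool(needed_downstream & set(component.assigns))
    return guesses


# Assumed example pipeline: the parser uses tagger features during training.
pipeline = [
    Component("tagger", assigns=["token.tag"]),
    Component("parser", assigns=["token.dep", "token.head"], requires=["token.tag"]),
    Component("ner", assigns=["doc.ents"]),
]
guesses = guess_set_annotations(pipeline)
print(guesses)  # {'tagger': True, 'parser': False, 'ner': False}
```

In this sketch only the tagger would be asked to write its annotations, because the parser is the only component that declares a requirement and nothing downstream reads the parser's or the entity recognizer's output.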
cli	Tweak memory management in train_from_config	2020-05-21 19:32:04 +02:00
displacy	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
lang	Remove "pala" tokenizer exception for Spanish (#5265)	2020-04-09 10:21:20 +02:00
matcher	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
ml	Fix shape inference	2020-05-21 20:46:10 +02:00
pipeline	Fix begin_training	2020-05-21 20:46:21 +02:00
syntax	Fix begin_training	2020-05-21 20:46:21 +02:00
tests	Guess set_annotations=True in nlp.update	2020-05-22 15:55:45 +02:00
tokens	Update morphologizer (#5108)	2020-04-02 14:46:32 +02:00
__init__.pxd	* Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags.	2014-10-24 02:23:42 +11:00
__init__.py	Simplify warnings	2020-02-28 12:20:23 +01:00
__main__.py	Update spaCy for thinc 8.0.0 (#4920)	2020-01-29 17:06:46 +01:00
_ml.py	Take care of global vectors in multiprocessing (#5081)	2020-03-03 13:58:22 +01:00
about.py	Set version to v3.0.0.dev9	2020-05-21 20:47:52 +02:00
analysis.py	Simplify warnings	2020-02-28 12:20:23 +01:00
attrs.pxd	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
attrs.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
compat.py	Merge branch 'develop' into refactor/remove-symlinks	2020-02-18 17:22:20 +01:00
errors.py	Various fixes to NEL functionality, Example class etc (#5460)	2020-05-20 11:41:12 +02:00
glossary.py	Tidy up and auto-format	2020-02-18 15:38:18 +01:00
gold.pxd	Fix accidentally quadratic runtime in Example.split_sents (#5464)	2020-05-20 18:48:18 +02:00
gold.pyx	Fix accidentally quadratic runtime in Example.split_sents (#5464)	2020-05-20 18:48:18 +02:00
kb.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
kb.pyx	Merge branch 'develop' into refactor/simplify-warnings	2020-03-04 16:38:55 +01:00
language.py	Guess set_annotations=True in nlp.update	2020-05-22 15:55:45 +02:00
lemmatizer.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
lexeme.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
lexeme.pyx	Simplify warnings	2020-02-28 12:20:23 +01:00
lookups.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
morphology.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
morphology.pyx	Fix small errors	2020-03-26 13:47:31 +01:00
parts_of_speech.pxd	Add support for Universal Dependencies v2.0	2017-03-03 13:17:34 +01:00
parts_of_speech.pyx	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
schemas.py	Add sent_start to pattern schema	2020-03-26 14:05:40 +01:00
scorer.py	Update morphologizer (#5108)	2020-04-02 14:46:32 +02:00
strings.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
strings.pyx	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
structs.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
symbols.pxd	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
symbols.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
tokenizer.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
tokenizer.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
typedefs.pxd	Update spaCy for thinc 8.0.0 (#4920)	2020-01-29 17:06:46 +01:00
typedefs.pyx	Tidy up rest	2017-10-27 21:07:59 +02:00
util.py	Merge from develop	2020-05-20 12:27:31 +02:00
vectors.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
vocab.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
vocab.pyx	Tidy up and auto-format	2020-02-18 15:38:18 +01:00