spaCy/spacy
Paul O'Leary McCann 8f66176b2d Fix loss?
This rewrites the loss to not use the Thinc crossentropy code at all.
The main difference here is that the negative predictions are being
masked out (= marginalized over), but negative gradient is still being
reflected.

I'm still not sure this is exactly right but models seem to train
reliably now.
2021-07-05 18:17:10 +09:00
..
cli Merge remote-tracking branch 'upstream/master' into feature/coref 2021-05-27 13:50:32 +02:00
displacy Also exclude user hooks in displacy conversion (#7419) 2021-03-12 09:41:59 +01:00
lang Merge remote-tracking branch 'upstream/develop' into feature/coref 2021-05-18 17:00:17 +09:00
matcher Fix span offsets for Matcher(as_spans) on spans (#7992) 2021-05-06 18:42:44 +10:00
ml Fix loss? 2021-07-05 18:17:10 +09:00
pipeline Fix loss? 2021-07-05 18:17:10 +09:00
tests Add test for crossing spans 2021-06-28 18:21:00 +09:00
tokens Merge remote-tracking branch 'upstream/master' into feature/coref 2021-05-27 13:50:32 +02:00
training Merge remote-tracking branch 'upstream/master' into feature/coref 2021-05-27 13:50:32 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Add vocab kwarg back to spacy.load 2021-03-11 10:58:59 +01:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py Set version to v3.0.6 (#7854) 2021-04-22 16:33:26 +02:00
attrs.pxd Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
attrs.pyx Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
compat.py Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
coref_scorer.py Add new coref scoring 2021-05-21 15:56:40 +09:00
default_config_pretraining.cfg pretrain architectures (#6451) 2020-12-08 14:41:03 +08:00
default_config.cfg Add training option to set annotations on update (#7767) 2021-04-26 16:53:53 +02:00
errors.py Custom warning if the doc_bin is too large (#8069) 2021-05-17 15:48:40 +02:00
glossary.py Add Chinese PTB tags to glossary (#7993) 2021-05-06 18:43:03 +10:00
kb.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
kb.pyx KB & NEL to/from bytes (#8113) 2021-05-20 18:11:30 +10:00
language.py Merge remote-tracking branch 'upstream/master' into feature/coref 2021-05-27 13:50:32 +02:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyx reduce memory load when reading all vectors from file (#6945) 2021-02-07 08:05:43 +08:00
lookups.py Update load_lookups return type and docstring (#7907) 2021-04-27 09:13:39 +02:00
morphology.pxd Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
morphology.pyx Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
py.typed Add py.typed 2021-03-16 09:48:31 +01:00
schemas.py Add training option to set annotations on update (#7767) 2021-04-26 16:53:53 +02:00
scorer.py Merge branch 'master' into feature/coref 2021-05-15 20:05:17 +09:00
strings.pxd Remove 'cleanup' of strings (#6007) 2020-09-01 16:12:15 +02:00
strings.pyx Make vocab update in get_docs deterministic (#7603) 2021-04-09 11:53:13 +02:00
structs.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
symbols.pxd introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
symbols.pyx introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
tokenizer.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
tokenizer.pyx Fix tokenizer cache flushing (#7836) 2021-04-22 18:14:57 +10:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Merge remote-tracking branch 'upstream/develop' into feature/coref 2021-05-18 17:00:17 +09:00
vectors.pyx Fix vectors data on GPU (#7626) 2021-04-19 18:30:03 +10:00
vocab.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
vocab.pyx Skip vector ngram backoff if minn is not set (#7925) 2021-05-06 18:34:35 +10:00