spaCy/spacy
Adriane Boyd fcce3600ed
Forbid OP matching 2+ tokens in DependencyMatcher (#6824)
Instead of silently using only the first token in each matched span:

* Forbid `OP: ?/*/+` through `DependencyMatcher` validation
* As a fail-safe, warn if a token pattern produces a match that is not
exactly one token long.
2021-01-29 08:52:01 +08:00
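
The two checks described in the commit message above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the commit: the helper names `validate_dependency_pattern` and `is_single_token_match` and the message texts are hypothetical, and the pattern layout (a list of dicts with `RIGHT_ID` and `RIGHT_ATTRS` keys) is assumed from the `DependencyMatcher` pattern format.

```python
import warnings

# Operators that can make a token pattern match zero or 2+ tokens.
FORBIDDEN_OPS = {"?", "*", "+"}


def validate_dependency_pattern(pattern):
    """Hypothetical validator: reject nodes whose token pattern uses OP ?/*/+."""
    for node in pattern:
        op = node.get("RIGHT_ATTRS", {}).get("OP")
        if op in FORBIDDEN_OPS:
            raise ValueError(
                f"OP {op!r} is not allowed in node {node.get('RIGHT_ID')!r}: "
                "each token pattern must match exactly one token."
            )


def is_single_token_match(start, end):
    """Hypothetical fail-safe: warn when a match [start, end) is not one token long."""
    if end - start != 1:
        warnings.warn(
            f"Token pattern matched {end - start} tokens; expected exactly one."
        )
        return False
    return True


# A pattern with plain attributes passes validation.
ok_pattern = [
    {"RIGHT_ID": "verb", "RIGHT_ATTRS": {"POS": "VERB"}},
    {"LEFT_ID": "verb", "REL_OP": ">", "RIGHT_ID": "subj",
     "RIGHT_ATTRS": {"DEP": "nsubj"}},
]
validate_dependency_pattern(ok_pattern)

# A pattern using a quantifier is rejected up front.
bad_pattern = [{"RIGHT_ID": "verb", "RIGHT_ATTRS": {"POS": "VERB", "OP": "+"}}]
try:
    validate_dependency_pattern(bad_pattern)
except ValueError as err:
    print("rejected:", err)
```

Rejecting the quantifiers at validation time makes the problem visible when the pattern is added, while the warning covers any multi-token match that still slips through at match time.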
cli Update quickstart recommendations 2021-01-28 11:14:49 +11:00
displacy Refactor Docs.is_ flags (#6044) 2020-09-17 00:14:01 +02:00
lang Merge pull request #6828 from explosion/master-tmp 2021-01-27 23:05:46 +11:00
matcher Forbid OP matching 2+ tokens in DependencyMatcher (#6824) 2021-01-29 08:52:01 +08:00
ml Avoid assuming encode.get_dim('nO') is set in tok2vec (#6800) 2021-01-24 14:37:33 +11:00
pipeline Error handling in nlp.pipe (#6817) 2021-01-29 08:51:21 +08:00
tests Forbid OP matching 2+ tokens in DependencyMatcher (#6824) 2021-01-29 08:52:01 +08:00
tokens Fix Span.char_span bug (#6816) 2021-01-26 15:50:37 +08:00
training Improve Example error handling for NER data (#6835) 2021-01-28 13:11:20 +11:00
__init__.pxd
__init__.py require_cpu functionality (#6336) 2020-12-08 14:42:40 +08:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py Set version to v3.0.0rc5 2021-01-26 14:55:41 +11:00
attrs.pxd Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
attrs.pyx Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
compat.py Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
default_config_pretraining.cfg pretrain architectures (#6451) 2020-12-08 14:41:03 +08:00
default_config.cfg Add initialize.before_init and after_init callbacks 2021-01-12 13:07:44 +01:00
errors.py Forbid OP matching 2+ tokens in DependencyMatcher (#6824) 2021-01-29 08:52:01 +08:00
glossary.py unicode -> str consistency 2020-05-24 17:20:58 +02:00
kb.pxd Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
kb.pyx avoid empty aliases and improve UX and docs (#6840) 2021-01-29 08:51:40 +08:00
language.py Error handling in nlp.pipe (#6817) 2021-01-29 08:51:21 +08:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
lookups.py Always serialize lookups and vectors to disk 2020-10-05 09:40:20 +02:00
morphology.pxd Add Lemmatizer and simplify related components (#5848) 2020-08-07 15:27:13 +02:00
morphology.pyx Prevent 0-length mem alloc (#6653) 2021-01-06 12:50:17 +11:00
parts_of_speech.pxd
parts_of_speech.pyx
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
schemas.py Add initialize.before_init and after_init callbacks 2021-01-12 13:07:44 +01:00
scorer.py WIP: Various small training changes (#6818) 2021-01-26 14:51:52 +11:00
strings.pxd Remove 'cleanup' of strings (#6007) 2020-09-01 16:12:15 +02:00
strings.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
structs.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
symbols.pxd introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
symbols.pyx introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
tokenizer.pxd Simplify specials and cache checks (#6012) 2020-09-03 09:42:49 +02:00
tokenizer.pyx Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3 2021-01-14 11:49:58 +01:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx
util.py Error handling in nlp.pipe (#6817) 2021-01-29 08:51:21 +08:00
vectors.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
vocab.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
vocab.pyx Fix Doc.copy bugs (#6809) 2021-01-25 21:40:18 +08:00