spaCy/spacy
Matthew Honnibal f1d77eb140
💫 Improve handling of missing NER tags (closes #2603) (#3341)
* Improve handling of missing NER tags

GoldParse can accept missing NER tags if entities is provided
in BILUO format (rather than as spans). Missing tags are given
as None values.

Fix bug that occurred when first tag was a None value. Closes #2603.

* Document specification of missing NER tags.
2019-02-27 12:06:32 +01:00
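
To illustrate the commit message above, here is a minimal sketch of a BILUO tag sequence in which some tags are missing (None), including the first one, which is the case the bug fix addresses. It is plain Python and does not call spaCy. The helper name `biluo_to_spans` and its return shape are hypothetical, chosen only for illustration, and it is not the logic in gold.pyx. Per the commit message, a list like `tags` would be passed as the entities argument of GoldParse.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start_token, end_token_exclusive, label)


def biluo_to_spans(tags: List[Optional[str]]) -> Tuple[List[Span], List[int]]:
    """Hypothetical helper: collect entity spans and indices of missing tags.

    A tag of None means "annotation unknown" for that token, which is
    different from "O" (known to be outside any entity).
    """
    spans: List[Span] = []
    missing: List[int] = []
    start = None  # token index where the currently open entity began
    for i, tag in enumerate(tags):
        if tag is None:
            # Missing annotation: record it and drop any open entity.
            missing.append(i)
            start = None
            continue
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":
            spans.append((i, i + 1, label))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "L" and start is not None:
            spans.append((start, i + 1, label))
            start = None
        # An "I-" tag simply keeps the current entity open.
    return spans, missing


# The first and last tags are missing; the rest are ordinary BILUO tags.
tags = [None, "B-ORG", "L-ORG", "O", "U-GPE", None]
spans, missing = biluo_to_spans(tags)
print(spans)
print(missing)
```

The sketch keeps None separate from "O" on purpose: a missing tag carries no information about the token, so it should not be counted as a negative example. For brevity it does not check that the labels of B-, I- and L- tags agree.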
..
cli 💫 Replace {Doc,Span}.merge with Doc.retokenize (#3280) 2019-02-15 10:29:44 +01:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy Fix escaping of HTML in displacy ENT (closes #2728) 2019-02-21 14:30:39 +01:00
lang Merge branch 'master' into develop 2019-02-25 15:54:55 +01:00
matcher Fix matcher bug #3328 2019-02-27 10:25:39 +01:00
pipeline Clean up TextCategorizer slightly 2019-02-23 12:28:06 +01:00
syntax 💫 Improve handling of missing NER tags (closes #2603) (#3341) 2019-02-27 12:06:32 +01:00
tests 💫 Improve handling of missing NER tags (closes #2603) (#3341) 2019-02-27 12:06:32 +01:00
tokens Make doc[0].is_sent_start == True (closes #2869) (#3340) 2019-02-27 11:17:17 +01:00
__init__.pxd
__init__.py Tidy up and format remaining files 2018-11-30 17:43:08 +01:00
__main__.py 💫 New JSON helpers, training data internals & CLI rewrite (#2932) 2018-11-30 20:16:14 +01:00
_align.pyx Improve alignment around quotes 2018-08-16 01:04:34 +02:00
_ml.py Auto-format 2019-02-24 14:09:15 +01:00
about.py Set version to v2.1.0a9 2019-02-25 21:55:19 +01:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
compat.py 💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003) 2018-12-03 01:28:22 +01:00
errors.py Remove unused temp errors 2019-02-24 22:26:08 +01:00
glossary.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
gold.pxd Add support for sent_start to GoldParse 2017-08-25 20:03:14 -05:00
gold.pyx 💫 Improve handling of missing NER tags (closes #2603) (#3341) 2019-02-27 12:06:32 +01:00
language.py Add batch size argument to Language.evaluate(). Closes #3263 2019-02-25 19:30:33 +01:00
lemmatizer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
lexeme.pxd 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325) 2019-02-24 21:13:51 +01:00
lexeme.pyx 💫 Add .similarity warnings for no vectors and option to exclude warnings (#2197) 2018-05-21 01:22:38 +02:00
morphology.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
morphology.pyx Fix lemmatization 2018-07-05 13:56:02 +02:00
parts_of_speech.pxd
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
scorer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx Add get_string_id helper to spacy.strings 2018-12-10 16:09:26 +01:00
structs.pxd Make NORM a token attribute (#3029) 2018-12-08 10:49:10 +01:00
symbols.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
symbols.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
tokenizer.pxd Disable tokenizer cache for special-cases. Fixes #1250 2017-10-24 16:08:05 +02:00
tokenizer.pyx Load token_match regex with .match, not .search 2019-02-21 09:09:03 +01:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Update docstrings [ci skip] 2019-02-24 18:39:59 +01:00
vectors.pyx Fix KeyError in Vectors.most_similar. Fixes #2648 2018-12-10 16:19:18 +01:00
vocab.pxd 💫 Small efficiency fixes to tokenizer (#2587) 2018-07-24 23:35:54 +02:00
vocab.pyx Prevent exceptions from setting POS but not TAG. Closes #1773 2018-12-30 13:16:05 +01:00