spaCy/spacy/pipeline
adrianeboyd b841d3fe75 Add a tagger-based SentenceRecognizer (#4713)
* Add sent_starts to GoldParse

* Add SentTagger pipeline component

Add `SentTagger` pipeline component as a subclass of `Tagger`.

* Model uses smaller default parameters than `Tagger` so that it stays small and fast
* Hard-coded set of two labels:
  * S (1): token at beginning of sentence
  * I (0): all other sentence positions
* Sets `token.sent_start` values
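
The two-label scheme above can be illustrated without spaCy. This is a minimal sketch in plain Python, not the component's actual implementation: the helper names `labels_to_sent_starts` and `split_sentences` are hypothetical, and only the `S`/`I` labels and their `1`/`0` values come from the list above.

```python
from typing import List

# Label scheme from the commit message: S (1) = sentence start, I (0) = other.
LABELS = {"S": 1, "I": 0}


def labels_to_sent_starts(labels: List[str]) -> List[bool]:
    """Map per-token S/I labels to boolean sentence-start flags (hypothetical helper)."""
    return [LABELS[label] == 1 for label in labels]


def split_sentences(tokens: List[str], sent_starts: List[bool]) -> List[List[str]]:
    """Group tokens into sentences, opening a new one at each start flag (hypothetical helper)."""
    sentences: List[List[str]] = []
    for token, is_start in zip(tokens, sent_starts):
        # The first token always opens a sentence, even if it was not labelled S.
        if is_start or not sentences:
            sentences.append([])
        sentences[-1].append(token)
    return sentences


tokens = ["Hello", "world", ".", "Bye", "."]
labels = ["S", "I", "I", "S", "I"]
starts = labels_to_sent_starts(labels)
sentences = split_sentences(tokens, starts)
```

In the real component the flags would be written to `token.sent_start` rather than returned as a list.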

* Add sentence segmentation to Scorer

Report `sent_p/r/f` for sentence boundaries, which may be provided by
various pipeline components.
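
As a rough illustration of what `sent_p/r/f` measure, here is a sketch that assumes boundaries are compared as sets of sentence-start token indices. The function name `sentence_prf` is hypothetical, and the `Scorer` may define a match differently (for example, over whole sentence spans).

```python
from typing import Set, Tuple


def sentence_prf(pred_starts: Set[int], gold_starts: Set[int]) -> Tuple[float, float, float]:
    """Precision, recall and F-score over sentence-start indices (hypothetical helper)."""
    true_pos = len(pred_starts & gold_starts)
    # Guard against empty sets so that an empty prediction or gold set scores 0.0.
    precision = true_pos / len(pred_starts) if pred_starts else 0.0
    recall = true_pos / len(gold_starts) if gold_starts else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Predicted starts at tokens 0, 3 and 5; gold starts at tokens 0 and 3.
p, r, f = sentence_prf({0, 3, 5}, {0, 3})
```

One spurious boundary lowers precision but leaves recall untouched, which is why the two are reported separately alongside the F-score.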

* Add sentence segmentation to CLI evaluate

* Add senttagger metrics/scoring to train CLI

* Rename SentTagger to SentenceRecognizer

* Add SentenceRecognizer to spacy.pipes imports

* Add SentenceRecognizer serialization test

* Shorten component name to sentrec

* Remove duplicates from train CLI output metrics
2019-11-28 11:10:07 +01:00
__init__.py Add a tagger-based SentenceRecognizer (#4713) 2019-11-28 11:10:07 +01:00
entityruler.py Component decorator and component analysis (#4517) 2019-10-27 13:35:49 +01:00
functions.py Filter subtoken matches in merge_subtokens() (#4539) 2019-10-28 15:40:28 +01:00
hooks.py Component decorator and component analysis (#4517) 2019-10-27 13:35:49 +01:00
morphologizer.pyx Example class for training data (#4543) 2019-11-11 17:35:27 +01:00
pipes.pyx Add a tagger-based SentenceRecognizer (#4713) 2019-11-28 11:10:07 +01:00