Commit Graph

362 Commits

Author SHA1 Message Date
svlandeg
ff9ac39c88 read entity_ruler patterns with srsly.read_jsonl.v1 2020-10-05 22:50:14 +02:00
svlandeg
193e0d5a98 add docs for entity_ruler.initialize 2020-10-05 18:04:08 +02:00
svlandeg
9eb813a35d Merge remote-tracking branch 'upstream/develop' into fix/patterns-init 2020-10-05 17:49:44 +02:00
svlandeg
4e3ace4b8c is_trainable method 2020-10-05 17:43:42 +02:00
svlandeg
65abd77779 add finish_update to Pipe 2020-10-05 16:23:33 +02:00
svlandeg
251b3eb4e5 add initialize method for entity_ruler 2020-10-05 14:59:13 +02:00
Sofie Van Landeghem
f4f49f5877
update blis (#6198)
* allow higher blis version

* fix typo

* bump to 3.0.0a34

* fix pins in other files
2020-10-05 14:58:56 +02:00
Ines Montani
11347f34da Tidy up, tests and docs 2020-10-04 13:54:05 +02:00
Matthew Honnibal
96b636c2d3 Update attribute ruler 2020-10-04 13:08:21 +02:00
Ines Montani
bcd52e5486 Tidy up errors and warnings 2020-10-04 11:16:31 +02:00
Ines Montani
d3b3663942 Adjust error message and add test 2020-10-04 10:11:27 +02:00
Ines Montani
cc08c88a89
Merge pull request #6187 from svlandeg/fix/begin_training_pipe 2020-10-04 10:01:02 +02:00
svlandeg
3f657ed3a1 implement warning in __init_subclass__ instead 2020-10-03 22:34:10 +02:00
Matthew Honnibal
3b2a78720c Upd morphologizer 2020-10-03 19:35:19 +02:00
Matthew Honnibal
4fccd2ceaf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-10-03 19:13:55 +02:00
Matthew Honnibal
8ea8b7d940 Support loading labels in morphologizer 2020-10-03 19:13:42 +02:00
Ines Montani
80603f0fa5 Make SentenceRecognizer.label_data return None
Overwrite the method from the base class (Tagger) but don't export anything in "init labels"
2020-10-03 18:54:09 +02:00
Ines Montani
3bc3c05fcc Tidy up and auto-format 2020-10-03 17:20:18 +02:00
Ines Montani
dd542ec6a4
Fix label initialization of textcat component (#6190) 2020-10-03 17:07:38 +02:00
Ines Montani
f0b30aedad
Make lemmatizers use initialize logic (#6182)
* Make lemmatizer use initialize logic and tidy up

* Fix typo

* Raise for uninitialized tables
2020-10-02 15:42:36 +02:00
Adriane Boyd
86c3ec9c2b
Refactor Token morph setting (#6175)
* Refactor Token morph setting

* Remove `Token.morph_`
* Add `Token.set_morph()`
  * `0` resets `token.c.morph` to unset
  * Any other values are passed to `Morphology.add`

* Add token.morph setter to set from MorphAnalysis
2020-10-01 22:21:46 +02:00
Ines Montani
f2627157c8 Update docs [ci skip] 2020-10-01 17:38:17 +02:00
Ines Montani
b799af16de Don't raise in Pipe.initialize if not implemented 2020-09-30 00:05:27 +02:00
Ines Montani
fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Matthew Honnibal
a4da3120b4 Fix multitasks 2020-09-29 18:33:16 +02:00
Matthew Honnibal
0b5c72fce2 Fix incorrect docstrings 2020-09-29 18:30:38 +02:00
Matthew Honnibal
e4f535a964 Fix Pipe.labels 2020-09-29 16:55:07 +02:00
Matthew Honnibal
1fd002180e Allow more components to use labels 2020-09-29 16:48:56 +02:00
Matthew Honnibal
99bff78617 Use labels in tagger 2020-09-29 16:48:44 +02:00
Matthew Honnibal
58c8d4b414 Add label_data property to pipeline 2020-09-29 16:22:13 +02:00
Ines Montani
f171903139 Clean up sgd and pipeline -> nlp 2020-09-29 12:20:26 +02:00
Ines Montani
42f0e4c946 Clean up 2020-09-29 12:14:08 +02:00
Matthew Honnibal
9c8b2524fe Upd initialize args 2020-09-29 12:08:37 +02:00
Matthew Honnibal
f2d1b7feb5 Clean up sgd 2020-09-29 12:00:08 +02:00
Ines Montani
dec984a9c1 Update Language.initialize and support components/tokenizer settings 2020-09-29 11:52:45 +02:00
Matthew Honnibal
b3b6868639 Remove 'sgd' arg from component initialize 2020-09-29 11:42:35 +02:00
Ines Montani
ff9a63bfbd begin_training -> initialize 2020-09-28 21:35:09 +02:00
Adriane Boyd
6c25e60089 Simplify string match IDs for AttributeRuler 2020-09-26 11:12:39 +02:00
Matthew Honnibal
702edf52a0 Fix attributeruler 2020-09-26 00:30:48 +02:00
Matthew Honnibal
821f37254c Fix attributeruler 2020-09-26 00:19:53 +02:00
Matthew Honnibal
98327f66a9 Fix attributeruler key 2020-09-25 23:20:50 +02:00
Matthew Honnibal
16475528f7
Fix skipped documents in entity scorer (#6137)
* Fix skipped documents in entity scorer

* Add back the skipping of unannotated entities

* Update spacy/scorer.py

* Use more specific NER scorer

* Fix import

* Fix get_ner_prf

* Add scorer

* Fix scorer

Co-authored-by: Ines Montani <ines@ines.io>
2020-09-24 20:38:57 +02:00
Ines Montani
0b52b6904c Update entity_linker.py 2020-09-24 17:10:35 +02:00
Adriane Boyd
59340606b7
Add option to disable Matcher errors (#6125)
* Add option to disable Matcher errors

* Add option to disable Matcher errors when a doc doesn't contain a
particular type of annotation

Minor additional change:

* Update `AttributeRuler.load_from_morph_rules` to allow direct `MORPH`
values

* Rename suppress_errors to allow_missing

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>

* Refactor annotation checks in Matcher and PhraseMatcher

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-09-24 16:54:39 +02:00
Sofie Van Landeghem
c7eedd3534
updates to NEL functionality (#6132)
* NEL: read sentences and ents from reference

* fiddling with sent_start annotations

* add KB serialization test

* KB write additional file with strings.json

* score_links function to calculate NEL P/R/F

* formatting

* documentation
2020-09-24 16:53:59 +02:00
Ines Montani
ae51f580c1 Fix handling of score_weights 2020-09-24 10:27:33 +02:00
svlandeg
dd2292793f 'parser' instead of 'deps' for state_type 2020-09-23 16:53:49 +02:00
svlandeg
6c85fab316 state_type and extra_state_tokens instead of nr_feature_tokens 2020-09-23 13:35:09 +02:00
Sofie Van Landeghem
d53c84b6d6
avoid None callback (#6100) 2020-09-22 13:54:44 +02:00
svlandeg
781fae678b Merge remote-tracking branch 'upstream/develop' into fix/corpus 2020-09-17 09:24:36 +02:00