Sofie Van Landeghem
75a202ce65
TextCat updates and fixes (#6263)
* small fix in example imports
* throw error when train_corpus or dev_corpus is not a string
* small fix in custom logger example
* limit macro_auc to labels with 2 annotations
* fix typo
* also create parents of output_dir if need be
* update documentation of textcat scores
* refactor TextCatEnsemble
* fix tests for new AUC definition
* bump to 3.0.0a42
* update docs
* rename to spacy.TextCatEnsemble.v2
* spacy.TextCatEnsemble.v1 in legacy
* cleanup
* small fix
* update to 3.0.0rc2
* fix import that got lost in merge
* cursed IDE
* fix two typos
2020-10-18 14:50:41 +02:00
svlandeg
44e14ccae8
one more losses fix
2020-10-14 15:11:34 +02:00
svlandeg
0aa8851878
always return losses
2020-10-14 15:00:49 +02:00
svlandeg
68d79796c6
add test for vocab after serializing KB
2020-10-10 20:59:48 +02:00
Ines Montani
bfa3931c9d
Revert added_strings change (#6236)
2020-10-10 18:55:07 +02:00
Adriane Boyd
39aabf50ab
Also rename to include_static_vectors in CharEmbed
2020-10-09 11:54:48 +02:00
Sofie Van Landeghem
d093d6343b
TrainablePipe (#6213)
* rename Pipe to TrainablePipe
* split functionality between Pipe and TrainablePipe
* remove unnecessary methods from certain components
* cleanup
* hasattr(component, "pipe") should be sufficient again
* remove serialization and vocab/cfg from Pipe
* unify _ensure_examples and validate_examples
* small fixes
* hasattr checks for self.cfg and self.vocab
* make is_resizable and is_trainable properties
* serialize strings.json instead of vocab
* fix KB IO + tests
* fix typos
* more typos
* _added_strings as a set
* few more tests specifically for _added_strings field
* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Ines Montani
064575d79d
Merge pull request #6216 from svlandeg/feature/nel-initialize
2020-10-08 11:14:12 +02:00
svlandeg
eaf5c265cb
set_kb method for entity_linker
2020-10-08 10:34:01 +02:00
Ines Montani
010956d493
Clear rule-based components on initialize
2020-10-08 09:51:31 +02:00
svlandeg
33c2d4af16
move kb_loader to initialize for NEL instead of constructor
2020-10-07 14:56:00 +02:00
svlandeg
ff9ac39c88
read entity_ruler patterns with srsly.read_jsonl.v1
2020-10-05 22:50:14 +02:00
svlandeg
193e0d5a98
add docs for entity_ruler.initialize
2020-10-05 18:04:08 +02:00
svlandeg
9eb813a35d
Merge remote-tracking branch 'upstream/develop' into fix/patterns-init
2020-10-05 17:49:44 +02:00
svlandeg
4e3ace4b8c
is_trainable method
2020-10-05 17:43:42 +02:00
svlandeg
65abd77779
add finish_update to Pipe
2020-10-05 16:23:33 +02:00
svlandeg
251b3eb4e5
add initialize method for entity_ruler
2020-10-05 14:59:13 +02:00
Sofie Van Landeghem
f4f49f5877
update blis (#6198)
* allow higher blis version
* fix typo
* bump to 3.0.0a34
* fix pins in other files
2020-10-05 14:58:56 +02:00
Ines Montani
11347f34da
Tidy up, tests and docs
2020-10-04 13:54:05 +02:00
Matthew Honnibal
96b636c2d3
Update attribute ruler
2020-10-04 13:08:21 +02:00
Ines Montani
bcd52e5486
Tidy up errors and warnings
2020-10-04 11:16:31 +02:00
Ines Montani
d3b3663942
Adjust error message and add test
2020-10-04 10:11:27 +02:00
Ines Montani
cc08c88a89
Merge pull request #6187 from svlandeg/fix/begin_training_pipe
2020-10-04 10:01:02 +02:00
svlandeg
3f657ed3a1
implement warning in __init_subclass__ instead
2020-10-03 22:34:10 +02:00
Matthew Honnibal
3b2a78720c
Upd morphologizer
2020-10-03 19:35:19 +02:00
Matthew Honnibal
4fccd2ceaf
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-10-03 19:13:55 +02:00
Matthew Honnibal
8ea8b7d940
Support loading labels in morphologizer
2020-10-03 19:13:42 +02:00
Ines Montani
80603f0fa5
Make SentenceRecognizer.label_data return None
Overwrite the method from the base class (Tagger) but don't export anything in "init labels"
2020-10-03 18:54:09 +02:00
Ines Montani
3bc3c05fcc
Tidy up and auto-format
2020-10-03 17:20:18 +02:00
Ines Montani
dd542ec6a4
Fix label initialization of textcat component (#6190)
2020-10-03 17:07:38 +02:00
Ines Montani
f0b30aedad
Make lemmatizers use initialize logic (#6182)
* Make lemmatizer use initialize logic and tidy up
* Fix typo
* Raise for uninitialized tables
2020-10-02 15:42:36 +02:00
Adriane Boyd
86c3ec9c2b
Refactor Token morph setting (#6175)
* Refactor Token morph setting
* Remove `Token.morph_`
* Add `Token.set_morph()`
* `0` resets `token.c.morph` to unset
* Any other values are passed to `Morphology.add`
* Add token.morph setter to set from MorphAnalysis
2020-10-01 22:21:46 +02:00
Ines Montani
f2627157c8
Update docs [ci skip]
2020-10-01 17:38:17 +02:00
Ines Montani
b799af16de
Don't raise in Pipe.initialize if not implemented
2020-09-30 00:05:27 +02:00
Ines Montani
fa47f87924
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
Matthew Honnibal
a4da3120b4
Fix multitasks
2020-09-29 18:33:16 +02:00
Matthew Honnibal
0b5c72fce2
Fix incorrect docstrings
2020-09-29 18:30:38 +02:00
Matthew Honnibal
e4f535a964
Fix Pipe.labels
2020-09-29 16:55:07 +02:00
Matthew Honnibal
1fd002180e
Allow more components to use labels
2020-09-29 16:48:56 +02:00
Matthew Honnibal
99bff78617
Use labels in tagger
2020-09-29 16:48:44 +02:00
Matthew Honnibal
58c8d4b414
Add label_data property to pipeline
2020-09-29 16:22:13 +02:00
Ines Montani
f171903139
Clean up sgd and pipeline -> nlp
2020-09-29 12:20:26 +02:00
Ines Montani
42f0e4c946
Clean up
2020-09-29 12:14:08 +02:00
Matthew Honnibal
9c8b2524fe
Upd initialize args
2020-09-29 12:08:37 +02:00
Matthew Honnibal
f2d1b7feb5
Clean up sgd
2020-09-29 12:00:08 +02:00
Ines Montani
dec984a9c1
Update Language.initialize and support components/tokenizer settings
2020-09-29 11:52:45 +02:00
Matthew Honnibal
b3b6868639
Remove 'sgd' arg from component initialize
2020-09-29 11:42:35 +02:00
Ines Montani
ff9a63bfbd
begin_training -> initialize
2020-09-28 21:35:09 +02:00
Adriane Boyd
6c25e60089
Simplify string match IDs for AttributeRuler
2020-09-26 11:12:39 +02:00
Matthew Honnibal
702edf52a0
Fix attributeruler
2020-09-26 00:30:48 +02:00