220 Commits

Author SHA1 Message Date
svlandeg
3a505e7e14 small edit to ensure the new word was indeed new 2020-10-10 21:05:28 +02:00
svlandeg
68d79796c6 add test for vocab after serializing KB 2020-10-10 20:59:48 +02:00
Ines Montani
539b0c10da Tidy up and auto-format 2020-10-10 19:14:48 +02:00
Ines Montani
bfa3931c9d Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
Ines Montani
525f798841 Fix typo in test 2020-10-09 18:00:21 +02:00
Ines Montani
b7cb9d95e4 Merge pull request #6229 from svlandeg/bugfix/disabled 2020-10-09 16:05:11 +02:00
svlandeg
06b9d213fd formatting 2020-10-09 12:19:47 +02:00
svlandeg
8316bc7d4a bugfix DisabledPipes 2020-10-09 12:06:20 +02:00
Adriane Boyd
39aabf50ab Also rename to include_static_vectors in CharEmbed 2020-10-09 11:54:48 +02:00
Sofie Van Landeghem
d093d6343b TrainablePipe (#6213)
* rename Pipe to TrainablePipe

* split functionality between Pipe and TrainablePipe

* remove unnecessary methods from certain components

* cleanup

* hasattr(component, "pipe") should be sufficient again

* remove serialization and vocab/cfg from Pipe

* unify _ensure_examples and validate_examples

* small fixes

* hasattr checks for self.cfg and self.vocab

* make is_resizable and is_trainable properties

* serialize strings.json instead of vocab

* fix KB IO + tests

* fix typos

* more typos

* _added_strings as a set

* few more tests specifically for _added_strings field

* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Ines Montani
064575d79d Merge pull request #6216 from svlandeg/feature/nel-initialize 2020-10-08 11:14:12 +02:00
svlandeg
eaf5c265cb set_kb method for entity_linker 2020-10-08 10:34:01 +02:00
Ines Montani
010956d493 Clear rule-based components on initialize 2020-10-08 09:51:31 +02:00
svlandeg
6b8bdb2d39 add init_config to nlp.create_pipe 2020-10-07 14:58:16 +02:00
Ines Montani
568e12215d Merge pull request #6206 from svlandeg/fix/patterns-init 2020-10-06 10:27:23 +02:00
svlandeg
ff9ac39c88 read entity_ruler patterns with srsly.read_jsonl.v1 2020-10-05 22:50:14 +02:00
Ines Montani
126268ce50 Auto-format [ci skip] 2020-10-05 21:58:18 +02:00
Matthew Honnibal
db84d175c3 Fix test 2020-10-05 19:59:30 +02:00
Matthew Honnibal
6dcc4a0ba6 Simplify MultiHashEmbed signature 2020-10-05 19:57:45 +02:00
Matthew Honnibal
7d93575f35 spacy/tests/ 2020-10-05 15:28:12 +02:00
Matthew Honnibal
f4ca9a39cb spacy/tests/ 2020-10-05 15:27:06 +02:00
Matthew Honnibal
f2f1deca66 spacy/tests/ 2020-10-05 15:24:33 +02:00
Ines Montani
0307a228c8 Merge pull request #6193 from explosion/fix/adjust-pipe-init
Adjust [initialize.components] on Language.remove_pipe and Language.rename_pipe
2020-10-04 15:20:54 +02:00
Ines Montani
8f018e47f8 Adjust [initialize.components] on Language.remove_pipe and Language.rename_pipe 2020-10-04 14:43:45 +02:00
Ines Montani
11347f34da Tidy up, tests and docs 2020-10-04 13:54:05 +02:00
Ines Montani
d3b3663942 Adjust error message and add test 2020-10-04 10:11:27 +02:00
Ines Montani
2110e8f86d Auto-format 2020-10-04 10:06:49 +02:00
Matthew Honnibal
835070cedc Upd test 2020-10-03 19:35:10 +02:00
Ines Montani
c2401fca41 Add tests for Pipe.label_data 2020-10-03 19:12:46 +02:00
Ines Montani
3bc3c05fcc Tidy up and auto-format 2020-10-03 17:20:18 +02:00
Ines Montani
dd542ec6a4 Fix label initialization of textcat component (#6190) 2020-10-03 17:07:38 +02:00
Ines Montani
f0b30aedad Make lemmatizers use initialize logic (#6182)
* Make lemmatizer use initialize logic and tidy up

* Fix typo

* Raise for uninitialized tables
2020-10-02 15:42:36 +02:00
Adriane Boyd
86c3ec9c2b Refactor Token morph setting (#6175)
* Refactor Token morph setting

* Remove `Token.morph_`
* Add `Token.set_morph()`
  * `0` resets `token.c.morph` to unset
  * Any other values are passed to `Morphology.add`

* Add token.morph setter to set from MorphAnalysis
2020-10-01 22:21:46 +02:00
Ines Montani
fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Ines Montani
7851020653 Update tests 2020-09-29 18:14:15 +02:00
Ines Montani
f2352eb701 Test with default value 2020-09-29 17:00:40 +02:00
Ines Montani
63d1598137 Simplify config use in Language.initialize 2020-09-29 16:05:48 +02:00
Ines Montani
56f8bc73ef Add more tests 2020-09-29 15:23:34 +02:00
Ines Montani
591038b1a4 Add test 2020-09-29 12:54:52 +02:00
Ines Montani
ff9a63bfbd begin_training -> initialize 2020-09-28 21:35:09 +02:00
Ines Montani
822ea4ef61 Refactor CLI 2020-09-28 15:09:59 +02:00
Ines Montani
7e938ed63e Update config resolution to use new Thinc 2020-09-27 22:21:31 +02:00
Ines Montani
ca3c997062 Improve CLI config validation with latest Thinc 2020-09-26 13:13:57 +02:00
Sofie Van Landeghem
c7eedd3534 updates to NEL functionality (#6132)
* NEL: read sentences and ents from reference

* fiddling with sent_start annotations

* add KB serialization test

* KB write additional file with strings.json

* score_links function to calculate NEL P/R/F

* formatting

* documentation
2020-09-24 16:53:59 +02:00
Ines Montani
d0ef4a4cf5 Prevent division by zero in score weights 2020-09-24 16:42:13 +02:00
Ines Montani
c6c67b606e Merge pull request #6133 from explosion/fix/score_weights 2020-09-24 12:00:57 +02:00
Ines Montani
4bbe41f017 Fix combined scores and update test 2020-09-24 10:42:47 +02:00
Sofie Van Landeghem
c645c4e7ce fix micro PRF for textcat (#6130)
* fix micro PRF for textcat

* small fix
2020-09-24 10:31:17 +02:00
Ines Montani
ae51f580c1 Fix handling of score_weights 2020-09-24 10:27:33 +02:00
Sofie Van Landeghem
86a08f819d tok2vec.update instead of predict (#6113) 2020-09-22 21:54:52 +02:00