Commit Graph

1340 Commits

Author SHA1 Message Date
Adriane Boyd
1442d2f213
Improve simple training example in v3 migration (#6438)
* Create the examples once
* Use the examples in the initialization
* Provide the batch size
* Fix `begin_training` migration example
2020-11-30 09:39:45 +08:00
Sofie Van Landeghem
165993d8e5
fix typo in transformer docs (#6404) 2020-11-19 14:11:38 +01:00
Adriane Boyd
96726ec1f6
Fix DocBin init in training example (#6396) 2020-11-17 14:36:44 +01:00
Ines Montani
de6453940e
Merge pull request #6305 from svlandeg/feature/score-docs [ci skip] 2020-11-10 02:52:11 +01:00
Ines Montani
d7950c5ada
Merge pull request #6297 from adrianeboyd/docs/nightly-conda-install [ci skip] 2020-11-10 02:45:52 +01:00
Ines Montani
363ac73c72 Update docs [ci skip] 2020-11-09 12:43:26 +08:00
Sofie Van Landeghem
8ef056cf98
fix embed_size in Entity Linker architecture (#6343) 2020-11-04 22:20:13 +01:00
Ines Montani
019a1dd5e8 Fix v3 overview [ci skip] 2020-11-03 18:10:06 +01:00
Adriane Boyd
a4b32b9552
Handle missing reference values in scorer (#6286)
* Handle missing reference values in scorer

Handle missing values in reference doc during scoring where it is
possible to detect an unset state for the attribute. If no reference
docs contain annotation, `None` is returned instead of a score. `spacy
evaluate` displays `-` for missing scores and the missing scores are
saved as `None`/`null` in the metrics.

Attributes without unset states:

* `token.head`: relies on `token.dep` to recognize unset values
* `doc.cats`: unable to handle missing annotation

Additional changes:

* add optional `has_annotation` check to `score_spans` to replace
`doc.sents` hack
* update `score_token_attr_per_feat` to handle missing and empty morph
representations
* fix bug in `Doc.has_annotation` for normalization of `IS_SENT_START`
vs. `SENT_START`

* Fix import

* Update return types
2020-11-03 15:47:18 +01:00
Adriane Boyd
dc816bba9d
Fix node name typo in dependency matcher example (#6311) 2020-10-28 16:32:46 +01:00
svlandeg
77688b0072 fix config 2020-10-26 11:14:34 +01:00
svlandeg
5878ff6bcd cleanup 2020-10-26 11:13:02 +01:00
svlandeg
e95d9caa87 small edits 2020-10-26 11:09:25 +01:00
svlandeg
a664994a81 adding score method to explanation of new component 2020-10-26 10:52:47 +01:00
Adriane Boyd
c0b76f4c19 Add install step to "Compile from source" 2020-10-23 11:36:36 +02:00
Ines Montani
b6b1c1e23c
Merge pull request #6271 from walterhenry/develop-proof [ci skip] 2020-10-19 16:31:43 +02:00
walterhenry
db24dc5614 Proofread remarks
I think these may be the last remarks for the nightly docs. Only two minor things actually.
2020-10-19 11:11:32 +02:00
Sofie Van Landeghem
75a202ce65
TextCat updates and fixes (#6263)
* small fix in example imports

* throw error when train_corpus or dev_corpus is not a string

* small fix in custom logger example

* limit macro_auc to labels with 2 annotations

* fix typo

* also create parents of output_dir if need be

* update documentation of textcat scores

* refactor TextCatEnsemble

* fix tests for new AUC definition

* bump to 3.0.0a42

* update docs

* rename to spacy.TextCatEnsemble.v2

* spacy.TextCatEnsemble.v1 in legacy

* cleanup

* small fix

* update to 3.0.0rc2

* fix import that got lost in merge

* cursed IDE

* fix two typos
2020-10-18 14:50:41 +02:00
Ines Montani
c655742b8b Remove docs references to starters for now (see #6262) [ci skip] 2020-10-16 15:46:34 +02:00
Ines Montani
c968d1560f Fix docs example [ci skip] 2020-10-16 11:33:20 +02:00
Ines Montani
ba1e004049 Fix typo [ci skip] 2020-10-15 23:39:04 +02:00
Ines Montani
20f80587d6
Merge pull request #6257 from walterhenry/develop-proof
A few tiny typo fixes to push through with release of nightly
2020-10-15 18:17:30 +02:00
walterhenry
75b7f86383 Three small typos
Some little typos since v3.0 is out.
2020-10-15 18:06:37 +02:00
Ines Montani
09dbbe75d7 Update docs [ci skip] 2020-10-15 17:27:24 +02:00
Ines Montani
7f05ccc170 Update docs [ci skip] 2020-10-15 12:35:30 +02:00
Ines Montani
4fa869e6f7 Update docs [ci skip] 2020-10-15 11:16:06 +02:00
Ines Montani
178760855f Merge branch 'develop' into master-tmp 2020-10-15 09:06:03 +02:00
Ines Montani
abeafcbc08 Update docs [ci skip] 2020-10-15 08:58:30 +02:00
Ines Montani
a966c271f7 Update models docs [ci skip] 2020-10-14 20:50:23 +02:00
Ines Montani
a2d4aaee70
Apply suggestions from code review 2020-10-14 19:51:36 +02:00
Ines Montani
d94e241fce Merge branch 'develop' into pr/6253 2020-10-14 16:55:46 +02:00
Ines Montani
cb47f25cda
Merge pull request #6252 from svlandeg/fix/docs 2020-10-14 16:43:12 +02:00
walterhenry
6af585dba5 New batch of proofs
Just tiny fixes to the docs as a proofreader
2020-10-14 16:37:57 +02:00
svlandeg
478a14a619 fix few typos 2020-10-14 15:01:19 +02:00
Ines Montani
1aa8e8f2af Update docs [ci skip] 2020-10-14 14:58:45 +02:00
Ines Montani
4d99d2b94a Update docs [ci skip] 2020-10-13 11:38:52 +02:00
svlandeg
40276fd3be update NEL docs after latest refactor 2020-10-12 11:41:27 +02:00
svlandeg
08cb085f6c Merge remote-tracking branch 'upstream/develop' into fix/various 2020-10-09 17:01:27 +02:00
Ines Montani
97ff090e49 Fix docs example [ci skip] 2020-10-09 16:03:57 +02:00
Ines Montani
9fb3244672
Merge pull request #6231 from adrianeboyd/feature/include-static-vectors 2020-10-09 15:54:52 +02:00
Adriane Boyd
2dd79454af Update docs 2020-10-09 14:42:07 +02:00
svlandeg
853edace37 fix MultiHashEmbed example in documentation 2020-10-09 14:11:06 +02:00
Ines Montani
e50dc2c1c9 Update docs [ci skip] 2020-10-09 12:04:52 +02:00
Ines Montani
7c52def5da
Merge pull request #6227 from adrianeboyd/chore/update-3.0.0a36-from-master 2020-10-09 10:49:20 +02:00
Ines Montani
329b61ee7b Update docs [ci skip] 2020-10-09 10:36:06 +02:00
delzac
668507be1b Reflect on usage doc that IS_SENT_START attribute exists (#6114)
* Reflect on usage doc that IS_SENT_START attribute exists

* Create delzac.md
2020-10-09 10:14:40 +02:00
Sofie Van Landeghem
d093d6343b
TrainablePipe (#6213)
* rename Pipe to TrainablePipe

* split functionality between Pipe and TrainablePipe

* remove unnecessary methods from certain components

* cleanup

* hasattr(component, "pipe") should be sufficient again

* remove serialization and vocab/cfg from Pipe

* unify _ensure_examples and validate_examples

* small fixes

* hasattr checks for self.cfg and self.vocab

* make is_resizable and is_trainable properties

* serialize strings.json instead of vocab

* fix KB IO + tests

* fix typos

* more typos

* _added_strings as a set

* few more tests specifically for _added_strings field

* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Ines Montani
5ebd1fc2cf Update docs [ci skip] 2020-10-08 16:23:12 +02:00
Ines Montani
d1602e1ece Update docs [ci skip] 2020-10-08 11:56:50 +02:00
Ines Montani
064575d79d
Merge pull request #6216 from svlandeg/feature/nel-initialize 2020-10-08 11:14:12 +02:00