Commit Graph

11622 Commits

Author SHA1 Message Date
Ines Montani
d93cbeb14f
Add warning for loose version constraints (#5536)
* Add warning for loose version constraints

* Update wording [ci skip]

* Tweak error message

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-06-05 12:42:15 +02:00
Matthew Honnibal
8411d4f4e6
Merge pull request #5543 from svlandeg/feature/pretrain-config
pretrain from config
2020-06-04 19:07:12 +02:00
svlandeg
3ade455fd3 formatting 2020-06-04 16:09:55 +02:00
svlandeg
776d4f1190 cleanup 2020-06-04 16:07:30 +02:00
svlandeg
6b027d7689 remove duplicate model definition of tok2vec layer 2020-06-04 15:49:23 +02:00
svlandeg
1775f54a26 small little fixes 2020-06-03 22:17:02 +02:00
svlandeg
07886a3de3 rename init_tok2vec to resume 2020-06-03 22:00:25 +02:00
svlandeg
4ed6278663 small fixes to pretrain config, init_tok2vec TODO 2020-06-03 19:32:40 +02:00
Ines Montani
56a9d1b78c
Merge pull request #5479 from explosion/master-tmp
2020-06-03 15:31:27 +02:00
svlandeg
ddf8244df9 add normalize option to distance metric 2020-06-03 14:52:54 +02:00
svlandeg
ffe0451d09 pretrain from config 2020-06-03 14:45:00 +02:00
Ines Montani
a8875d4a4b Fix typo 2020-06-03 14:42:39 +02:00
Ines Montani
4e0610d0d4 Update warning codes 2020-06-03 14:37:09 +02:00
Ines Montani
810fce3bb1 Merge branch 'develop' into master-tmp 2020-06-03 14:36:59 +02:00
Adriane Boyd
b0ee76264b Remove debugging 2020-06-03 14:20:42 +02:00
Adriane Boyd
1d8168d1fd Fix problems with lower and whitespace in variants
Port relevant changes from #5361:

* Initialize lower flag explicitly

* Handle whitespace words from GoldParse correctly when creating raw
text with orth variants
2020-06-03 14:15:58 +02:00
Adriane Boyd
10d938f221 Update default cfg dir in train CLI 2020-06-03 14:15:50 +02:00
Adriane Boyd
f1f9c8b417 Port train CLI updates
Updates from #5362 and fix from #5387:

* `train`:

  * if training on GPU, only run evaluation/timing on CPU in the first
    iteration

  * if training is aborted, exit with a non-0 exit status
2020-06-03 14:03:43 +02:00
svlandeg
109bbdab98 update config files with separate dropout for Tok2Vec layer 2020-06-03 11:53:59 +02:00
svlandeg
eac12cbb77 make dropout in embed layers configurable 2020-06-03 11:50:16 +02:00
svlandeg
e91485dfc4 add discard_oversize parameter, move optimizer to training subsection 2020-06-03 10:04:16 +02:00
svlandeg
03c58b488c prevent infinite loop, custom warning 2020-06-03 10:00:21 +02:00
svlandeg
6504b7f161 Merge remote-tracking branch 'upstream/develop' into feature/pretrain-config 2020-06-03 08:30:16 +02:00
Matthew Honnibal
f74784575c
Merge pull request #5533 from svlandeg/bugfix/minibatch-oversize
add oversize examples before StopIteration returns
2020-06-02 22:54:38 +02:00
svlandeg
c5ac382f0a fix name clash 2020-06-02 22:24:57 +02:00
svlandeg
2bf5111ecf additional test with discard_oversize=False 2020-06-02 22:09:37 +02:00
svlandeg
aa6271b16c extending algorithm to deal better with edge cases 2020-06-02 22:05:08 +02:00
svlandeg
f2e162fc60 it's only oversized if the tolerance level is also exceeded 2020-06-02 19:59:04 +02:00
svlandeg
ef834b4cd7 fix comments 2020-06-02 19:50:44 +02:00
svlandeg
6208d322d3 slightly more challenging unit test 2020-06-02 19:47:30 +02:00
svlandeg
6651fafd5c using overflow buffer for examples within the tolerance margin 2020-06-02 19:43:39 +02:00
svlandeg
85b0597ed5 add test for minibatch util 2020-06-02 18:26:21 +02:00
svlandeg
5b350a6c99 bugfix of the bugfix 2020-06-02 17:49:33 +02:00
svlandeg
fdfd822936 rewrite minibatch_by_words function 2020-06-02 15:22:54 +02:00
svlandeg
ec52e7f886 add oversize examples before StopIteration returns 2020-06-02 13:21:55 +02:00
svlandeg
e0f9f448f1 remove Tensorizer 2020-06-01 23:38:48 +02:00
Ines Montani
b5ae2edcba
Merge pull request #5516 from explosion/feature/improve-model-version-deps
2020-05-31 12:54:01 +02:00
Matthw Honnibal
cd5f748e09 Add onto-joint experiment file 2020-05-30 20:27:47 +02:00
Matthw Honnibal
d1c2e88d0f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-05-30 19:23:12 +02:00
Ines Montani
dc186afdc5 Add warning 2020-05-30 15:34:54 +02:00
Ines Montani
2bdf787417 Merge branch 'develop' into feature/improve-model-version-deps 2020-05-30 15:20:20 +02:00
Ines Montani
368182776e Tidy up dependencies 2020-05-30 15:19:53 +02:00
Ines Montani
b7aff6020c Make functions more general purpose and update docstrings and tests 2020-05-30 15:18:53 +02:00
Ines Montani
a7e370bcbf Don't override spaCy version 2020-05-30 15:03:18 +02:00
Ines Montani
e47e5a4b10 Use more sophisticated version parsing logic 2020-05-30 15:01:58 +02:00
Ines Montani
bed62991ad Tidy up requirements 2020-05-30 14:59:55 +02:00
Ines Montani
4fd087572a WIP: improve model version deps 2020-05-28 12:51:37 +02:00
Matthw Honnibal
58750b06f8 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-05-27 22:18:36 +02:00
Matthew Honnibal
a44d51a3d8
Merge pull request #5496 from explosion/docs/unicode-str
unicode -> str consistency
2020-05-26 10:30:37 +02:00
Ines Montani
1a15896ba9 unicode -> str consistency [ci skip] 2020-05-24 18:51:10 +02:00