Adriane Boyd
1d8168d1fd
Fix problems with lower and whitespace in variants
Port relevant changes from #5361:
* Initialize lower flag explicitly
* Handle whitespace words from GoldParse correctly when creating raw
text with orth variants
2020-06-03 14:15:58 +02:00
Adriane Boyd
10d938f221
Update default cfg dir in train CLI
2020-06-03 14:15:50 +02:00
Adriane Boyd
f1f9c8b417
Port train CLI updates
Updates from #5362 and fix from #5387:
* `train`:
  * if training on GPU, only run evaluation/timing on CPU in the first
    iteration
  * if training is aborted, exit with a non-0 exit status
2020-06-03 14:03:43 +02:00
svlandeg
109bbdab98
update config files with separate dropout for Tok2Vec layer
2020-06-03 11:53:59 +02:00
svlandeg
eac12cbb77
make dropout in embed layers configurable
2020-06-03 11:50:16 +02:00
svlandeg
e91485dfc4
add discard_oversize parameter, move optimizer to training subsection
2020-06-03 10:04:16 +02:00
svlandeg
03c58b488c
prevent infinite loop, custom warning
2020-06-03 10:00:21 +02:00
svlandeg
6504b7f161
Merge remote-tracking branch 'upstream/develop' into feature/pretrain-config
2020-06-03 08:30:16 +02:00
Matthew Honnibal
f74784575c
Merge pull request #5533 from svlandeg/bugfix/minibatch-oversize
add oversize examples before StopIteration returns
2020-06-02 22:54:38 +02:00
svlandeg
c5ac382f0a
fix name clash
2020-06-02 22:24:57 +02:00
svlandeg
2bf5111ecf
additional test with discard_oversize=False
2020-06-02 22:09:37 +02:00
svlandeg
aa6271b16c
extending algorithm to deal better with edge cases
2020-06-02 22:05:08 +02:00
svlandeg
f2e162fc60
it's only oversized if the tolerance level is also exceeded
2020-06-02 19:59:04 +02:00
svlandeg
ef834b4cd7
fix comments
2020-06-02 19:50:44 +02:00
svlandeg
6208d322d3
slightly more challenging unit test
2020-06-02 19:47:30 +02:00
svlandeg
6651fafd5c
using overflow buffer for examples within the tolerance margin
2020-06-02 19:43:39 +02:00
svlandeg
85b0597ed5
add test for minibatch util
2020-06-02 18:26:21 +02:00
svlandeg
5b350a6c99
bugfix of the bugfix
2020-06-02 17:49:33 +02:00
svlandeg
fdfd822936
rewrite minibatch_by_words function
2020-06-02 15:22:54 +02:00
svlandeg
ec52e7f886
add oversize examples before StopIteration returns
2020-06-02 13:21:55 +02:00
svlandeg
e0f9f448f1
remove Tensorizer
2020-06-01 23:38:48 +02:00
Ines Montani
b5ae2edcba
Merge pull request #5516 from explosion/feature/improve-model-version-deps
2020-05-31 12:54:01 +02:00
Matthw Honnibal
cd5f748e09
Add onto-joint experiment file
2020-05-30 20:27:47 +02:00
Matthw Honnibal
d1c2e88d0f
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-05-30 19:23:12 +02:00
Ines Montani
dc186afdc5
Add warning
2020-05-30 15:34:54 +02:00
Ines Montani
2bdf787417
Merge branch 'develop' into feature/improve-model-version-deps
2020-05-30 15:20:20 +02:00
Ines Montani
368182776e
Tidy up dependencies
2020-05-30 15:19:53 +02:00
Ines Montani
b7aff6020c
Make functions more general purpose and update docstrings and tests
2020-05-30 15:18:53 +02:00
Ines Montani
a7e370bcbf
Don't override spaCy version
2020-05-30 15:03:18 +02:00
Ines Montani
e47e5a4b10
Use more sophisticated version parsing logic
2020-05-30 15:01:58 +02:00
Ines Montani
bed62991ad
Tidy up requirements
2020-05-30 14:59:55 +02:00
Ines Montani
4fd087572a
WIP: improve model version deps
2020-05-28 12:51:37 +02:00
Matthw Honnibal
58750b06f8
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-05-27 22:18:36 +02:00
Matthew Honnibal
a44d51a3d8
Merge pull request #5496 from explosion/docs/unicode-str
unicode -> str consistency
2020-05-26 10:30:37 +02:00
Ines Montani
1a15896ba9
unicode -> str consistency [ci skip]
2020-05-24 18:51:10 +02:00
Ines Montani
262d306eaa
unicode -> str consistency
2020-05-24 17:23:00 +02:00
Ines Montani
5d3806e059
unicode -> str consistency
2020-05-24 17:20:58 +02:00
Ines Montani
cf156ed2f4
Merge pull request #5495 from explosion/fix/simplify-is-package
2020-05-24 15:42:55 +02:00
Ines Montani
387c7aba15
Update test
2020-05-24 14:55:16 +02:00
Ines Montani
f9786d765e
Simplify is_package check
2020-05-24 14:48:56 +02:00
Ines Montani
15d3a0ac3a
Merge pull request #5491 from explosion/chore/rename-pipe-analysis
2020-05-23 12:41:54 +02:00
Matthw Honnibal
2d9de8684d
Support use_pytorch_for_gpu_memory config
2020-05-22 23:10:40 +02:00
Ines Montani
4465cad6c5
Rename spacy.analysis to spacy.pipe_analysis
2020-05-22 17:42:06 +02:00
Ines Montani
25d6ed3fb8
Merge pull request #5489 from explosion/feature/connected-components
2020-05-22 17:40:11 +02:00
Ines Montani
841c05b47b
Merge pull request #5490 from explosion/fix/remove-jsonschema
2020-05-22 17:39:54 +02:00
Ines Montani
569a65b60e
Auto-format
2020-05-22 16:55:42 +02:00
Ines Montani
d844528c5f
Add test for is_compatible_model
2020-05-22 16:55:15 +02:00
Ines Montani
12b7be1d98
Remove jsonschema from dependencies
2020-05-22 16:49:26 +02:00
Matthew Honnibal
7a73a9dcf6
Merge pull request #5488 from explosion/feature/better-model-compat
Better model compatibility and validation
2020-05-22 16:44:29 +02:00
Matthew Honnibal
f7f6df7275
Move to spacy.analysis
2020-05-22 16:43:18 +02:00