Ines Montani
8a67ddd6f1
Remove unused import
2020-07-12 12:32:24 +02:00
Ines Montani
d1d7fd5f5d
Don't use file paths in schemas
...
It should be possible to validate top-level config with file paths that don't exist
2020-07-12 12:32:08 +02:00
Ines Montani
79346853aa
Add debug-config command
2020-07-12 12:31:17 +02:00
Ines Montani
3a8632c3fb
Hide command from public --help for now
...
Not sure we want this to be officially documented yet?
2020-07-11 19:21:22 +02:00
Ines Montani
5e683d03fe
Allow extra args on pretrain and debug_data
2020-07-11 19:17:59 +02:00
Ines Montani
70abcca60e
Update Thinc pin
2020-07-11 17:02:54 +02:00
Ines Montani
b7111da1d7
Update config and commands
2020-07-11 13:03:53 +02:00
Ines Montani
11bbc82c24
Update cli.md [ci skip]
2020-07-10 23:37:52 +02:00
Ines Montani
9e48ea48a1
Update Thinc pin
2020-07-10 23:34:57 +02:00
Ines Montani
f99ce7fbfb
Make validation errors more elegant
2020-07-10 23:34:17 +02:00
Ines Montani
9455b060d2
Update cli.md
2020-07-10 22:57:22 +02:00
Ines Montani
7b5717cac3
Merge branch 'develop' into feature/refactor-config-args
2020-07-10 22:50:07 +02:00
Ines Montani
e6a6587a9a
Update projects.md [ci skip]
2020-07-10 22:41:27 +02:00
Matthew Honnibal
743f7fb73a
Set version to v3.0.0a4
2020-07-10 22:40:12 +02:00
Matthew Honnibal
b68216e263
Explicitly delete objects after parser.update to free GPU memory ( #5748 )
...
* Try explicitly deleting objects
* Refactor parser model backprop slightly
* Free parser data explicitly after rehearse and update
2020-07-10 22:35:20 +02:00
Ines Montani
f2cd982e7b
Update training.md
2020-07-10 22:34:27 +02:00
Ines Montani
fb6f6f584e
Replace - with _ in command names
...
We might as well be nice if user accidentally types --training.use-gpu
2020-07-10 22:34:22 +02:00
Ines Montani
bfa8e11ffa
Update and auto-format
2020-07-10 20:52:00 +02:00
Ines Montani
0389c34b81
Merge branch 'develop' into feature/refactor-config-args
2020-07-10 20:51:52 +02:00
Ines Montani
931250e1f5
Fix pipeline component schema
2020-07-10 20:32:53 +02:00
Ines Montani
9fe1fa88ad
Fix typo
2020-07-10 20:32:37 +02:00
Ines Montani
459c6aa8f0
Merge branch 'feature/refactor-config-args' of https://github.com/explosion/spaCy into feature/refactor-config-args
2020-07-10 20:01:28 +02:00
Ines Montani
defe1e7213
Pretty-print config validation errors
2020-07-10 20:01:20 +02:00
Matthew Honnibal
894f31226b
Update config
2020-07-10 19:59:12 +02:00
Sofie Van Landeghem
de6a32315c
debug-model script ( #5749 )
...
* adding debug-model to print the internals for debugging purposes
* extend debug-model script with 4 stages: before, init, train, predict
* avoid requiring a seed in the train script
* small fixes
2020-07-10 19:47:53 +02:00
Ines Montani
a3667394b4
Integrate with latest Thinc and config overrides
2020-07-10 19:47:05 +02:00
Ines Montani
5cfc3edcaa
Update CLI tests
2020-07-10 18:21:01 +02:00
Ines Montani
3583ea84d8
Update arg parsing
2020-07-10 18:20:52 +02:00
Ines Montani
73332ddb67
Update CLI commands to use one shared util file
2020-07-10 17:57:40 +02:00
Ines Montani
240e0a62ca
Update with WIP
2020-07-10 13:31:27 +02:00
Ines Montani
a60562f208
Update project CLI hashes, directories, skipping ( #5741 )
...
* Update project CLI hashes, directories, skipping
* Improve clone success message
* Remove unused context args
* Move project-specific utils to project utils
The hashing/checksum functions may not end up being general-purpose functions and are more designed for the projects, so they shouldn't live in spacy.util
* Improve run help and add workflows
* Add note re: directory checksum speed
* Fix cloning from subdirectories and output messages
* Remove hard-coded dirs
2020-07-09 23:51:18 +02:00
Ines Montani
e624fcd5d9
Merge branch 'nightly.spacy.io' into develop
2020-07-09 23:26:26 +02:00
Ines Montani
52e9b5b472
Fix formatting
2020-07-09 23:25:58 +02:00
Ines Montani
28cdae898a
Update projects.md
2020-07-09 22:35:54 +02:00
Adriane Boyd
0a62098c5f
Fix lemmatizer is_base_form for python2.7 ( #5734 )
...
* Fix lemmatizer init args for python2.7
* Move English is_base_form to a class method
* Skip test pickling PhraseMatcher for python2
2020-07-09 22:11:24 +02:00
Adriane Boyd
923affd091
Remove is_base_form from French lemmatizer ( #5733 )
...
Remove English-specific is_base_form from French lemmatizer.
2020-07-09 22:11:13 +02:00
Ines Montani
7bcf9f7cfb
Document new features
2020-07-09 21:10:36 +02:00
Ines Montani
797ca6f3dd
Merge branch 'develop' into nightly.spacy.io
2020-07-09 20:48:24 +02:00
Matthew Honnibal
552d1ad226
Hack at tests
2020-07-09 20:25:51 +02:00
Matthew Honnibal
eb064c59cd
Try to fix textcat test
2020-07-09 20:24:53 +02:00
Ines Montani
018319a640
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-09 19:44:41 +02:00
Ines Montani
05e182e421
Update CLI args and docstrings
2020-07-09 19:44:28 +02:00
Sofie Van Landeghem
dd207a28be
cleanup components API ( #5726 )
...
* add keyword separator for update functions and drop unused "state"
* few more Example tests and various small fixes
* consistently return losses after update call
* eliminate unused tensors field across pipe components
* fix name
* fix arg name
2020-07-09 19:43:39 +02:00
Ines Montani
ea01831f6a
Update projects docs etc.
2020-07-09 19:43:25 +02:00
Adriane Boyd
ac4297ee39
Minor refactor to conversion of output docs ( #5718 )
...
Minor refactor of conversion of docs to output format to avoid
duplicate conversion steps.
2020-07-09 19:42:32 +02:00
Sofie Van Landeghem
c1ea55307b
Fixing reproducible training ( #5735 )
...
* Add initial reproducibility tests
* failing test for default_text_classifier (WIP)
* track trouble to underlying tok2vec layer
* add regression test for Issue 5551
* tests go green with https://github.com/explosion/thinc/pull/359
* update test
* adding fixed seeds to HashEmbed layers, seems to fix the reproducibility issue
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-07-09 19:39:31 +02:00
Matthew Honnibal
1827f22f56
Set version to v3.0.0a3
2020-07-09 19:38:04 +02:00
Matthew Honnibal
7010f1a2be
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-09 19:34:11 +02:00
Matthew Honnibal
0becc5954b
Update NER config
2020-07-09 19:33:54 +02:00
Matthew Honnibal
77af0a6bb4
Offer option of padding-sensitive batching
2020-07-09 14:50:20 +02:00