Ines Montani
c052017025
Fix sparse checkout and error handling
2020-09-14 14:12:58 +02:00
Ines Montani
416deb412f
Prevent duplicate traceback on CalledProcessError [ci skip]
2020-09-13 19:28:54 +02:00
Ines Montani
f8846c198d
Update types and docstrings
2020-09-13 10:52:02 +02:00
Ines Montani
3e83a509bb
WIP: fix project clone compatibility
2020-09-10 15:49:13 +02:00
Matthew Honnibal
b470062153
Add CLI registry (#6037)
2020-09-08 15:23:34 +02:00
Ines Montani
5afe6447cd
registry.assets -> registry.misc
2020-09-03 17:31:14 +02:00
Ines Montani
45f46a5c85
Merge pull request #5993 from explosion/feature/disabled-components
2020-08-29 15:58:41 +02:00
Ines Montani
34146750d4
Use frozen list with custom errors
...
We don't want to break backwards compatibility too much but we also want to provide the best possible UX
2020-08-29 15:20:11 +02:00
Ines Montani
5de3f8604d
Update spacy/util.py
...
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-29 13:17:06 +02:00
Ines Montani
cad988da7f
Allow component decorators to re-run with same function
2020-08-28 16:27:22 +02:00
Ines Montani
3ce5be4b76
Allow loaded but disabled components
2020-08-28 15:20:14 +02:00
Sofie Van Landeghem
79d460e3a2
Weights & Biases logger for train CLI (#5971)
...
* quick test as part of train script
* train_logger in config, default ConsoleLogger in loggers catalogue
* entitiy typo
* add wandb_logger
* cleanup
* Update spacy/cli/train_logger.py
Co-authored-by: Ines Montani <ines@ines.io>
* move loggers to gold.loggers
Co-authored-by: Ines Montani <ines@ines.io>
2020-08-26 15:24:33 +02:00
Matthew Honnibal
77852d2428
Fix run_command for python 3.6
2020-08-26 05:02:43 +02:00
Matthew Honnibal
884cac5fb5
Make run_command backwards compatible
2020-08-26 04:33:42 +02:00
Matthew Honnibal
2771e4f2b3
Fix the git "sparse checkout" functionality (#5973)
...
* Fix the git sparse checkout functionality
* Format
2020-08-26 04:00:14 +02:00
Matthew Honnibal
e559867605
Allow spacy project to push and pull to/from remote storage (#5949)
...
* Add utils for working with remote storage
* WIP add remote_cache for project
* WIP add push and pull commands
* Use pathy in remote_cache
* Update util
* Update remote_cache
* Update util
* Update project assets
* Update pull script
* Update push script
* Fix type annotation in util
* Work on remote storage
* Remove site and env hash
* Fix imports
* Fix type annotation
* Require pathy
* Require pathy
* Fix import
* Add a util to handle project variable substitution
* Import push and pull commands
* Fix pull command
* Fix push command
* Fix tarfile in remote_storage
* Improve printing
* Fiddle with status messages
* Set version to v3.0.0a9
* Draft docs for spacy project remote storages
* Update docs [ci skip]
* Use Thinc config to simplify and unify template variables
* Auto-format
* Don't import Pathy globally for now
Causes slow and annoying Google Cloud warning
* Tidy up test
* Tidy up and update tests
* Update to latest Thinc
* Update docs
* variables -> vars
* Update docs [ci skip]
* Update docs [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2020-08-23 18:32:09 +02:00
Ines Montani
1c3bcfb488
Update docs and util consistency
2020-08-18 01:22:59 +02:00
Ines Montani
3ae5e02f4f
Update docs, types and API consistency
2020-08-17 16:45:24 +02:00
Ines Montani
45f13cbf64
Merge pull request #5916 from explosion/feature/new-thinc-config
2020-08-16 15:24:12 +02:00
Ines Montani
8128e5eb35
Replace lexeme_norm warning with logging
2020-08-14 15:00:52 +02:00
Ines Montani
37814b608d
Remove env_opt and simplify default Optimizer
2020-08-14 14:59:54 +02:00
Ines Montani
67cc39af7f
Update Thinc and include section order
2020-08-14 14:06:22 +02:00
Ines Montani
88b0a96801
Update for new Thinc and adjust config
2020-08-13 17:38:30 +02:00
Ines Montani
913d21f0a3
Merge pull request #5882 from explosion/feature/raise-from
...
Use "raise ... from" in custom errors for better tracebacks
2020-08-06 00:35:26 +02:00
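For context on the "raise ... from" pattern named in this entry, here is a minimal sketch in plain Python. The error class `ModelLoadError` and the helper `load_config` are hypothetical names made up for illustration; they are not taken from the repository and do not show its actual error handling.

```python
class ModelLoadError(Exception):
    """Hypothetical custom error, standing in for a library-specific one."""


def load_config(path):
    """Hypothetical helper: read a file, re-raising failures as a custom error."""
    try:
        with open(path, encoding="utf8") as file_:
            return file_.read()
    except OSError as err:
        # "from err" stores the original exception on __cause__, so the
        # traceback reports it as the direct cause of the custom error
        # instead of as an unrelated error raised during handling.
        raise ModelLoadError(f"Can't read config at {path}") from err
```

With explicit chaining, the traceback prints the original `OSError` followed by "The above exception was the direct cause of the following exception", and the custom error keeps a reference to it on `__cause__`.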
Ines Montani
d92954ac1d
Merge pull request #5881 from explosion/feature/better-error-model-shortcuts
2020-08-06 00:13:35 +02:00
Ines Montani
56c17973aa
Use "raise ... from" in custom errors for better tracebacks
2020-08-05 23:53:21 +02:00
Ines Montani
5cc0d89fad
Simplify config overrides in CLI and deserialization (#5880)
2020-08-05 23:35:09 +02:00
Ines Montani
2a1fa86a0d
Add better error for failed model shortcut loading
2020-08-05 23:10:29 +02:00
Ines Montani
823e533dc1
Add config callbacks for modifying nlp object before and after init (#5866)
...
* WIP: Concept for modifying nlp object before and after init
* Make callbacks return nlp object
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
* Raise if callbacks don't return correct type
* Rename, update types, add after_pipeline_creation
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-05 19:47:54 +02:00
Ines Montani
e68459296d
Tidy up and auto-format
2020-08-05 16:00:59 +02:00
Ines Montani
b795f02fbd
Allow adding pipeline components from source model (#5857)
...
* Allow adding pipeline components from source model
* Config: name -> component
* Improve error messages
* Fix error and test
* Add frozen components and exclude logic
* Remove exclude from Language.evaluate
* Init sourced components with current vocab
* Fix error codes
2020-08-04 23:39:19 +02:00
Matthew Honnibal
ecb3c4e8f4
Create corpus iterator and batcher from registry during training (#5865)
...
* Move batchers into their own module (and registry)
* Update CLI
* Update Corpus and batcher
* Update tests
* Update one config
* Merge 'evaluation' block back under [training]
* Import batchers in gold __init__
* Fix batchers
* Update config
* Update schema
* Update util
* Don't assume train and dev are actually paths
* Update onto-joint config
* Fix missing import
* Format
* Format
* Update spacy/gold/corpus.py
Co-authored-by: Ines Montani <ines@ines.io>
* Fix name
* Update default config
* Fix get_length option in batchers
* Update test
* Add comment
* Pass path into Corpus
* Update docstring
* Update schema and configs
* Update config
* Fix test
* Fix paths
* Fix print
* Fix create_train_batches
* [training.read_train] -> [training.train_corpus]
* Update onto-joint config
Co-authored-by: Ines Montani <ines@ines.io>
2020-08-04 15:09:37 +02:00
Ines Montani
e9e8fa2466
Update docs and types
2020-07-31 17:02:54 +02:00
Matthew Honnibal
1784c95827
Clean up link_vectors_to_models unused stuff
2020-07-29 14:01:11 +02:00
Matthew Honnibal
0c17ea4c85
Format
2020-07-29 14:00:13 +02:00
Matthew Honnibal
7852a68a75
Fix load_vectors_into_model function
2020-07-29 14:00:13 +02:00
Matthew Honnibal
df95e2af64
Add load_vectors_into_model util
2020-07-29 14:00:12 +02:00
Matthew Honnibal
acc64e138a
Add import
2020-07-29 14:00:11 +02:00
Matthew Honnibal
cb9654e98c
WIP on new StaticVectors
2020-07-29 14:00:09 +02:00
Ines Montani
ba22111ff4
Move error to Errors
2020-07-28 16:24:14 +02:00
Ines Montani
b83ead5bf5
Merge pull request #5824 from svlandeg/fix/textcat-v3
2020-07-28 15:04:25 +02:00
Ines Montani
ae4d8a6ffd
Update docstrings, docs and pipe consistency
2020-07-28 13:37:31 +02:00
svlandeg
61068e0fb1
util function dot_to_object and corresponding unit test
2020-07-27 17:50:12 +02:00
Adriane Boyd
8bb0507777
Add and update score methods and score weights
...
Add and update `score` methods, provided `scores`, and default weights
`default_score_weights` for pipeline components.
* `scores` provides all top-level keys returned by `score` (merely informative, similar to `assigns`).
* `default_score_weights` provides the default weights for a default config.
* The keys from `default_score_weights` determine which values will be
shown in the `spacy train` output, so keys with weight `0.0` will be
displayed but not counted toward the overall score.
2020-07-27 14:44:53 +02:00
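The weighting behaviour described in this entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the function `weighted_score` is a hypothetical helper, and the score keys and values are made-up sample data.

```python
from typing import Dict


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Hypothetical helper: combine per-key scores into one overall score."""
    # Each score contributes in proportion to its weight; a key that is
    # missing from `scores` is treated as 0.0.
    return sum(scores.get(key, 0.0) * weight for key, weight in weights.items())


# Illustrative values only. "speed" has weight 0.0, so it appears in the
# report below but contributes nothing to the overall score.
weights = {"tag_acc": 0.5, "ents_f": 0.5, "speed": 0.0}
scores = {"tag_acc": 0.9, "ents_f": 0.8, "speed": 12000.0}

for key in weights:
    print(f"{key}: {scores[key]}")
print(f"overall: {weighted_score(scores, weights):.2f}")
```

The keys of the weights mapping decide which values are reported, while the weights themselves decide how much each value counts, which is why a zero-weight key is still displayed.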
Adriane Boyd
f8cf378be9
Combine weights from multiple components
...
Combine weights from multiple components for the same score.
2020-07-27 10:21:31 +02:00
Ines Montani
2470486543
Allow pipeline components to set default scores and weights
2020-07-26 13:18:43 +02:00
Ines Montani
e92df281ce
Tidy up, autoformat, add types
2020-07-25 15:01:15 +02:00
Ines Montani
c003d26b94
Tidy up
2020-07-25 12:21:37 +02:00
Ines Montani
8d9d28eb8b
Re-add setting for vocab data and tidy up
2020-07-25 12:14:28 +02:00
Ines Montani
b9aaa4e457
Improve vocab data integration and warning
2020-07-25 11:51:30 +02:00