Commit Graph

598 Commits

Author SHA1 Message Date
Ines Montani
5d235fb767 Merge branch 'develop' into feature/project-cli 2020-06-25 12:27:58 +02:00
Ines Montani
01c394eb23 Update to latest Typer and remove hacks 2020-06-25 12:27:19 +02:00
Ines Montani
82a03ee18e Replace python with sys.executable 2020-06-25 12:26:53 +02:00
Ines Montani
8131a65dee Update __init__.py 2020-06-22 16:09:09 +02:00
Ines Montani
2ad7a02400 Merge branch 'develop' into feature/project-cli 2020-06-22 15:33:11 +02:00
Ines Montani
0ee6d7a4d1 Remove project stuff from this branch 2020-06-22 14:54:38 +02:00
Ines Montani
a6b76440b7 Update project CLI 2020-06-22 14:53:31 +02:00
Ines Montani
3f2f5f9cb3 Remove ml_datasets from install dependencies 2020-06-22 12:14:51 +02:00
Ines Montani
dc5d535659 Tidy up info 2020-06-22 01:17:11 +02:00
Ines Montani
189ed56777 Fix and simplify info 2020-06-22 01:07:48 +02:00
Ines Montani
fca3907d4e Add correct uppercase variants for boolean flags 2020-06-22 00:57:28 +02:00
Ines Montani
79dd824906 Tidy up 2020-06-22 00:45:40 +02:00
Ines Montani
1e5b4d8524 Fix DVC check 2020-06-22 00:30:05 +02:00
Ines Montani
5ba1df5e78 Update project CLI 2020-06-22 00:15:06 +02:00
Ines Montani
275bab62df Refactor CLI 2020-06-21 21:35:01 +02:00
Ines Montani
c12713a8be Port CLI to Typer and add project stubs 2020-06-21 13:44:00 +02:00
Ines Montani
988d2a4eda
Add --code-path option to train CLI (#5618) 2020-06-20 18:43:12 +02:00
Ines Montani
8283df80e9 Tidy up and auto-format 2020-06-20 14:15:04 +02:00
Matthew Honnibal
a1c5b694be Small fixes to train defaults 2020-06-12 02:22:13 +02:00
Sofie Van Landeghem
c0f4a1e43b
train is from-config by default (#5575)
* verbose and tag_map options

* adding init_tok2vec option and only changing the tok2vec that is specified

* adding omit_extra_lookups and verifying textcat config

* wip

* pretrain bugfix

* add replace and resume options

* train_textcat fix

* raw text functionality

* improve UX when KeyError or when input data can't be parsed

* avoid unnecessary access to goldparse in TextCat pipe

* save performance information in nlp.meta

* add noise_level to config

* move nn_parser's defaults to config file

* multitask in config - doesn't work yet

* scorer offering both F and AUC options, need to be specified in config

* add textcat verification code from old train script

* small fixes to config files

* clean up

* set default config for ner/parser to allow create_pipe to work as before

* two more test fixes

* small fixes

* cleanup

* fix NER pickling + additional unit test

* create_pipe as before
2020-06-12 02:02:07 +02:00
Matthew Honnibal
8411d4f4e6
Merge pull request #5543 from svlandeg/feature/pretrain-config
pretrain from config
2020-06-04 19:07:12 +02:00
svlandeg
3ade455fd3 formatting 2020-06-04 16:09:55 +02:00
svlandeg
776d4f1190 cleanup 2020-06-04 16:07:30 +02:00
svlandeg
6b027d7689 remove duplicate model definition of tok2vec layer 2020-06-04 15:49:23 +02:00
svlandeg
1775f54a26 small little fixes 2020-06-03 22:17:02 +02:00
svlandeg
07886a3de3 rename init_tok2vec to resume 2020-06-03 22:00:25 +02:00
svlandeg
4ed6278663 small fixes to pretrain config, init_tok2vec TODO 2020-06-03 19:32:40 +02:00
svlandeg
ffe0451d09 pretrain from config 2020-06-03 14:45:00 +02:00
Ines Montani
810fce3bb1 Merge branch 'develop' into master-tmp 2020-06-03 14:36:59 +02:00
Adriane Boyd
10d938f221 Update default cfg dir in train CLI 2020-06-03 14:15:50 +02:00
Adriane Boyd
f1f9c8b417 Port train CLI updates
Updates from #5362 and fix from #5387:

* `train`:

  * if training on GPU, only run evaluation/timing on CPU in the first
    iteration

  * if training is aborted, exit with a non-0 exit status
2020-06-03 14:03:43 +02:00
svlandeg
e91485dfc4 add discard_oversize parameter, move optimizer to training subsection 2020-06-03 10:04:16 +02:00
svlandeg
03c58b488c prevent infinite loop, custom warning 2020-06-03 10:00:21 +02:00
Ines Montani
b5ae2edcba
Merge pull request #5516 from explosion/feature/improve-model-version-deps 2020-05-31 12:54:01 +02:00
Ines Montani
b7aff6020c Make functions more general purpose and update docstrings and tests 2020-05-30 15:18:53 +02:00
Ines Montani
a7e370bcbf Don't override spaCy version 2020-05-30 15:03:18 +02:00
Ines Montani
e47e5a4b10 Use more sophisticated version parsing logic 2020-05-30 15:01:58 +02:00
Ines Montani
4fd087572a WIP: improve model version deps 2020-05-28 12:51:37 +02:00
Matthw Honnibal
58750b06f8 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-05-27 22:18:36 +02:00
Ines Montani
1a15896ba9 unicode -> str consistency [ci skip] 2020-05-24 18:51:10 +02:00
Ines Montani
5d3806e059 unicode -> str consistency 2020-05-24 17:20:58 +02:00
Ines Montani
f9786d765e Simplify is_package check 2020-05-24 14:48:56 +02:00
Matthw Honnibal
2d9de8684d Support use_pytorch_for_gpu_memory config 2020-05-22 23:10:40 +02:00
Ines Montani
6e6db6afb6 Better model compatibility and validation 2020-05-22 15:42:46 +02:00
Matthw Honnibal
3b5cfec1fc Tweak memory management in train_from_config 2020-05-21 19:32:04 +02:00
Ines Montani
24f72c669c Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
Matthew Honnibal
e6c4c1a507
Merge pull request #5468 from adrianeboyd/feature/cli-conllu-misc-ner
Improve handling of NER in CoNLL-U MISC
2020-05-21 16:39:46 +02:00
Matthew Honnibal
cad9b290a2
Merge branch 'master' into feature/omit-extra-lexeme-info 2020-05-21 16:04:24 +02:00
Ines Montani
d8f3190c0a Tidy up and auto-format 2020-05-21 14:14:01 +02:00
adrianeboyd
d45602bc11
Merge branch 'master' into feature/omit-extra-lexeme-info 2020-05-21 10:26:01 +02:00