Matthew Honnibal
a44c4c3a5b
Add timer to evaluate
2017-10-03 09:15:35 -05:00
Matthew Honnibal
8902df44de
Fix component disabling during training
2017-10-02 21:07:23 +02:00
Matthew Honnibal
c617d288d8
Update pipeline component names in spaCy train
2017-10-02 17:20:19 +02:00
Matthew Honnibal
f942903429
Improve sentence merging in iob2json
2017-10-02 17:02:10 +02:00
Matthew Honnibal
31681d20e0
Fix concatenation in iob2json converter
2017-10-02 16:50:26 +02:00
Matthew Honnibal
4896ce3320
Remove misleading comment
2017-10-02 00:09:14 +02:00
Matthew Honnibal
94df115a81
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-01 14:06:23 -05:00
Matthew Honnibal
69c7c642c2
Add spacy evaluate
2017-10-01 14:05:04 -05:00
ines
fd1a9225d8
Handle conversion of pipeline components correctly
...
Allow both comma and comma + whitespace as separators
2017-09-29 20:52:56 +02:00
Matthew Honnibal
ac8481a7b0
Print NER loss
2017-09-28 08:05:31 -05:00
Matthew Honnibal
542ebfa498
Improve defaults
2017-09-27 18:54:37 -05:00
Matthew Honnibal
dcb86bdc43
Default batch size to 32
2017-09-27 11:48:19 -05:00
ines
1ff62eaee7
Fix option shortcut to avoid conflict
2017-09-26 17:59:34 +02:00
ines
7fdfb78141
Add version option to cli.train
2017-09-26 17:34:52 +02:00
Matthew Honnibal
698fc0d016
Remove merge artefact
2017-09-26 08:31:37 -05:00
Matthew Honnibal
defb68e94f
Update feature/noshare with recent develop changes
2017-09-26 08:15:14 -05:00
ines
edf7e4881d
Add meta.json option to cli.train and add relevant properties
...
Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packagable.
2017-09-25 19:00:47 +02:00
Matthew Honnibal
204b58c864
Fix evaluation during training
2017-09-24 05:01:03 -05:00
Matthew Honnibal
dc3a623d00
Remove unused update_shared argument
2017-09-24 05:00:37 -05:00
Matthew Honnibal
4348c479fc
Merge pre-trained vectors and noshare patches
2017-09-22 20:07:28 -05:00
Matthew Honnibal
e93d43a43a
Fix training with preset vectors
2017-09-22 20:00:40 -05:00
Matthew Honnibal
a2357cce3f
Set random seed in train script
2017-09-23 02:57:31 +02:00
Matthew Honnibal
0a9016cade
Fix serialization during training
2017-09-21 13:06:45 -05:00
Matthew Honnibal
20193371f5
Don't share CNN, to reduce complexities
2017-09-21 14:59:48 +02:00
Matthew Honnibal
1d73dec8b1
Refactor train script
2017-09-20 19:17:10 -05:00
Matthew Honnibal
a0c4b33d03
Support resuming a model during spacy train
2017-09-18 18:04:47 -05:00
Matthew Honnibal
8496d76224
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-09-14 09:21:20 -05:00
Matthew Honnibal
24ff6b0ad9
Fix parsing and tok2vec models
2017-09-06 05:50:58 -05:00
Matthew Honnibal
e920885676
Fix pickle during train
2017-09-02 12:46:01 -05:00
ines
7e04b7f89c
Fix info text on pipeline in package cli
2017-08-26 18:30:59 +02:00
Matthew Honnibal
876f38c548
Merge pull request #1279 from oroszgy/model_cli_v2
...
Added vector loading to model cli
2017-08-26 15:57:50 +02:00
ines
bb1abbeba5
Only link model if download was successfull
2017-08-23 12:36:31 +02:00
Matthew Honnibal
7be5f30f17
Add profile function
2017-08-21 23:22:49 +02:00
Gyorgy Orosz
b3576bfc86
Added vector leading to model cli
2017-08-20 23:16:12 +02:00
Matthew Honnibal
7a6edeea68
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-20 12:55:39 -05:00
Matthew Honnibal
f2f9229964
Fix name of update_shared flag
2017-08-20 18:19:06 +02:00
Matthew Honnibal
80a5146ec2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-08-20 11:07:08 -05:00
Matthew Honnibal
84bb543e4d
Add gold_preproc flag to cli/train
2017-08-20 11:07:00 -05:00
Gyorgy Orosz
e5344b83a3
Ported model cli from v1
2017-08-19 21:45:23 +02:00
Matthew Honnibal
11c31d285c
Restore changes from nn-beam-parser
2017-08-18 22:26:12 +02:00
Matthew Honnibal
52c180ecf5
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit ea8de11ad5
, reversing
changes made to 08e443e083
.
2017-08-14 13:00:23 +02:00
Matthew Honnibal
4ae0d5e1e6
Set defaults for convert command
2017-08-13 09:03:38 +02:00
ines
d4f2baf7dd
Add create_meta option to package command
...
Re-create meta.json in model directory, even if it exists. Especially
useful when updating existing spaCy models or training with Prodigy.
Ensures user won't end up with multiple "en_core_web_sm" models, and
offers easy way to change the model's name and settings without having
to edit the meta.json file.
2017-08-12 21:44:18 +02:00
Matthew Honnibal
8870d491f1
Remove redundant pickling during training
2017-08-12 08:55:53 -05:00
ines
28e2fec23b
Fix autolinking failure on fresh model install ( resolves #1138 )
...
On fresh install via subprocess, pip.get_installed_distributions()
won't show new model, so is_package check in link command fails.
Solution for now is to get model package path explicitly and pass it to
link command.
2017-08-09 11:52:38 +02:00
Matthew Honnibal
0a566dc320
Add update_tensors flag to Language.update. Experimental, re #1182
2017-08-06 02:18:12 +02:00
György Orosz
62dbf9025c
Fixed conllu converter
2017-06-09 22:53:56 +02:00
ines
03db56f48c
Detect spaCy version and add package title
...
Package title allows customised package names (like spacy-nightly)
2017-06-05 20:11:02 +02:00
Matthew Honnibal
c52fde40f4
Improve train CLI
2017-06-04 20:18:37 -05:00
ines
848e47669e
Fix typo
2017-06-04 20:44:15 +02:00
ines
7b7d46b64e
Fix typo and success message
2017-06-04 13:45:50 +02:00
Matthew Honnibal
21eef90dbc
Support specifying which GPU
2017-06-03 16:10:23 -05:00
Matthew Honnibal
43353b5413
Improve train CLI script
2017-06-03 13:28:20 -05:00
ines
e5ae6ccf4e
Fix typo
2017-06-01 16:46:15 +02:00
Matthew Honnibal
8a693c2605
Write binary file during training
2017-05-31 02:59:18 +02:00
ines
9e83a17e95
Use new model templates
2017-05-29 15:27:24 +02:00
Matthew Honnibal
8a24c60c1e
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-28 08:12:05 -05:00
Matthew Honnibal
5cf47b847b
Handle iob with no tag in converter
2017-05-28 08:11:39 -05:00
ines
c1983621fb
Update util functions for model loading
2017-05-28 00:22:40 +02:00
Matthew Honnibal
49235017bf
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-27 16:34:28 -05:00
Matthew Honnibal
5e4312feed
Evaluate loaded class, to ensure save/load works
2017-05-27 15:47:02 -05:00
Matthew Honnibal
7cc9c3e9a6
Fix convert CLI
2017-05-27 15:44:42 -05:00
ines
1203959625
Add pipeline setting to meta.json generator
2017-05-27 20:02:01 +02:00
ines
086a06e7d7
Fix CLI docstrings and add command as first argument
...
Workaround for Plac
2017-05-27 20:01:46 +02:00
Matthew Honnibal
dc07d72d80
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-27 08:20:40 -05:00
Matthew Honnibal
de13fe0305
Remove length cap on sentences
2017-05-27 08:20:32 -05:00
Matthew Honnibal
d06f235fc9
Fix conflict on convert.py
2017-05-26 11:33:29 -05:00
Matthew Honnibal
2b3b937a04
Fix converter CLI
2017-05-26 11:32:41 -05:00
Matthew Honnibal
5a87bcf35f
Fix converters
2017-05-26 11:32:34 -05:00
Matthew Honnibal
d65f99a720
Improve model saving in train script
2017-05-26 05:52:09 -05:00
Matthew Honnibal
22d7b448a5
Fix convert command
2017-05-25 19:47:12 -05:00
Matthew Honnibal
df8015f05d
Tweaks to train script
2017-05-25 17:15:24 -05:00
Matthew Honnibal
702fe74a4d
Clean up spacy.cli.train
2017-05-25 16:16:30 -05:00
Matthew Honnibal
135a13790c
Disable gold preprocessing
2017-05-24 20:10:20 -05:00
Matthew Honnibal
3959d778ac
Revert "Revert "WIP on improving parser efficiency""
...
This reverts commit 532afef4a8
.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8
Revert "WIP on improving parser efficiency"
...
This reverts commit bdaac7ab44
.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44
WIP on improving parser efficiency
2017-05-23 02:59:31 -05:00
Matthew Honnibal
6e8dce2c05
Fix train command line args
2017-05-22 10:41:39 -05:00
Matthew Honnibal
ae8cf70dc1
Fix CLI train signature
2017-05-22 06:13:39 -05:00
ines
fc3ec733ea
Reduce complexity in CLI
...
Remove now redundant model command and move plac annotations to cli
files
2017-05-22 12:28:58 +02:00
Matthew Honnibal
bc2294d7f1
Add support for fiddly hyper-parameters to train func
2017-05-22 04:51:08 -05:00
Matthew Honnibal
4e0988605a
Pass through non-projective=True
2017-05-22 04:51:08 -05:00
Matthew Honnibal
e14533757b
Use averaged params for evaluation
2017-05-22 04:51:08 -05:00
Matthew Honnibal
5db89053aa
Merge docstrings
2017-05-21 13:46:23 -05:00
Matthew Honnibal
baf3ef0ddc
Remove import of removed train_config script
2017-05-21 09:07:34 -05:00
Matthew Honnibal
4c9202249d
Refactor training, to fix memory leak
2017-05-21 09:07:06 -05:00
ines
0c6c65aa3c
Improve messaging if model linking fails after download
2017-05-21 00:28:37 +02:00
ines
e39ad78267
Resolve model name properly in cli.info
...
Use util.resolve_model_path() to also allow package names and paths.
2017-05-20 12:24:40 +02:00
Matthew Honnibal
3376d4d6e8
Update the train script, fixing GPU memory leak
2017-05-19 18:15:50 -05:00
Matthew Honnibal
08766240c3
Add incomplete iob converter
2017-05-19 13:27:51 -05:00
Matthew Honnibal
09a877886b
WIP on iob converter
2017-05-19 13:24:39 -05:00
Matthew Honnibal
ca70b08661
Fix GPU training and evaluation
2017-05-18 08:30:33 -05:00
Matthew Honnibal
fc8d3a112c
Add util.env_opt support: Can set hyper params through environment variables.
2017-05-18 04:36:53 -05:00
Matthew Honnibal
55dab77de8
Add conversion rule for .conll
2017-05-17 13:13:48 +02:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
...
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
3bf4a28d8d
Use tag in CoNLL converter, not POS
2017-05-17 12:04:33 +02:00
Matthew Honnibal
8cf097ca88
Redesign training to integrate NN components
...
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
.begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
5211645af3
Get data flowing through pipeline. Needs redesign
2017-05-16 11:21:59 +02:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
ines
9d85cda8e4
Fix models error message and use about.__docs_models__ (see #1051 )
2017-05-13 13:05:47 +02:00