Commit Graph

55 Commits

Author SHA1 Message Date
Matthew Honnibal
e93d43a43a Fix training with preset vectors 2017-09-22 20:00:40 -05:00
Matthew Honnibal
0a9016cade Fix serialization during training 2017-09-21 13:06:45 -05:00
Matthew Honnibal
1d73dec8b1 Refactor train script 2017-09-20 19:17:10 -05:00
Matthew Honnibal
a0c4b33d03 Support resuming a model during spacy train 2017-09-18 18:04:47 -05:00
Matthew Honnibal
8496d76224 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-09-14 09:21:20 -05:00
Matthew Honnibal
24ff6b0ad9 Fix parsing and tok2vec models 2017-09-06 05:50:58 -05:00
Matthew Honnibal
e920885676 Fix pickle during train 2017-09-02 12:46:01 -05:00
Matthew Honnibal
7a6edeea68 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-20 12:55:39 -05:00
Matthew Honnibal
f2f9229964 Fix name of update_shared flag 2017-08-20 18:19:06 +02:00
Matthew Honnibal
84bb543e4d Add gold_preproc flag to cli/train 2017-08-20 11:07:00 -05:00
Matthew Honnibal
11c31d285c Restore changes from nn-beam-parser 2017-08-18 22:26:12 +02:00
Matthew Honnibal
52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal
8870d491f1 Remove redundant pickling during training 2017-08-12 08:55:53 -05:00
Matthew Honnibal
0a566dc320 Add update_tensors flag to Language.update. Experimental, re #1182 2017-08-06 02:18:12 +02:00
Matthew Honnibal
c52fde40f4 Improve train CLI 2017-06-04 20:18:37 -05:00
Matthew Honnibal
21eef90dbc Support specifying which GPU 2017-06-03 16:10:23 -05:00
Matthew Honnibal
43353b5413 Improve train CLI script 2017-06-03 13:28:20 -05:00
Matthew Honnibal
8a693c2605 Write binary file during training 2017-05-31 02:59:18 +02:00
Matthew Honnibal
49235017bf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-27 16:34:28 -05:00
Matthew Honnibal
5e4312feed Evaluate loaded class, to ensure save/load works 2017-05-27 15:47:02 -05:00
ines
086a06e7d7 Fix CLI docstrings and add command as first argument
Workaround for Plac
2017-05-27 20:01:46 +02:00
Matthew Honnibal
de13fe0305 Remove length cap on sentences 2017-05-27 08:20:32 -05:00
Matthew Honnibal
d65f99a720 Improve model saving in train script 2017-05-26 05:52:09 -05:00
Matthew Honnibal
df8015f05d Tweaks to train script 2017-05-25 17:15:24 -05:00
Matthew Honnibal
702fe74a4d Clean up spacy.cli.train 2017-05-25 16:16:30 -05:00
Matthew Honnibal
135a13790c Disable gold preprocessing 2017-05-24 20:10:20 -05:00
Matthew Honnibal
3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
Matthew Honnibal
6e8dce2c05 Fix train command line args 2017-05-22 10:41:39 -05:00
Matthew Honnibal
ae8cf70dc1 Fix CLI train signature 2017-05-22 06:13:39 -05:00
ines
fc3ec733ea Reduce complexity in CLI
Remove now redundant model command and move plac annotations to cli
files
2017-05-22 12:28:58 +02:00
Matthew Honnibal
bc2294d7f1 Add support for fiddly hyper-parameters to train func 2017-05-22 04:51:08 -05:00
Matthew Honnibal
4e0988605a Pass through non-projective=True 2017-05-22 04:51:08 -05:00
Matthew Honnibal
e14533757b Use averaged params for evaluation 2017-05-22 04:51:08 -05:00
Matthew Honnibal
4c9202249d Refactor training, to fix memory leak 2017-05-21 09:07:06 -05:00
Matthew Honnibal
3376d4d6e8 Update the train script, fixing GPU memory leak 2017-05-19 18:15:50 -05:00
Matthew Honnibal
ca70b08661 Fix GPU training and evaluation 2017-05-18 08:30:33 -05:00
Matthew Honnibal
fc8d3a112c Add util.env_opt support: Can set hyper params through environment variables. 2017-05-18 04:36:53 -05:00
Matthew Honnibal
793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
5211645af3 Get data flowing through pipeline. Needs redesign 2017-05-16 11:21:59 +02:00
Matthew Honnibal
a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
ines
59c3b9d4dd Tidy up CLI and fix print functions 2017-05-07 23:25:29 +02:00
Matthew Honnibal
4f9657b42b Fix reporting if no dev data with train 2017-04-23 22:27:10 +02:00
ines
3a9710f356 Pass dev_scores to print_progress correctly (resolves #1008)
Only read scores attribute if command is used with dev_data, otherwise
default dev_scores to empty dict.
2017-04-23 15:58:40 +02:00
Matthew Honnibal
89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines
d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines
9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
Matthew Honnibal
2efdbc08ff Make training work with directories 2017-03-26 08:46:44 -05:00