Matthew Honnibal
8716ffe57d
Serialize vocab last
2017-09-24 05:01:45 -05:00
Matthew Honnibal
5a7fd0fd36
Fix vector linkage
2017-09-22 20:11:52 -05:00
Matthew Honnibal
4348c479fc
Merge pre-trained vectors and noshare patches
2017-09-22 20:07:28 -05:00
Matthew Honnibal
7dc61b3f43
Whitespace
2017-09-22 20:00:50 -05:00
Matthew Honnibal
20193371f5
Don't share CNN, to reduce complexities
2017-09-21 14:59:48 +02:00
Matthew Honnibal
b832f89ff8
Add resume_training function
2017-09-20 19:15:20 -05:00
Matthew Honnibal
c858927271
Copy vectors to GPU on begin training
2017-09-18 18:04:16 -05:00
Matthew Honnibal
43210abacc
Resolve fine-tuning conflict
2017-09-17 05:30:04 -05:00
Matthew Honnibal
e37a50a436
Pass documents to tensorizer, not 'features'
2017-09-16 12:46:36 -05:00
Matthew Honnibal
70da88a3a7
Update comment on Language.begin_training
2017-09-14 16:18:30 +02:00
Matthew Honnibal
78a5f842e9
Fix update when update_shared=False
2017-08-20 15:58:34 -05:00
Matthew Honnibal
8875590081
Add optimizer in Language.update if sgd=None
2017-08-20 14:42:07 +02:00
Matthew Honnibal
a3c51a0355
Fix creation of pipeline
2017-08-19 21:58:57 +02:00
Matthew Honnibal
97aabafb5f
Document as_tuples keyword arg of Language.pipe
2017-08-19 12:21:33 +02:00
Matthew Honnibal
11c31d285c
Restore changes from nn-beam-parser
2017-08-18 22:26:12 +02:00
Matthew Honnibal
52c180ecf5
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit ea8de11ad5
, reversing
changes made to 08e443e083
.
2017-08-14 13:00:23 +02:00
Matthew Honnibal
4363b4aa4a
Fix redundant tokvecs updates during update
2017-08-13 12:36:55 +02:00
Matthew Honnibal
0acce0521b
Fix Language.update for pipeline
2017-08-06 14:13:03 +02:00
Matthew Honnibal
0eec7c9e9b
Fix Language.evaluate
2017-08-06 02:18:31 +02:00
Matthew Honnibal
cc19ea0e7c
Add update_tensors flag to Language.update. Experimental, re #1182
2017-08-06 02:17:10 +02:00
Matthew Honnibal
2e00361522
Fix update when 0 docs
2017-08-01 22:10:17 +02:00
Matthew Honnibal
523b0df2c9
Update text classification model
2017-07-25 18:57:59 +02:00
Matthew Honnibal
d8aa721664
Compute Language.meta with a property
2017-07-23 00:50:18 +02:00
Matthew Honnibal
baa3d81c35
Add text categorizer to Language
2017-07-22 01:13:36 +02:00
Matthew Honnibal
836bfa2d0f
Add factory for experimental SimilarityHook component
2017-06-05 15:40:22 +02:00
Matthew Honnibal
2479cde446
Support disable keyword in Language.__init__
2017-06-05 13:13:07 +02:00
Matthew Honnibal
8f8f90b46b
Disable labeller if not parsing
2017-06-04 20:18:54 -05:00
Matthew Honnibal
939e8ed567
Add lookup properties for components in Language
2017-06-04 15:52:09 -05:00
Matthew Honnibal
92ae36f84e
Improve way noun chunks iterator is looked up
2017-06-04 21:53:39 +02:00
Matthew Honnibal
21eef90dbc
Support specifying which GPU
2017-06-03 16:10:23 -05:00
Matthew Honnibal
fea1144e6d
Set max batch size in evaluate
2017-06-03 13:31:33 -05:00
ines
a3e4f91f4a
Only load vocab if it exists
2017-06-01 14:38:35 +02:00
Matthew Honnibal
33e5ec737f
Fix to/from disk methods
2017-05-31 13:43:10 +02:00
Matthew Honnibal
1e6df0a2a1
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-29 14:30:12 -05:00
ines
6145fe6a93
Catch all kwargs on Language
2017-05-29 20:43:48 +02:00
Matthew Honnibal
9c9ee24411
Fix broken lambda scoping in Python 2
2017-05-29 13:23:28 -05:00
Matthew Honnibal
aa4c33914b
Work on serialization
2017-05-29 08:40:45 -05:00
Matthew Honnibal
7b06bb896e
Fix for serialization
2017-05-29 13:42:55 +02:00
Matthew Honnibal
74235587ef
Fix to serialization
2017-05-29 13:40:31 +02:00
Matthew Honnibal
59f355d525
Fixes for serialization
2017-05-29 13:38:20 +02:00
Matthew Honnibal
ff26aa6c37
Work on to/from bytes/disk serialization methods
2017-05-29 11:45:45 +02:00
Matthew Honnibal
8a24c60c1e
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-28 08:12:05 -05:00
Matthew Honnibal
bc97bc292c
Fix __call__ method
2017-05-28 08:11:58 -05:00
Matthew Honnibal
b082f76494
Randomize pipeline order during training
2017-05-27 18:32:21 -05:00
Matthew Honnibal
73a643d32a
Don't randomise pipeline for training, and don't update if no gradient
2017-05-27 08:20:13 -05:00
Matthew Honnibal
8af3100143
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-05-26 11:31:41 -05:00
ines
353f0ef8d7
Use disable argument (list) for serialization
2017-05-26 12:33:54 +02:00
Matthew Honnibal
dbf2a4cf57
Update all models on each epoch
2017-05-25 19:46:56 -05:00
Matthew Honnibal
82b11b0320
Remove print statement
2017-05-25 17:15:59 -05:00
Matthew Honnibal
f403c2cd5f
Add env opts for optimizer
2017-05-25 11:19:26 -05:00
Matthew Honnibal
8500d9b1da
Only train one task per iter, holding grads
2017-05-25 06:47:42 -05:00
Matthew Honnibal
e6cc927ab1
Rearrange multi-task learning
2017-05-24 20:10:54 -05:00
Matthew Honnibal
9adfe9e8fc
Don't hold gradient updates in language -- let the parser decide how to batch the updates.
2017-05-23 04:29:10 -05:00
Matthew Honnibal
3959d778ac
Revert "Revert "WIP on improving parser efficiency""
...
This reverts commit 532afef4a8
.
2017-05-23 03:06:53 -05:00
Matthew Honnibal
532afef4a8
Revert "WIP on improving parser efficiency"
...
This reverts commit bdaac7ab44
.
2017-05-23 03:05:25 -05:00
Matthew Honnibal
bdaac7ab44
WIP on improving parser efficiency
2017-05-23 02:59:31 -05:00
ines
54f04a9fe0
Update API docs with changes in spacy.gold and spacy.language
2017-05-22 12:29:30 +02:00
Matthew Honnibal
9262fc4829
Fix syntax error
2017-05-22 05:14:59 -05:00
Matthew Honnibal
2a5eb9f61e
Make nonproj methods top-level functions, instead of class methods
2017-05-22 04:51:08 -05:00
Matthew Honnibal
5738d373d5
Add deprojectivize to pipeline
2017-05-22 04:51:08 -05:00
Matthew Honnibal
8d1e64be69
Add experimental NeuralLabeller
2017-05-22 04:51:08 -05:00
Matthew Honnibal
5db89053aa
Merge docstrings
2017-05-21 13:46:23 -05:00
Matthew Honnibal
432b3499b3
Fix memory leak
2017-05-21 13:38:46 -05:00
Matthew Honnibal
4c9202249d
Refactor training, to fix memory leak
2017-05-21 09:07:06 -05:00
ines
d82ae9a585
Change "function" to "callable" in docs
2017-05-21 13:17:40 +02:00
Matthew Honnibal
3b7c108246
Pass tokvecs through as a list, instead of concatenated. Also fix padding
2017-05-20 13:23:32 -05:00
Matthew Honnibal
66ea9aebe7
Remove the state argument from Language
2017-05-19 13:25:42 -05:00
ines
2c8c9dc0c9
Update docstrings and API docs for Language
2017-05-19 18:47:24 +02:00
ines
d42bc16868
Update docstrings and API docs for Language class
2017-05-18 23:57:38 +02:00
Matthew Honnibal
c2c825127a
Fix use_params and pipe methods
2017-05-18 08:30:59 -05:00
Matthew Honnibal
2713041571
Fix GPU usage in Language
2017-05-18 04:25:19 -05:00
Matthew Honnibal
793430aa7a
Get spaCy train command working with neural network
...
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
8cf097ca88
Redesign training to integrate NN components
...
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
.begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
5211645af3
Get data flowing through pipeline. Needs redesign
2017-05-16 11:21:59 +02:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
Matthew Honnibal
9e167b7bb6
Strip serializer from code
2017-05-09 17:28:50 +02:00
ines
ea5fa46475
Import LEX_ATTRS from lang.lex_attrs
2017-05-09 00:58:10 +02:00
ines
6eb6306843
Fix language data imports
2017-05-08 23:58:31 +02:00
Matthew Honnibal
d0e19267e8
Create directory if missing in save_to_directory
2017-04-23 21:24:43 +02:00
Matthew Honnibal
4d2a659c52
Fix json dump for Python3
2017-04-23 17:05:53 +02:00
ines
ddd5194088
Update Language docs and docstrings
2017-04-17 01:52:13 +02:00
ines
f62b740961
Use compat.json_dumps
2017-04-17 01:46:14 +02:00
ines
8e83f8e2fa
Update docstrings
2017-04-17 01:40:26 +02:00
ines
e2299dc389
Ensure path in save_to_directory
2017-04-17 01:40:14 +02:00
Matthew Honnibal
4efd6fb9d6
Fix training
2017-04-16 15:28:27 -05:00
Matthew Honnibal
89a4f262fc
Fix training methods
2017-04-16 13:00:37 -05:00
ines
c05ec4b89a
Add compat functions and remove old workarounds
...
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines
d24589aa72
Clean up imports, unused code, whitespace, docstrings
2017-04-15 12:05:47 +02:00
ines
561f2a3eb4
Use consistent formatting for docstrings
2017-04-15 11:59:21 +02:00
Matthew Honnibal
33ba5066eb
Refactor Language.end_training, making new save_to_directory method
2017-04-14 23:51:24 +02:00
oeg
010293fb2f
fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli
2017-04-06 17:33:15 +02:00
Matthew Honnibal
47a3ef06a6
Unhack deprojetivization, moving it into pipeline
...
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898 .
2017-03-31 12:31:50 +02:00
Matthew Honnibal
83ba6c247c
Fix init of Language without model
2017-03-26 16:46:00 +02:00
Raphaël Bournhonesque
f332bf05be
Remove unused import statements
2017-03-21 21:08:54 +01:00
ines
9605cf39cc
Handle default path in Language classes
2017-03-18 12:58:45 +01:00
Matthew Honnibal
8843b84bd1
Merge remote-tracking branch 'origin/develop-downloads'
2017-03-16 12:00:42 -05:00
ines
618ce3b425
Add .meta to Language object
...
Allows getting the current model's meta data, e.g.:
nlp = spacy.load('my-model')
print(nlp.meta)
2017-03-16 17:14:56 +01:00
Matthew Honnibal
b382dc902c
Add morph rules in Language
2017-03-15 09:24:40 -05:00
Matthew Honnibal
f70be44746
Use lemmatizer in code, not from downloaded model.
2017-03-15 04:52:50 -05:00
Matthew Honnibal
f71eeef9bb
Pass path argument to end_training
2017-03-09 18:42:40 -06:00