Commit Graph

147 Commits

Author SHA1 Message Date
Matthew Honnibal
8d1e64be69 Add experimental NeuralLabeller 2017-05-22 04:51:08 -05:00
Matthew Honnibal
5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
Matthew Honnibal
432b3499b3 Fix memory leak 2017-05-21 13:38:46 -05:00
Matthew Honnibal
4c9202249d Refactor training, to fix memory leak 2017-05-21 09:07:06 -05:00
ines
d82ae9a585 Change "function" to "callable" in docs 2017-05-21 13:17:40 +02:00
Matthew Honnibal
3b7c108246 Pass tokvecs through as a list, instead of concatenated. Also fix padding 2017-05-20 13:23:32 -05:00
Matthew Honnibal
66ea9aebe7 Remove the state argument from Language 2017-05-19 13:25:42 -05:00
ines
2c8c9dc0c9 Update docstrings and API docs for Language 2017-05-19 18:47:24 +02:00
ines
d42bc16868 Update docstrings and API docs for Language class 2017-05-18 23:57:38 +02:00
Matthew Honnibal
c2c825127a Fix use_params and pipe methods 2017-05-18 08:30:59 -05:00
Matthew Honnibal
2713041571 Fix GPU usage in Language 2017-05-18 04:25:19 -05:00
Matthew Honnibal
793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal
8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal
5211645af3 Get data flowing through pipeline. Needs redesign 2017-05-16 11:21:59 +02:00
Matthew Honnibal
a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
Matthew Honnibal
9e167b7bb6 Strip serializer from code 2017-05-09 17:28:50 +02:00
ines
ea5fa46475 Import LEX_ATTRS from lang.lex_attrs 2017-05-09 00:58:10 +02:00
ines
6eb6306843 Fix language data imports 2017-05-08 23:58:31 +02:00
Matthew Honnibal
d0e19267e8 Create directory if missing in save_to_directory 2017-04-23 21:24:43 +02:00
Matthew Honnibal
4d2a659c52 Fix json dump for Python3 2017-04-23 17:05:53 +02:00
ines
ddd5194088 Update Language docs and docstrings 2017-04-17 01:52:13 +02:00
ines
f62b740961 Use compat.json_dumps 2017-04-17 01:46:14 +02:00
ines
8e83f8e2fa Update docstrings 2017-04-17 01:40:26 +02:00
ines
e2299dc389 Ensure path in save_to_directory 2017-04-17 01:40:14 +02:00
Matthew Honnibal
4efd6fb9d6 Fix training 2017-04-16 15:28:27 -05:00
Matthew Honnibal
89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines
c05ec4b89a Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines
d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines
561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
Matthew Honnibal
33ba5066eb Refactor Language.end_training, making new save_to_directory method 2017-04-14 23:51:24 +02:00
oeg
010293fb2f fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli 2017-04-06 17:33:15 +02:00
Matthew Honnibal
47a3ef06a6 Unhack deprojetivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
Matthew Honnibal
83ba6c247c Fix init of Language without model 2017-03-26 16:46:00 +02:00
Raphaël Bournhonesque
f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
ines
9605cf39cc Handle default path in Language classes 2017-03-18 12:58:45 +01:00
Matthew Honnibal
8843b84bd1 Merge remote-tracking branch 'origin/develop-downloads' 2017-03-16 12:00:42 -05:00
ines
618ce3b425 Add .meta to Language object
Allows getting the current model's meta data, e.g.:
nlp = spacy.load('my-model')
print(nlp.meta)
2017-03-16 17:14:56 +01:00
Matthew Honnibal
b382dc902c Add morph rules in Language 2017-03-15 09:24:40 -05:00
Matthew Honnibal
f70be44746 Use lemmatizer in code, not from downloaded model. 2017-03-15 04:52:50 -05:00
Matthew Honnibal
f71eeef9bb Pass path argument to end_training 2017-03-09 18:42:40 -06:00
Matthew Honnibal
cd33b39a04 Fix 2/3 problem for json save/load 2017-03-08 01:39:13 +01:00
Ines Montani
aa876884f0 Revert "Revert "Merge remote-tracking branch 'origin/master'""
This reverts commit fb9d3bb022.
2017-01-09 13:28:13 +01:00
Matthew Honnibal
3679fb43a3 Fix loading of lemmatizer 2016-12-18 17:34:09 +01:00
Ines Montani
b11d8cd3db Merge remote-tracking branch 'origin/organize-language-data' into organize-language-data 2016-12-18 16:57:12 +01:00
Ines Montani
753068f1d5 Use base language data as default 2016-12-18 16:55:25 +01:00
Ines Montani
bcc1d50d09 Remove trailing whitespace 2016-12-18 16:54:52 +01:00
Matthew Honnibal
44f4f008bd Wire up lemmatizer rules for English 2016-12-18 15:50:09 +01:00
Matthew Honnibal
296d33a4fc Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-26 12:36:18 +01:00
Matthew Honnibal
1f6c37c6f5 Fix create_tokenizer when nlp is None 2016-11-26 12:36:04 +01:00
Matthew Honnibal
c7889492f9 Fix model saving error for Python 3 2016-11-25 18:04:30 -06:00