Matthew Honnibal
648532d647
Don't assume blas methods are present
2018-03-16 02:48:20 +01:00
Matthew Honnibal
e101f10ef0
Fix header
2018-03-13 02:12:16 +01:00
Matthew Honnibal
d55620041b
Switch parser to gemm from thinc.openblas
2018-03-13 02:10:58 +01:00
Matthew Honnibal
4b72c38556
Fix dropout bug in beam parser
2018-03-10 23:16:40 +01:00
Matthew Honnibal
3d6487c734
Support dropout in beam parse
2018-03-10 22:41:55 +01:00
Matthew Honnibal
14f729c72a
Add subtok label to parser
2018-02-26 12:26:35 +01:00
Matthew Honnibal
7137ad8b0b
Make label filtering clearer for projectivisation
2018-02-26 12:02:01 +01:00
Matthew Honnibal
7b66ec896a
Revert "Revert "Improve parser oracle around sentence breaks.""
This reverts commit 36e481c584.
2018-02-26 10:57:37 +01:00
Matthew Honnibal
36e481c584
Revert "Improve parser oracle around sentence breaks."
This reverts commit 50817dc9ad.
2018-02-26 10:53:55 +01:00
Matthew Honnibal
50817dc9ad
Improve parser oracle around sentence breaks.
2018-02-22 19:22:26 +01:00
Matthew Honnibal
661873ee4c
Randomize the rebatch size in parser
2018-02-21 21:02:07 +01:00
Matthew Honnibal
a0ddb803fd
Make error when no label found more helpful
2018-02-21 16:00:59 +01:00
Matthew Honnibal
ea2fc5d45f
Improve length and freq cutoffs in parser
2018-02-21 16:00:38 +01:00
Matthew Honnibal
e5757d4bf0
Add labels property to parser
2018-02-21 16:00:00 +01:00
Matthew Honnibal
eff4ae809a
Fix nonproj label filter
2018-02-21 15:59:04 +01:00
Matthew Honnibal
e624405cda
Temporarily remove cutoff when filtering labels in nonproj
2018-02-21 13:53:40 +01:00
Matthew Honnibal
8f06903e09
Fix multitask objectives
2018-02-17 18:41:36 +01:00
Matthew Honnibal
d1246c95fb
Fix model loading when using multitask objectives
2018-02-17 18:11:36 +01:00
Matthew Honnibal
7d5c720fc3
Fix multitask objective when no pipeline provided
2018-02-15 23:50:21 +01:00
Matthew Honnibal
59b7cf9db8
Add get_beam_parse method in ArcEager, for Prodigy
2018-02-15 21:03:16 +01:00
Claudiu-Vlad Ursache
e28de12cbd
Ensure files opened in from_disk are closed
Fixes [issue 1706](https://github.com/explosion/spaCy/issues/1706).
2018-02-13 20:49:43 +01:00
Matthew Honnibal
e361b4f82b
Fix #1929: Incorrect NER when sentence boundaries are pre-set.
2018-02-08 15:25:41 +01:00
Matthew Honnibal
f74a802d09
Test and fix #1919: Error resuming training
2018-02-02 02:32:40 +01:00
Matthew Honnibal
85c942a6e3
Don't overwrite pretrained_dims setting from cfg. Fixes #1727
2018-01-23 19:10:49 +01:00
Matthew Honnibal
fe4748fc38
Merge pull request #1870 from avadhpatel/master
Model Load Performance Improvement by more than 5x
2018-01-22 00:05:15 +01:00
Avadh Patel
a517df55c8
Small fix
Signed-off-by: Avadh Patel <avadh4all@gmail.com>
2018-01-21 15:20:45 -06:00
Avadh Patel
5b5029890d
Merge branch 'perfTuning' into perfTuningMaster
Signed-off-by: Avadh Patel <avadh4all@gmail.com>
2018-01-21 15:20:00 -06:00
Matthew Honnibal
203d2ea830
Allow multitask objectives to be added to the parser and NER more easily
2018-01-21 19:37:02 +01:00
Avadh Patel
75903949da
Updated model building after suggestion from Matthew
Signed-off-by: Avadh Patel <avadh4all@gmail.com>
2018-01-18 06:51:57 -06:00
Avadh Patel
fe879da2a1
Do not train model if it's going to be loaded from disk
This saves significant time in loading a model from disk.
Signed-off-by: Avadh Patel <avadh4all@gmail.com>
2018-01-17 06:16:07 -06:00
Avadh Patel
2146faffee
Do not train model if it's going to be loaded from disk
This saves significant time in loading a model from disk.
Signed-off-by: Avadh Patel <avadh4all@gmail.com>
2018-01-17 06:04:22 -06:00
Matthew Honnibal
f29c3925ee
Fix more efficient nonproj
2017-11-23 12:48:00 +00:00
Matthew Honnibal
db5c714ad2
Improve efficiency of deprojectivization
2017-11-23 12:31:34 +00:00
Matthew Honnibal
d274d3a3b9
Let beam forward use minibatches
2017-11-15 00:51:42 +01:00
Matthew Honnibal
855872f872
Remove state hashing
2017-11-14 23:36:46 +01:00
Matthew Honnibal
2512ea9eeb
Fix memory leak in beam parser
2017-11-14 02:11:40 +01:00
Matthew Honnibal
ca73d0d8fe
Cleanup states after beam parsing, explicitly
2017-11-13 18:18:26 +01:00
Matthew Honnibal
63ef9a2e73
Remove __dealloc__ from ParserBeam
2017-11-13 18:18:08 +01:00
Matthew Honnibal
25859dbb48
Return optimizer from begin_training, creating if necessary
2017-11-06 14:26:49 +01:00
Matthew Honnibal
2b35bb76ad
Fix tensorizer on GPU
2017-11-05 15:34:40 +01:00
Matthew Honnibal
3ca16ddbd4
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-11-04 00:25:02 +01:00
Matthew Honnibal
98c29b7912
Add padding vector in parser, to make gradient more correct
2017-11-04 00:23:23 +01:00
Matthew Honnibal
13c8881d2f
Expose parser's tok2vec model component
2017-11-03 20:20:59 +01:00
Matthew Honnibal
7fea845374
Remove print statement
2017-11-03 14:04:51 +01:00
Matthew Honnibal
a5b05f85f0
Set Doc.tensor attribute in parser
2017-11-03 11:21:00 +01:00
Matthew Honnibal
7698903617
Fix GPU usage
2017-10-31 02:33:16 +01:00
Matthew Honnibal
a0c7dabb72
Fix bug in 8-token parser features
2017-10-28 23:01:35 +00:00
Matthew Honnibal
b713d10d97
Switch to 13 features in parser
2017-10-28 23:01:14 +00:00
Matthew Honnibal
5414e2f14b
Use missing features in parser
2017-10-28 16:45:54 +00:00
Matthew Honnibal
64e4ff7c4b
Merge 'tidy-up' changes into branch. Resolve conflicts
2017-10-28 13:16:06 +02:00
Explosion Bot
b22e42af7f
Merge changes to parser and _ml
2017-10-28 11:52:10 +02:00
ines
b4d226a3f1
Tidy up syntax
2017-10-27 19:45:57 +02:00
ines
9c89e2cdef
Remove unused syntax iterators (now in language data)
2017-10-27 18:09:53 +02:00
ines
e33b7e0b3c
Tidy up parser and ML
2017-10-27 14:39:30 +02:00
Matthew Honnibal
531142a933
Merge remote-tracking branch 'origin/develop' into feature/better-parser
2017-10-27 12:34:48 +00:00
Matthew Honnibal
75a637fa43
Remove redundant imports from _ml
2017-10-27 10:19:56 +00:00
Matthew Honnibal
bb25bdcd92
Adjust call to scatter_add for the new version
2017-10-27 01:16:55 +00:00
Matthew Honnibal
90d1d9b230
Remove obsolete parser code
2017-10-26 13:22:45 +02:00
Matthew Honnibal
33f8c58782
Remove obsolete parser.pyx
2017-10-26 12:42:05 +02:00
Matthew Honnibal
35977bdbb9
Update better-parser branch with develop
2017-10-26 00:55:53 +00:00
ines
18aae423fb
Remove import of non-existing function
2017-10-25 15:54:10 +02:00
ines
5117a7d24d
Fix whitespace
2017-10-25 15:54:02 +02:00
Matthew Honnibal
075e8118ea
Update from develop
2017-10-25 12:45:21 +02:00
Matthew Honnibal
dd5b2d8fa3
Check for out-of-memory when calling calloc. Closes #1446
2017-10-24 12:40:47 +02:00
Matthew Honnibal
e7556ff048
Fix non-maxout parser
2017-10-23 18:16:23 +02:00
Matthew Honnibal
f111b228e0
Fix re-parsing of previously parsed text
If a Doc object had been previously parsed, it was possible for
invalid parses to be added. There were two problems:
1) The parse was only being partially erased
2) The RightArc action was able to create a 1-cycle.
This patch fixes both errors, and avoids resetting the parse if one is
present. In theory this might allow a better parse to be predicted by
running the parser twice.
Closes #1253.
2017-10-20 16:27:36 +02:00
Matthew Honnibal
1036798155
Make parser consistent if maxout==1
2017-10-20 16:24:16 +02:00
Matthew Honnibal
827cd8a883
Fix support of maxout pieces in parser
2017-10-20 03:07:17 +02:00
Matthew Honnibal
a8850b4282
Remove redundant PrecomputableMaxouts class
2017-10-19 20:27:34 +02:00
Matthew Honnibal
b00d0a2c97
Fix bias in parser
2017-10-19 18:42:11 +02:00
Matthew Honnibal
b54b4b8a97
Make parser_maxout_pieces hyper-param work
2017-10-19 13:45:18 +02:00
Matthew Honnibal
15e5a04a8d
Clean up more depth=0 conditional code
2017-10-19 01:48:43 +02:00
Matthew Honnibal
906c50ac59
Fix loop typing, that caused error on windows
2017-10-19 01:48:39 +02:00
Matthew Honnibal
960788aaa2
Eliminate dead code in parser, and raise errors for obsolete options
2017-10-19 00:42:34 +02:00
Matthew Honnibal
bbfd7d8d5d
Clean up parser multi-threading
2017-10-19 00:25:21 +02:00
Matthew Honnibal
f018f2030c
Try optimized parser forward loop
2017-10-18 21:48:00 +02:00
Matthew Honnibal
633a75c7e0
Break parser batches into sub-batches, sorted by length.
2017-10-18 21:45:01 +02:00
Matthew Honnibal
908f44c3fe
Disable history features by default
2017-10-12 14:56:11 +02:00
Matthew Honnibal
cecfcc7711
Set default hyper params back to 'slow' settings
2017-10-12 13:12:26 +02:00
Matthew Honnibal
807e109f2b
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-11 02:47:59 -05:00
Matthew Honnibal
6e552c9d83
Prune number of non-projective labels more aggressively
2017-10-11 02:46:44 -05:00
Matthew Honnibal
188f620046
Improve parser defaults
2017-10-11 09:43:48 +02:00
Matthew Honnibal
3065f12ef2
Make add parser label work for hidden_depth=0
2017-10-10 22:57:31 +02:00
Matthew Honnibal
8265b90c83
Update parser defaults
2017-10-09 21:55:20 -05:00
Matthew Honnibal
09d61ada5e
Merge pull request #1396 from explosion/feature/pipeline-management
💫 Improve pipeline and factory management
2017-10-10 04:29:54 +02:00
Matthew Honnibal
d8a2506023
Merge pull request #1401 from explosion/feature/add-parser-action
💫 Allow labels to be added to pre-trained parser and NER models
2017-10-09 04:57:51 +02:00
Matthew Honnibal
d43a83e37a
Allow parser.add_label for pretrained models
2017-10-09 03:35:40 +02:00
Matthew Honnibal
4cc84b0234
Prohibit Break when sent_start < 0
2017-10-09 00:02:45 +02:00
Matthew Honnibal
e938bce320
Adjust parsing transition system to allow preset sentence segments.
2017-10-08 23:53:34 +02:00
Matthew Honnibal
20309fb9db
Make history features default to zero
2017-10-08 20:32:14 +02:00
Matthew Honnibal
42b401d08b
Change default hidden depth to 1
2017-10-07 21:05:21 -05:00
Matthew Honnibal
92c5d78b42
Unhack NER.add_action
2017-10-07 19:02:40 +02:00
Matthew Honnibal
3d22ccf495
Update default hyper-parameters
2017-10-07 07:16:41 -05:00
Matthew Honnibal
0384f08218
Trigger nonproj.deprojectivize as a postprocess
2017-10-07 02:00:47 +02:00
Matthew Honnibal
8be46d766e
Remove print statement
2017-10-06 16:19:02 -05:00
Matthew Honnibal
8e731009fe
Fix parser config serialization
2017-10-06 13:50:52 -05:00
Matthew Honnibal
16ba6aa8a6
Fix parser config serialization
2017-10-06 13:17:31 -05:00
Matthew Honnibal
c66399d8ae
Fix depth definition with history features
2017-10-06 06:20:05 -05:00
Matthew Honnibal
5c750a9c2f
Reserve 0 for 'missing' in history features
2017-10-06 06:10:13 -05:00
Matthew Honnibal
21d11936fe
Fix significant train/test skew error in history feats
2017-10-06 06:08:50 -05:00