Matthew Honnibal
|
943af4423a
|
Make depth setting in parser work again
|
2017-10-04 20:06:05 -05:00 |
|
Matthew Honnibal
|
246612cb53
|
Merge remote-tracking branch 'origin/develop' into feature/parser-history-model
|
2017-10-03 16:56:42 -05:00 |
|
Matthew Honnibal
|
5454b20cd7
|
Update thinc imports for 6.9
|
2017-10-03 20:07:17 +02:00 |
|
Matthew Honnibal
|
4a59f6358c
|
Fix thinc imports
|
2017-10-03 19:21:26 +02:00 |
|
Matthew Honnibal
|
dc3c791947
|
Fix history size option
|
2017-10-03 13:41:23 +02:00 |
|
Matthew Honnibal
|
278a4c17c6
|
Fix history features
|
2017-10-03 13:27:10 +02:00 |
|
Matthew Honnibal
|
b50a359e11
|
Add support for history features in parsing models
|
2017-10-03 12:44:01 +02:00 |
|
Matthew Honnibal
|
ee41e4fea7
|
Support history features in stateclass
|
2017-10-03 12:43:48 +02:00 |
|
Matthew Honnibal
|
cdb2d83e16
|
Pass dropout in parser
|
2017-09-28 18:47:13 -05:00 |
|
Matthew Honnibal
|
158e177cae
|
Fix default embed size
|
2017-09-28 08:25:23 -05:00 |
|
Matthew Honnibal
|
1a37a2c0a0
|
Update training defaults
|
2017-09-27 11:48:07 -05:00 |
|
Matthew Honnibal
|
3274b46a0d
|
Try to fix compile error on Windows
|
2017-09-26 09:05:53 -05:00 |
|
Matthew Honnibal
|
5056743ad5
|
Fix parser serialization
|
2017-09-26 06:44:56 -05:00 |
|
Matthew Honnibal
|
bf917225ab
|
Allow multi-task objectives during training
|
2017-09-26 05:42:52 -05:00 |
|
Matthew Honnibal
|
4348c479fc
|
Merge pre-trained vectors and noshare patches
|
2017-09-22 20:07:28 -05:00 |
|
Matthew Honnibal
|
0795857dcb
|
Fix beam parsing
|
2017-09-23 02:59:53 +02:00 |
|
Matthew Honnibal
|
d9124f1aa3
|
Add link_vectors_to_models function
|
2017-09-22 09:38:22 -05:00 |
|
Matthew Honnibal
|
20193371f5
|
Don't share CNN, to reduce complexities
|
2017-09-21 14:59:48 +02:00 |
|
Matthew Honnibal
|
24e85c2048
|
Pass values for CNN maxout pieces option
|
2017-09-20 19:16:12 -05:00 |
|
Matthew Honnibal
|
2489dcaccf
|
Fix serialization of parser
|
2017-09-19 23:42:12 +02:00 |
|
Matthew Honnibal
|
2b0efc77ae
|
Fix wiring of pre-trained vectors in parser loading
|
2017-09-17 05:47:34 -05:00 |
|
Matthew Honnibal
|
31c2e91c35
|
Fix wiring of pre-trained vectors in parser loading
|
2017-09-17 05:46:55 -05:00 |
|
Matthew Honnibal
|
c003c561c3
|
Revert NER action loading change, for model compatibility
|
2017-09-17 05:46:03 -05:00 |
|
Matthew Honnibal
|
43210abacc
|
Resolve fine-tuning conflict
|
2017-09-17 05:30:04 -05:00 |
|
Matthew Honnibal
|
5ff2491f24
|
Pass option for pre-trained vectors in parser
|
2017-09-16 12:47:21 -05:00 |
|
Matthew Honnibal
|
8665a77f48
|
Fix feature error in NER
|
2017-09-16 12:46:57 -05:00 |
|
Matthew Honnibal
|
f730d07e4e
|
Fix prange error for Windows
|
2017-09-16 00:25:33 +02:00 |
|
Matthew Honnibal
|
8b481e0465
|
Remove redundant brackets
|
2017-09-15 10:38:08 +02:00 |
|
Matthew Honnibal
|
8c503487af
|
Fix lookup of missing NER actions
|
2017-09-14 16:59:45 +02:00 |
|
Matthew Honnibal
|
664c5af745
|
Revert padding in parser
|
2017-09-14 16:59:25 +02:00 |
|
Matthew Honnibal
|
c6395b057a
|
Improve parser feature extraction, for missing values
|
2017-09-14 16:18:02 +02:00 |
|
Matthew Honnibal
|
daf869ab3b
|
Fix add_action for NER, so labelled 'O' actions aren't added
|
2017-09-14 16:16:41 +02:00 |
|
Matthew Honnibal
|
dd9cab0faf
|
Fix type-check for int/long
|
2017-09-06 19:03:05 +02:00 |
|
Matthew Honnibal
|
dcbf866970
|
Merge parser changes
|
2017-09-06 18:41:05 +02:00 |
|
Matthew Honnibal
|
24ff6b0ad9
|
Fix parsing and tok2vec models
|
2017-09-06 05:50:58 -05:00 |
|
Matthew Honnibal
|
33fa91feb7
|
Restore correctness of parser model
|
2017-09-04 21:19:30 +02:00 |
|
Matthew Honnibal
|
9d65d67985
|
Preserve model compatibility in parser, for now
|
2017-09-04 16:46:22 +02:00 |
|
Matthew Honnibal
|
789e1a3980
|
Use 13 parser features, not 8
|
2017-08-31 14:13:00 -05:00 |
|
Matthew Honnibal
|
4ceebde523
|
Fix gradient bug in parser
|
2017-08-30 17:32:56 -05:00 |
|
Matthew Honnibal
|
44589fb38c
|
Fix Break oracle
|
2017-08-25 19:50:55 -05:00 |
|
Matthew Honnibal
|
20dd66ddc2
|
Constrain sentence boundaries to IS_PUNCT and IS_SPACE tokens
|
2017-08-25 19:35:47 +02:00 |
|
Matthew Honnibal
|
682346dd66
|
Restore optimized hidden_depth=0 for parser
|
2017-08-21 19:18:04 -05:00 |
|
Matthew Honnibal
|
62878e50db
|
Fix misalignment caused by filtering inputs at wrong point in parser
|
2017-08-20 15:59:28 -05:00 |
|
Matthew Honnibal
|
84b7ed49e4
|
Ensure updates aren't made if no gold available
|
2017-08-20 14:41:38 +02:00 |
|
Matthew Honnibal
|
ab28f911b4
|
Fix parser learning rates
|
2017-08-19 09:02:57 -05:00 |
|
Matthew Honnibal
|
c307a0ffb8
|
Restore patches from nn-beam-parser to spacy/syntax
|
2017-08-18 22:38:59 +02:00 |
|
Matthew Honnibal
|
5f81d700ff
|
Restore patches from nn-beam-parser to spacy/syntax
|
2017-08-18 22:23:03 +02:00 |
|
Matthew Honnibal
|
d456d2efe1
|
Fix conflicts in nn_parser
|
2017-08-18 20:55:58 +02:00 |
|
Matthew Honnibal
|
1cec1efca7
|
Fix merge conflicts in nn_parser from beam stuff
|
2017-08-18 20:50:49 +02:00 |
|
Matthew Honnibal
|
426f84937f
|
Resolve conflicts when merging new beam parsing stuff
|
2017-08-18 13:38:32 -05:00 |
|
Matthew Honnibal
|
f75420ae79
|
Unhack beam parsing, moving it under options instead of global flags
|
2017-08-18 13:31:15 -05:00 |
|
Matthew Honnibal
|
0209a06b4e
|
Update beam parser
|
2017-08-16 18:25:49 -05:00 |
|
Matthew Honnibal
|
a6d8d7c82e
|
Add is_gold_parse method to transition system
|
2017-08-16 18:24:09 -05:00 |
|
Matthew Honnibal
|
3533bb61cb
|
Add option of 8 feature parse state
|
2017-08-16 18:23:27 -05:00 |
|
Matthew Honnibal
|
210f6d5175
|
Fix efficiency error in batch parse
|
2017-08-15 03:19:03 -05:00 |
|
Matthew Honnibal
|
23537a011d
|
Tweaks to beam parser
|
2017-08-15 03:15:28 -05:00 |
|
Matthew Honnibal
|
500e92553d
|
Fix memory error when copying scores in beam
|
2017-08-15 03:15:04 -05:00 |
|
Matthew Honnibal
|
a8e4064dd8
|
Fix tensor gradient in parser
|
2017-08-15 03:14:36 -05:00 |
|
Matthew Honnibal
|
e420e0366c
|
Remove use of hash function in beam parser
|
2017-08-15 03:13:57 -05:00 |
|
Matthew Honnibal
|
52c180ecf5
|
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5 , reversing
changes made to 08e443e083 .
|
2017-08-14 13:00:23 +02:00 |
|
Matthew Honnibal
|
0ae045256d
|
Fix beam training
|
2017-08-13 18:02:05 -05:00 |
|
Matthew Honnibal
|
6a42cc16ff
|
Fix beam parser, improve efficiency of non-beam
|
2017-08-13 12:37:26 +02:00 |
|
Matthew Honnibal
|
12de263813
|
Bug fixes to beam parsing. Learns small sample
|
2017-08-13 09:33:39 +02:00 |
|
Matthew Honnibal
|
17874fe491
|
Disable beam parsing
|
2017-08-12 19:35:40 -05:00 |
|
Matthew Honnibal
|
3e30712b62
|
Improve defaults
|
2017-08-12 19:24:17 -05:00 |
|
Matthew Honnibal
|
28e930aae0
|
Fixes for beam parsing. Not working
|
2017-08-12 19:22:52 -05:00 |
|
Matthew Honnibal
|
c96d769836
|
Fix beam parse. Not sure if working
|
2017-08-12 18:21:54 -05:00 |
|
Matthew Honnibal
|
4638f4b869
|
Fix beam update
|
2017-08-12 17:15:16 -05:00 |
|
Matthew Honnibal
|
d4308d2363
|
Initialize State offset to 0
|
2017-08-12 17:14:39 -05:00 |
|
Matthew Honnibal
|
b353e4d843
|
Work on parser beam training
|
2017-08-12 14:47:45 -05:00 |
|
Matthew Honnibal
|
cd5ecedf6a
|
Try drop_layer in parser
|
2017-08-12 08:56:33 -05:00 |
|
Matthew Honnibal
|
1a59db1c86
|
Fix dropout and learn rate in parser
|
2017-08-12 05:44:39 -05:00 |
|
Matthew Honnibal
|
d01dc3704a
|
Adjust parser model
|
2017-08-09 20:06:33 -05:00 |
|
Matthew Honnibal
|
f37528ef58
|
Pass embed size for parser fine-tune. Use SELU
|
2017-08-09 17:52:53 -05:00 |
|
Matthew Honnibal
|
bbace204be
|
Gate parser fine-tuning behind feature flag
|
2017-08-09 16:40:42 -05:00 |
|
Matthew Honnibal
|
dbdd8afc4b
|
Fix parser fine-tune training
|
2017-08-08 15:46:07 -05:00 |
|
Matthew Honnibal
|
88bf1cf87c
|
Update parser for fine tuning
|
2017-08-08 15:34:17 -05:00 |
|
Matthew Honnibal
|
42bd26f6f3
|
Give parser its own tok2vec weights
|
2017-08-06 18:33:46 +02:00 |
|
Matthew Honnibal
|
78498a072d
|
Return Transition for missing actions in lookup_action
|
2017-08-06 14:16:36 +02:00 |
|
Matthew Honnibal
|
bfffdeabb2
|
Fix parser batch-size bug introduced during cleanup
|
2017-08-06 14:10:48 +02:00 |
|
Matthew Honnibal
|
7f876a7a82
|
Clean up some unused code in parser
|
2017-08-06 00:00:21 +02:00 |
|
Matthew Honnibal
|
8fce187de4
|
Fix ArcEager for missing values
|
2017-08-01 22:10:05 +02:00 |
|
Matthew Honnibal
|
27abc56e98
|
Add method to get beam entities
|
2017-07-29 21:59:02 +02:00 |
|
Matthew Honnibal
|
c86445bdfd
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-07-22 01:14:28 +02:00 |
|
Matthew Honnibal
|
3da1063b36
|
Add beam decoding to parser, to allow NER uncertainties
|
2017-07-20 15:02:55 +02:00 |
|
Matthew Honnibal
|
0ca5832427
|
Improve negative example handling in NER oracle
|
2017-07-20 00:18:49 +02:00 |
|
Tpt
|
57e8254f63
|
Adds function to extract French noun chunks
|
2017-06-12 15:20:49 +02:00 |
|
Matthew Honnibal
|
6d0356e6cc
|
Whitespace
|
2017-06-04 14:55:24 -05:00 |
|
ines
|
6669583f4e
|
Use OrderedDict
|
2017-06-02 21:07:56 +02:00 |
|
ines
|
2f1025a94c
|
Port over Spanish changes from #1096
|
2017-06-02 19:09:58 +02:00 |
|
ines
|
fdd0923be4
|
Translate model=True in exclude to lower_model and upper_model
|
2017-06-02 18:37:07 +02:00 |
|
Matthew Honnibal
|
4c97371051
|
Fixes for thinc 6.7
|
2017-06-01 04:22:16 -05:00 |
|
Matthew Honnibal
|
ae8010b526
|
Move weight serialization to Thinc
|
2017-06-01 02:56:12 -05:00 |
|
Matthew Honnibal
|
097ab9c6e4
|
Fix transition system to/from disk
|
2017-05-31 13:44:00 +02:00 |
|
Matthew Honnibal
|
33e5ec737f
|
Fix to/from disk methods
|
2017-05-31 13:43:10 +02:00 |
|
Matthew Honnibal
|
53a3824334
|
Fix mistake in NER feature
|
2017-05-31 03:01:02 +02:00 |
|
Matthew Honnibal
|
cc911feab2
|
Fix bug in NER state
|
2017-05-30 22:12:19 +02:00 |
|
Matthew Honnibal
|
be4a640f0c
|
Fix arc eager label costs for uint64
|
2017-05-30 20:37:58 +02:00 |
|
Matthew Honnibal
|
aa4c33914b
|
Work on serialization
|
2017-05-29 08:40:45 -05:00 |
|
Matthew Honnibal
|
59f355d525
|
Fixes for serialization
|
2017-05-29 13:38:20 +02:00 |
|
Matthew Honnibal
|
ff26aa6c37
|
Work on to/from bytes/disk serialization methods
|
2017-05-29 11:45:45 +02:00 |
|
Matthew Honnibal
|
6b019b0540
|
Update to/from bytes methods
|
2017-05-29 10:14:20 +02:00 |
|
Matthew Honnibal
|
9239f06ed3
|
Fix German noun chunks iterator
|
2017-05-28 20:13:03 +02:00 |
|
Matthew Honnibal
|
fd9b6722a9
|
Fix noun chunks iterator for new stringstore
|
2017-05-28 20:12:10 +02:00 |
|
Matthew Honnibal
|
7996d21717
|
Fixes for new StringStore
|
2017-05-28 11:09:27 -05:00 |
|
Matthew Honnibal
|
8a24c60c1e
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-28 08:12:05 -05:00 |
|
Matthew Honnibal
|
bc97bc292c
|
Fix __call__ method
|
2017-05-28 08:11:58 -05:00 |
|
Matthew Honnibal
|
84e66ca6d4
|
WIP on stringstore change. 27 failures
|
2017-05-28 14:06:40 +02:00 |
|
Matthew Honnibal
|
39293ab2ee
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-28 11:46:57 +02:00 |
|
Matthew Honnibal
|
dd052572d4
|
Update arc eager for SBD changes
|
2017-05-28 11:46:51 +02:00 |
|
Matthew Honnibal
|
c1263a844b
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-27 18:32:57 -05:00 |
|
Matthew Honnibal
|
9e711c3476
|
Divide d_loss by batch size
|
2017-05-27 18:32:46 -05:00 |
|
Matthew Honnibal
|
a1d4c97fb7
|
Improve correctness of minibatching
|
2017-05-27 17:59:00 -05:00 |
|
Matthew Honnibal
|
49235017bf
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-05-27 16:34:28 -05:00 |
|
Matthew Honnibal
|
7ebd26b8aa
|
Use ordered dict to specify transitions
|
2017-05-27 15:52:20 -05:00 |
|
Matthew Honnibal
|
3eea5383a1
|
Add move_names property to parser
|
2017-05-27 15:51:55 -05:00 |
|
Matthew Honnibal
|
99316fa631
|
Use ordered dict to specify actions
|
2017-05-27 15:50:21 -05:00 |
|
Matthew Honnibal
|
655ca58c16
|
Clarifying change to StateC.clone
|
2017-05-27 15:49:37 -05:00 |
|
Matthew Honnibal
|
3d22fcaf0b
|
Return None from parser if there are no annotations
|
2017-05-26 14:02:59 -05:00 |
|
Matthew Honnibal
|
3d5a536eaa
|
Improve efficiency of parser batching
|
2017-05-26 11:31:23 -05:00 |
|
Matthew Honnibal
|
2cb7cc2db7
|
Remove commented code from parser
|
2017-05-25 14:55:09 -05:00 |
|
Matthew Honnibal
|
c245ff6b27
|
Rebatch parser inputs, with mid-sentence states
|
2017-05-25 11:18:59 -05:00 |
|
Matthew Honnibal
|
679efe79c8
|
Make parser update less hacky
|
2017-05-25 06:49:00 -05:00 |
|
Matthew Honnibal
|
e1cb5be0c7
|
Adjust dropout, depth and multi-task in parser
|
2017-05-24 20:11:41 -05:00 |
|
Matthew Honnibal
|
620df0414f
|
Fix dropout in parser
|
2017-05-23 15:20:45 -05:00 |
|
Matthew Honnibal
|
8026c183d0
|
Add hacky logic to accelerate depth=0 case in parser
|
2017-05-23 11:06:49 -05:00 |
|
Matthew Honnibal
|
a8b6d11c5b
|
Support optional maxout layer
|
2017-05-23 05:58:07 -05:00 |
|
Matthew Honnibal
|
c55b8fa7c5
|
Fix bugs in parse_batch
|
2017-05-23 05:57:52 -05:00 |
|
Matthew Honnibal
|
964707d795
|
Restore support for deeper networks in parser
|
2017-05-23 05:31:13 -05:00 |
|
Matthew Honnibal
|
6b918cc58e
|
Support making updates periodically during training
|
2017-05-23 04:23:29 -05:00 |
|
Matthew Honnibal
|
3f725ff7b3
|
Roll back changes to parser update
|
2017-05-23 04:23:05 -05:00 |
|
Matthew Honnibal
|
3959d778ac
|
Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8 .
|
2017-05-23 03:06:53 -05:00 |
|
Matthew Honnibal
|
532afef4a8
|
Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44 .
|
2017-05-23 03:05:25 -05:00 |
|
Matthew Honnibal
|
bdaac7ab44
|
WIP on improving parser efficiency
|
2017-05-23 02:59:31 -05:00 |
|
Matthew Honnibal
|
8a9e318deb
|
Put the parsing loop in a nogil prange block
|
2017-05-22 17:58:12 -05:00 |
|
Matthew Honnibal
|
e2136232f9
|
Exclude states with no matching gold annotations from parsing
|
2017-05-22 10:30:12 -05:00 |
|
Matthew Honnibal
|
f00f821496
|
Fix pseudoprojectivity->nonproj
|
2017-05-22 06:14:42 -05:00 |
|
Matthew Honnibal
|
5d59e74cf6
|
PseudoProjectivity->nonproj
|
2017-05-22 05:49:53 -05:00 |
|
Matthew Honnibal
|
b45b4aa392
|
PseudoProjectivity --> nonproj
|
2017-05-22 05:17:44 -05:00 |
|
Matthew Honnibal
|
aae97f00e9
|
Fix nonproj import
|
2017-05-22 05:15:06 -05:00 |
|
Matthew Honnibal
|
2a5eb9f61e
|
Make nonproj methods top-level functions, instead of class methods
|
2017-05-22 04:51:08 -05:00 |
|
Matthew Honnibal
|
33e2222839
|
Remove unused code in deprojectivize
|
2017-05-22 04:51:08 -05:00 |
|
Matthew Honnibal
|
025d9bbc37
|
Fix handling of non-projective deps
|
2017-05-22 04:51:08 -05:00 |
|
Matthew Honnibal
|
1b5fa68996
|
Do pseudo-projective pre-processing for parser
|
2017-05-22 04:51:08 -05:00 |
|
Matthew Honnibal
|
1d5d9838a2
|
Fix action collection for parser
|
2017-05-22 04:51:08 -05:00 |
|
Matthew Honnibal
|
3b7c108246
|
Pass tokvecs through as a list, instead of concatenated. Also fix padding
|
2017-05-20 13:23:32 -05:00 |
|
Matthew Honnibal
|
d52b65aec2
|
Revert "Move to contiguous buffer for token_ids and d_vectors"
This reverts commit 3ff8c35a79 .
|
2017-05-20 11:26:23 -05:00 |
|
Matthew Honnibal
|
b272890a8c
|
Try to move parser to simpler PrecomputedAffine class. Currently broken -- maybe the previous change
|
2017-05-20 06:40:10 -05:00 |
|
Matthew Honnibal
|
3ff8c35a79
|
Move to contiguous buffer for token_ids and d_vectors
|
2017-05-20 04:17:30 -05:00 |
|
Matthew Honnibal
|
8b04b0af9f
|
Remove freqs from transition_system
|
2017-05-20 02:20:48 -05:00 |
|