Commit Graph

1628 Commits

Author SHA1 Message Date
Matthew Honnibal
b3b180010b Tmp. Working on NN NER. 2016-09-08 13:00:13 +02:00
Matthew Honnibal
2bfe184692 Fiddle with nll loss in parser 2016-09-04 16:56:47 +02:00
Matthew Honnibal
b73295557c Fix regression on cost 2016-08-30 11:00:27 +02:00
Matthew Honnibal
1a1b2f9174 Restore use of Example object in parser.train 2016-08-29 14:26:43 +02:00
Matthew Honnibal
14f3da2a2e Fix set_features on _neural 2016-08-29 14:26:07 +02:00
Matthew Honnibal
52d2702782 Fix set_features on tagger 2016-08-29 14:25:22 +02:00
Matthew Honnibal
136a7a2322 Update beam_parser 2016-08-29 14:24:14 +02:00
Matthew Honnibal
be85b7f17f Don't do dropout in NN at the moment 2016-08-10 03:42:09 +02:00
Matthew Honnibal
0fb188c76c Minibatch beam candidates, for faster decoding 2016-08-08 01:38:50 +02:00
Matthew Honnibal
eb145dc1b8 Relax minimum gradient by factor 10. Important for learning 2016-08-08 01:38:06 +02:00
Matthew Honnibal
d3b0447898 Fix minimum gradient and beam density 2016-08-06 16:23:42 +02:00
Matthew Honnibal
2db43e9662 Pass parameter for gradient noise 2016-08-05 18:25:38 +02:00
Matthew Honnibal
d1511e816a Shuffle histories in beam parser 2016-08-05 18:25:16 +02:00
Matthew Honnibal
de82552a13 Add config for beam density 2016-08-05 18:24:54 +02:00
Matthew Honnibal
a664aa8180 Fix beam_parser for new API 2016-07-31 19:03:10 +02:00
Matthew Honnibal
2f09b041d1 Reset is_valid and costs during beam training 2016-07-31 19:02:45 +02:00
Matthew Honnibal
3e46b491b9 Update call to beam_parser for new thinc API 2016-07-31 11:43:23 +02:00
Matthew Honnibal
86862f3586 Update parser.pyx for new thinc API 2016-07-31 11:43:04 +02:00
Matthew Honnibal
ff36cd43df Fix call to updateC 2016-07-31 11:42:44 +02:00
Matthew Honnibal
5869f05bd6 Update tagger for new thinc API 2016-07-31 11:41:48 +02:00
Matthew Honnibal
25513b8389 Remove use of ExampleC from beam parser 2016-07-29 19:58:49 +02:00
Matthew Honnibal
6b912731f8 Refactor model for beam parser, to avoid conditionals on model type 2016-07-29 19:33:01 +02:00
Matthew Honnibal
2dda2ecdbd Merge branch 'master' of ssh://github.com/spacy-io/spaCy into july16 2016-07-27 03:15:36 +02:00
Matthew Honnibal
eb8234181c Tmp 2016-07-27 02:56:50 +02:00
Matthew Honnibal
ac63274e15 Tmp 2016-07-27 02:56:36 +02:00
Matthew Honnibal
6a98a3142f More work on beam parser. 2016-07-26 19:13:39 +02:00
Matthew Honnibal
1ee6b468a9 * Adjust arc_eager oracle, so that recovering errors via non-monotonic actions gives negative cost. Need to test this with greedy parser. 2016-07-26 19:12:00 +02:00
Matthew Honnibal
0bf448461e Work on beam parser, with max violation 2016-07-24 14:26:52 +02:00
Matthew Honnibal
a1281835a8 Clean up commented out code from beam parser. 2016-07-24 11:02:39 +02:00
Matthew Honnibal
476977ef62 Start work on max violation update. About to clean up commented out code. 2016-07-24 11:01:54 +02:00
Matthew Honnibal
8b4abc24e3 Fix beam parsing. Starting to work with early update. 2016-07-24 10:45:50 +02:00
Matthew Honnibal
27176c3d2f Fix beam parser. Starting to work 2016-07-24 01:14:56 +02:00
Matthew Honnibal
e2a9a68b66 * Work on beam parser 2016-07-23 06:07:09 +02:00
Matthew Honnibal
de7c6c48d8 Working NN, but very messy. Relies on BLIS. 2016-07-20 16:28:02 +02:00
Adam Ever Hadani
f1c0762443 exit code 0 for when downloading a model that already was downloaded 2016-07-13 16:22:14 -07:00
Matthew Honnibal
7c2f1a673b * Working neural net, but features hacky. Switching to extractor. 2016-05-26 19:06:10 +02:00
Matthew Honnibal
cdc10e9a1c * Fix Issue #375: noun phrase iteration results in index error if noun phrases are merged during the loop. Fix by accumulating the spans inside the noun_chunks property, allowing the Span index tricks to work. 2016-05-20 10:14:06 +02:00
Matthew Honnibal
13fad36e49 * Cosmetic change to english noun chunks iterator -- use enumerate instead of range loop 2016-05-20 10:11:05 +02:00
Matthew Honnibal
02276cc444 Merge branch 'master' of ssh://github.com/spacy-io/spaCy 2016-05-17 16:56:22 +02:00
Matthew Honnibal
4d7f5468bb * Change Language class to use a .pipeline attribute, instead of having the pipeline hard coded 2016-05-17 16:55:42 +02:00
Daylen Yang
5405e7dd73 Fix get_lang_class parsing (take 2) 2016-05-16 16:40:31 -07:00
Matthew Honnibal
b240104f40 Revert "Fix get_lang_class parsing" 2016-05-17 08:04:26 +10:00
Daylen Yang
1692c2df3c Fix get_lang_class parsing. We want the get_lang_class to return "en" for both "en" and "en_glove_cc_300_1m_vectors". Changed the split rule to "_" so that this happens. 2016-05-16 14:38:20 -07:00
Matthew Honnibal
17137f5c0c * Fix issue #372: mistake in Lexeme rich comparison 2016-05-12 12:58:57 +02:00
Matthew Honnibal
cc8bf62208 * Fix Issue #360: Tokenizer failed when the infix regex matched the start of the string while trying to tokenize multi-infix tokens. 2016-05-09 13:23:47 +02:00
Matthew Honnibal
c61ee8f9fa * Increment version 2016-05-09 13:20:00 +02:00
Matthew Honnibal
5d86c30f0b * Fix Issue #367: Missing has_vector property on Doc and Span objects 2016-05-09 12:36:14 +02:00
Wolfgang Seeker
7b78239436 add fix for German noun chunk iterator (issue #365) 2016-05-06 01:41:26 +02:00
Matthew Honnibal
8c0888d6cb * Fix error in span.sent 2016-05-06 00:28:05 +02:00
Matthew Honnibal
bb94022975 * Fix Issue #365: Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags. 2016-05-06 00:21:05 +02:00