Commit Graph

399 Commits

Author SHA1 Message Date
Matthew Honnibal
136a7a2322 Update beam_parser 2016-08-29 14:24:14 +02:00
Matthew Honnibal
be85b7f17f Don't do dropout in NN at the moment 2016-08-10 03:42:09 +02:00
Matthew Honnibal
0fb188c76c Minibatch beam candidates, for faster decoding 2016-08-08 01:38:50 +02:00
Matthew Honnibal
eb145dc1b8 Relax minimum gradient by factor 10. Important for learning 2016-08-08 01:38:06 +02:00
Matthew Honnibal
d3b0447898 Fix minimum gradient and beam density 2016-08-06 16:23:42 +02:00
Matthew Honnibal
2db43e9662 Pass parameter for gradient noise 2016-08-05 18:25:38 +02:00
Matthew Honnibal
d1511e816a Shuffle histories in beam parser 2016-08-05 18:25:16 +02:00
Matthew Honnibal
de82552a13 Add config for beam density 2016-08-05 18:24:54 +02:00
Matthew Honnibal
a664aa8180 Fix beam_parser for new API 2016-07-31 19:03:10 +02:00
Matthew Honnibal
2f09b041d1 Reset is_valid and costs during beam training 2016-07-31 19:02:45 +02:00
Matthew Honnibal
3e46b491b9 Update call to beam_parser for new thinc API 2016-07-31 11:43:23 +02:00
Matthew Honnibal
86862f3586 Update parser.pyx for new thinc API 2016-07-31 11:43:04 +02:00
Matthew Honnibal
ff36cd43df Fix call to updateC 2016-07-31 11:42:44 +02:00
Matthew Honnibal
25513b8389 Remove use of ExampleC from beam parser 2016-07-29 19:58:49 +02:00
Matthew Honnibal
6b912731f8 Refactor model for beam parser, to avoid conditionals on model type 2016-07-29 19:33:01 +02:00
Matthew Honnibal
eb8234181c Tmp 2016-07-27 02:56:50 +02:00
Matthew Honnibal
ac63274e15 Tmp 2016-07-27 02:56:36 +02:00
Matthew Honnibal
6a98a3142f More work on beam parser. 2016-07-26 19:13:39 +02:00
Matthew Honnibal
1ee6b468a9 * Adjust arc_eager oracle, so that recovering errors via non-monotonic actions gives negative cost. Need to test this with greedy parser. 2016-07-26 19:12:00 +02:00
Matthew Honnibal
0bf448461e Work on beam parser, with max violation 2016-07-24 14:26:52 +02:00
Matthew Honnibal
a1281835a8 Clean up commented out code from beam parser. 2016-07-24 11:02:39 +02:00
Matthew Honnibal
476977ef62 Start work on max violation update. About to clean up commented out code. 2016-07-24 11:01:54 +02:00
Matthew Honnibal
8b4abc24e3 Fix beam parsing. Starting to work with early update. 2016-07-24 10:45:50 +02:00
Matthew Honnibal
27176c3d2f Fix beam parser. Starting to work 2016-07-24 01:14:56 +02:00
Matthew Honnibal
e2a9a68b66 * Work on beam parser 2016-07-23 06:07:09 +02:00
Matthew Honnibal
de7c6c48d8 Working NN, but very messy. Relies on BLIS. 2016-07-20 16:28:02 +02:00
Matthew Honnibal
7c2f1a673b * Working neural net, but features hacky. Switching to extractor. 2016-05-26 19:06:10 +02:00
Matthew Honnibal
13fad36e49 * Cosmetic change to english noun chunks iterator -- use enumerate instead of range loop 2016-05-20 10:11:05 +02:00
Wolfgang Seeker
7b78239436 add fix for German noun chunk iterator (issue #365) 2016-05-06 01:41:26 +02:00
Matthew Honnibal
bb94022975 * Fix Issue #365: Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags. 2016-05-06 00:21:05 +02:00
Wolfgang Seeker
dbf8f5f3ec fix bug in StateC.set_break() 2016-05-05 15:15:34 +02:00
Wolfgang Seeker
3c44b5dc1a call deprojectivization after parsing 2016-05-05 15:10:36 +02:00
Matthew Honnibal
472f576b82 * Deprojectivize German parses 2016-05-05 15:01:10 +02:00
Wolfgang Seeker
e4ea2bea01 fix whitespace 2016-05-04 07:40:38 +02:00
Wolfgang Seeker
5bf2fd1f78 make the code less cryptic 2016-05-03 17:19:05 +02:00
Wolfgang Seeker
a06fca9fdf German noun chunk iterator now doesn't return tokens more than once 2016-05-03 16:58:59 +02:00
Wolfgang Seeker
7b246c13cb reformulate noun chunk tests for English 2016-05-03 14:24:35 +02:00
Matthew Honnibal
1f1532142f * Fix cost calculation on non-monotonic oracle 2016-05-03 00:21:08 +02:00
Matthew Honnibal
508fd1f6dc * Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples. 2016-05-02 14:25:10 +02:00
Matthew Honnibal
77609588b6 * Fix assignment of root label to words left as root implicitly, after parsing ends. 2016-04-25 19:41:59 +00:00
Matthew Honnibal
7c2d2deaa7 * Revise transition system so that the Break transition retains sole responsibility for setting sentence boundaries. Re Issue #322 2016-04-25 19:41:59 +00:00
Wolfgang Seeker
12024b0b0a bugfix: introducing multiple roots now updates original head's properties; adjust tests to rely less on statistical model 2016-04-20 16:42:41 +02:00
Wolfgang Seeker
b98cc3266d bugfix: iterators now reset properly when called a second time 2016-04-15 17:49:16 +02:00
Wolfgang Seeker
289b10f441 remove some comments 2016-04-14 15:37:51 +02:00
Wolfgang Seeker
d99a9cbce9 different handling of space tokens: space tokens are now always attached to the previous non-space token. There are two exceptions: leading space tokens are attached to the first following non-space token; in input that consists exclusively of space tokens, the last space token is the head of all others. 2016-04-13 15:28:28 +02:00
Wolfgang Seeker
d328e0b4a8 Merge branch 'master' into space_head_bug 2016-04-11 12:11:01 +02:00
Wolfgang Seeker
80bea62842 bugfix in unit test 2016-04-08 16:46:44 +02:00
Wolfgang Seeker
1fe911cdb0 bugfix 2016-04-07 18:19:51 +02:00
Matthew Honnibal
872695759d Merge pull request #306 from wbwseeker/german_noun_chunks: add German noun chunk functionality 2016-04-08 00:54:24 +10:00
Wolfgang Seeker
7195b6742d add restrictions to L-arc and R-arc to prevent space heads 2016-03-28 10:40:52 +02:00