Matthew Honnibal
136a7a2322
Update beam_parser
2016-08-29 14:24:14 +02:00
Matthew Honnibal
be85b7f17f
Don't do dropout in NN at the moment
2016-08-10 03:42:09 +02:00
Matthew Honnibal
0fb188c76c
Minibatch beam candidates, for faster decoding
2016-08-08 01:38:50 +02:00
Matthew Honnibal
eb145dc1b8
Relax minimum gradient by factor 10. Important for learning
2016-08-08 01:38:06 +02:00
Matthew Honnibal
d3b0447898
Fix minimum gradient and beam density
2016-08-06 16:23:42 +02:00
Matthew Honnibal
2db43e9662
Pass parameter for gradient noise
2016-08-05 18:25:38 +02:00
Matthew Honnibal
d1511e816a
Shuffle histories in beam parser
2016-08-05 18:25:16 +02:00
Matthew Honnibal
de82552a13
Add config for beam density
2016-08-05 18:24:54 +02:00
Matthew Honnibal
a664aa8180
Fix beam_parser for new API
2016-07-31 19:03:10 +02:00
Matthew Honnibal
2f09b041d1
Reset is_valid and costs during beam training
2016-07-31 19:02:45 +02:00
Matthew Honnibal
3e46b491b9
Update call to beam_parser for new thinc API
2016-07-31 11:43:23 +02:00
Matthew Honnibal
86862f3586
Update parser.pyx for new thinc API
2016-07-31 11:43:04 +02:00
Matthew Honnibal
ff36cd43df
Fix call to updateC
2016-07-31 11:42:44 +02:00
Matthew Honnibal
25513b8389
Remove use of ExampleC from beam parser
2016-07-29 19:58:49 +02:00
Matthew Honnibal
6b912731f8
Refactor model for beam parser, to avoid conditionals on model type
2016-07-29 19:33:01 +02:00
Matthew Honnibal
eb8234181c
Tmp
2016-07-27 02:56:50 +02:00
Matthew Honnibal
ac63274e15
Tmp
2016-07-27 02:56:36 +02:00
Matthew Honnibal
6a98a3142f
More work on beam parser.
2016-07-26 19:13:39 +02:00
Matthew Honnibal
1ee6b468a9
* Adjust arc_eager oracle, so that recovering errors via non-monotonic actions gives negative cost. Need to test this with greedy parser.
2016-07-26 19:12:00 +02:00
Matthew Honnibal
0bf448461e
Work on beam parser, with max violation
2016-07-24 14:26:52 +02:00
Matthew Honnibal
a1281835a8
Clean up commented out code from beam parser.
2016-07-24 11:02:39 +02:00
Matthew Honnibal
476977ef62
Start work on max violation update. About to clean up commented out code.
2016-07-24 11:01:54 +02:00
Matthew Honnibal
8b4abc24e3
Fix beam parsing. Starting to work with early update.
2016-07-24 10:45:50 +02:00
Matthew Honnibal
27176c3d2f
Fix beam parser. Starting to work
2016-07-24 01:14:56 +02:00
Matthew Honnibal
e2a9a68b66
* Work on beam parser
2016-07-23 06:07:09 +02:00
Matthew Honnibal
de7c6c48d8
Working NN, but very messy. Relies on BLIS.
2016-07-20 16:28:02 +02:00
Matthew Honnibal
7c2f1a673b
* Working neural net, but features hacky. Switching to extractor.
2016-05-26 19:06:10 +02:00
Matthew Honnibal
13fad36e49
* Cosmetic change to english noun chunks iterator -- use enumerate instead of range loop
2016-05-20 10:11:05 +02:00
Wolfgang Seeker
7b78239436
add fix for German noun chunk iterator (issue #365)
2016-05-06 01:41:26 +02:00
Matthew Honnibal
bb94022975
* Fix Issue #365: Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags.
2016-05-06 00:21:05 +02:00
Wolfgang Seeker
dbf8f5f3ec
fix bug in StateC.set_break()
2016-05-05 15:15:34 +02:00
Wolfgang Seeker
3c44b5dc1a
call deprojectivization after parsing
2016-05-05 15:10:36 +02:00
Matthew Honnibal
472f576b82
* Deprojectivize German parses
2016-05-05 15:01:10 +02:00
Wolfgang Seeker
e4ea2bea01
fix whitespace
2016-05-04 07:40:38 +02:00
Wolfgang Seeker
5bf2fd1f78
make the code less cryptic
2016-05-03 17:19:05 +02:00
Wolfgang Seeker
a06fca9fdf
German noun chunk iterator now doesn't return tokens more than once
2016-05-03 16:58:59 +02:00
Wolfgang Seeker
7b246c13cb
reformulate noun chunk tests for English
2016-05-03 14:24:35 +02:00
Matthew Honnibal
1f1532142f
* Fix cost calculation on non-monotonic oracle
2016-05-03 00:21:08 +02:00
Matthew Honnibal
508fd1f6dc
* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.
2016-05-02 14:25:10 +02:00
Matthew Honnibal
77609588b6
* Fix assignment of root label to words left as root implicitly, after parsing ends.
2016-04-25 19:41:59 +00:00
Matthew Honnibal
7c2d2deaa7
* Revise transition system so that the Break transition retains sole responsibility for setting sentence boundaries. Re Issue #322
2016-04-25 19:41:59 +00:00
Wolfgang Seeker
12024b0b0a
bugfix: introducing multiple roots now updates original head's properties
adjust tests to rely less on statistical model
2016-04-20 16:42:41 +02:00
Wolfgang Seeker
b98cc3266d
bugfix: iterators now reset properly when called a second time
2016-04-15 17:49:16 +02:00
Wolfgang Seeker
289b10f441
remove some comments
2016-04-14 15:37:51 +02:00
Wolfgang Seeker
d99a9cbce9
different handling of space tokens
space tokens are now always attached to the previous non-space token
there are two exceptions:
leading space tokens are attached to the first following non-space token
in input that consists exclusively of space tokens, the last space token
is the head of all others.
2016-04-13 15:28:28 +02:00
Wolfgang Seeker
d328e0b4a8
Merge branch 'master' into space_head_bug
2016-04-11 12:11:01 +02:00
Wolfgang Seeker
80bea62842
bugfix in unit test
2016-04-08 16:46:44 +02:00
Wolfgang Seeker
1fe911cdb0
bugfix
2016-04-07 18:19:51 +02:00
Matthew Honnibal
872695759d
Merge pull request #306 from wbwseeker/german_noun_chunks
add German noun chunk functionality
2016-04-08 00:54:24 +10:00
Wolfgang Seeker
7195b6742d
add restrictions to L-arc and R-arc to prevent space heads
2016-03-28 10:40:52 +02:00