Matthew Honnibal
3e688e6d4b
Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness.
2016-10-23 17:45:44 +02:00
Matthew Honnibal
59038f7efa
Restore support for prior data format -- specifically, the labels field of the config.
2016-10-17 00:53:26 +02:00
Matthew Honnibal
7887ab3b36
Fix default use of feature_templates in parser
2016-10-16 21:41:56 +02:00
Matthew Honnibal
f787cd29fe
Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor.
2016-10-16 21:34:57 +02:00
Matthew Honnibal
274a4d4272
Fix queue Python property in StateClass
2016-10-16 17:04:41 +02:00
Matthew Honnibal
e8c8aa08ce
Make action_name optional in StepwiseState
2016-10-16 17:04:16 +02:00
Matthew Honnibal
4fc56d4a31
Rename 'labels' to 'actions' in parser options
2016-10-16 11:42:26 +02:00
Matthew Honnibal
3259a63779
Whitespace
2016-10-16 01:47:28 +02:00
Matthew Honnibal
d9ae2d68af
Load features by string-name for backwards compatibility.
2016-10-12 20:15:11 +02:00
Matthew Honnibal
3a03c668c3
Fix message in ParserStateError
2016-10-12 14:44:31 +02:00
Matthew Honnibal
6bf505e865
Fix error on ParserStateError
2016-10-12 14:35:55 +02:00
Matthew Honnibal
ea23b64cc8
Refactor training, with new spacy.train module. Defaults still a little awkward.
2016-10-09 12:24:24 +02:00
Matthew Honnibal
1d70db58aa
Revert "Changes to iterators.pyx for new StringStore scheme"
This reverts commit 4f794b215a.
2016-09-30 20:19:53 +02:00
Matthew Honnibal
9e09b39b9f
Revert "Changes to transition systems for new StringStore scheme"
This reverts commit 0442e0ab1e.
2016-09-30 20:11:49 +02:00
Matthew Honnibal
e3285f6f30
Revert "Fix report of ParserStateError"
This reverts commit 78f19baafa.
2016-09-30 20:11:33 +02:00
Matthew Honnibal
78f19baafa
Fix report of ParserStateError
2016-09-30 19:59:22 +02:00
Matthew Honnibal
0442e0ab1e
Changes to transition systems for new StringStore scheme
2016-09-30 19:58:51 +02:00
Matthew Honnibal
4f794b215a
Changes to iterators.pyx for new StringStore scheme
2016-09-30 19:57:49 +02:00
Matthew Honnibal
4cbf0d3bb6
Handle errors when no valid actions are available, pointing users to the issue tracker.
2016-09-27 19:19:53 +02:00
Matthew Honnibal
430473bd98
Raise errors when no actions are available, re Issue #429
2016-09-27 19:09:37 +02:00
Matthew Honnibal
8e7df3c4ca
Expect the parser data, if parser.load() is called.
2016-09-27 14:02:12 +02:00
Matthew Honnibal
a44763af0e
Fix Issue #469: Incorrectly cased root label in noun chunk iterator
2016-09-27 13:13:01 +02:00
Matthew Honnibal
e07b9665f7
Don't expect parser model
2016-09-26 18:09:33 +02:00
Matthew Honnibal
ee6fa106da
Fix parser features
2016-09-26 17:57:32 +02:00
Matthew Honnibal
e607e4b598
Fix parser loading
2016-09-26 17:51:11 +02:00
Matthew Honnibal
2debc4e0a2
Add .blank() method to Parser. Start housing default dep labels and entity types within the Defaults class.
2016-09-26 11:57:54 +02:00
Matthew Honnibal
fd65cf6cbb
Finish refactoring data loading
2016-09-24 20:26:17 +02:00
Matthew Honnibal
83e364188c
Mostly finished loading refactoring. Design is in place, but doesn't work yet.
2016-09-24 15:42:01 +02:00
Matthew Honnibal
60fdf4d5f1
Remove commented-out debugging code
2016-09-24 01:17:18 +02:00
Matthew Honnibal
070af4af9d
Revert "* Working neural net, but features hacky. Switching to extractor."
This reverts commit 7c2f1a673b.
2016-09-21 12:26:14 +02:00
Matthew Honnibal
7c2f1a673b
* Working neural net, but features hacky. Switching to extractor.
2016-05-26 19:06:10 +02:00
Matthew Honnibal
13fad36e49
* Cosmetic change to english noun chunks iterator -- use enumerate instead of range loop
2016-05-20 10:11:05 +02:00
Wolfgang Seeker
7b78239436
add fix for German noun chunk iterator (issue #365)
2016-05-06 01:41:26 +02:00
Matthew Honnibal
bb94022975
* Fix Issue #365: Error introduced during noun phrase chunking, due to use of corrected PRON/PROPN/etc tags.
2016-05-06 00:21:05 +02:00
Wolfgang Seeker
dbf8f5f3ec
fix bug in StateC.set_break()
2016-05-05 15:15:34 +02:00
Wolfgang Seeker
3c44b5dc1a
call deprojectivization after parsing
2016-05-05 15:10:36 +02:00
Matthew Honnibal
472f576b82
* Deprojectivize German parses
2016-05-05 15:01:10 +02:00
Wolfgang Seeker
e4ea2bea01
fix whitespace
2016-05-04 07:40:38 +02:00
Wolfgang Seeker
5bf2fd1f78
make the code less cryptic
2016-05-03 17:19:05 +02:00
Wolfgang Seeker
a06fca9fdf
German noun chunk iterator now doesn't return tokens more than once
2016-05-03 16:58:59 +02:00
Wolfgang Seeker
7b246c13cb
reformulate noun chunk tests for English
2016-05-03 14:24:35 +02:00
Matthew Honnibal
1f1532142f
* Fix cost calculation on non-monotonic oracle
2016-05-03 00:21:08 +02:00
Matthew Honnibal
508fd1f6dc
* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.
2016-05-02 14:25:10 +02:00
Matthew Honnibal
77609588b6
* Fix assignment of root label to words left as root implicitly, after parsing ends.
2016-04-25 19:41:59 +00:00
Matthew Honnibal
7c2d2deaa7
* Revise transition system so that the Break transition retains sole responsibility for setting sentence boundaries. Re Issue #322
2016-04-25 19:41:59 +00:00
Wolfgang Seeker
12024b0b0a
bugfix: introducing multiple roots now updates original head's properties
adjust tests to rely less on statistical model
2016-04-20 16:42:41 +02:00
Wolfgang Seeker
b98cc3266d
bugfix: iterators now reset properly when called a second time
2016-04-15 17:49:16 +02:00
Wolfgang Seeker
289b10f441
remove some comments
2016-04-14 15:37:51 +02:00
Wolfgang Seeker
d99a9cbce9
different handling of space tokens
space tokens are now always attached to the previous non-space token
there are two exceptions:
- leading space tokens are attached to the first following non-space token
- in input that consists exclusively of space tokens, the last space token is the head of all others
2016-04-13 15:28:28 +02:00
Wolfgang Seeker
d328e0b4a8
Merge branch 'master' into space_head_bug
2016-04-11 12:11:01 +02:00