Matthew Honnibal
6cc79fc244
Fix state array length for split
2018-04-03 15:43:23 +02:00
Matthew Honnibal
83ca2113a2
Constrain Break action to stack depth==1
2018-04-01 13:47:02 +02:00
Matthew Honnibal
5da7945917
Allocate StateC.was_split
2018-04-01 13:44:42 +02:00
Matthew Honnibal
e887b2330e
Rewrite oracle to not use fast-forward. Seems to work?
2018-04-01 10:43:11 +02:00
Matthew Honnibal
e5ad35787c
WIP on adding split-token actions to parser
...
This patch starts getting the StateC object ready to split tokens. The
split function is implemented by pushing indices into the buffer that
indicate an out-of-length token.
Still todo:
* Update the oracles
* Update GoldParseC
* Interpret the parse once it's complete
* Add retokenizer.split() method
2018-03-31 20:05:27 +02:00
Matthew Honnibal
e826b85cf0
Fix state.split() function
2018-03-30 13:25:28 +02:00
Matthew Honnibal
d399843576
WIP on split parsing
2018-03-28 01:44:05 +02:00
Matthew Honnibal
1f7229f40f
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit c9ba3d3c2d , reversing
changes made to 92c26a35d4 .
2018-03-27 19:23:02 +02:00
Matthew Honnibal
a0c7dabb72
Fix bug in 8-token parser features
2017-10-28 23:01:35 +00:00
Matthew Honnibal
dd5b2d8fa3
Check for out-of-memory when calling calloc. Closes #1446
2017-10-24 12:40:47 +02:00
Matthew Honnibal
e938bce320
Adjust parsing transition system to allow preset sentence segments.
2017-10-08 23:53:34 +02:00
Matthew Honnibal
5c750a9c2f
Reserve 0 for 'missing' in history features
2017-10-06 06:10:13 -05:00
Matthew Honnibal
b0618def8d
Add support for 2-token state option
2017-10-05 21:54:12 -05:00
Matthew Honnibal
ee41e4fea7
Support history features in stateclass
2017-10-03 12:43:48 +02:00
Matthew Honnibal
664c5af745
Revert padding in parser
2017-09-14 16:59:25 +02:00
Matthew Honnibal
c6395b057a
Improve parser feature extraction, for missing values
2017-09-14 16:18:02 +02:00
Matthew Honnibal
c307a0ffb8
Restore patches from nn-beam-parser to spacy/syntax
2017-08-18 22:38:59 +02:00
Matthew Honnibal
5f81d700ff
Restore patches from nn-beam-parser to spacy/syntax
2017-08-18 22:23:03 +02:00
Matthew Honnibal
426f84937f
Resolve conflicts when merging new beam parsing stuff
2017-08-18 13:38:32 -05:00
Matthew Honnibal
3533bb61cb
Add option of 8 feature parse state
2017-08-16 18:23:27 -05:00
Matthew Honnibal
52c180ecf5
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit ea8de11ad5 , reversing
changes made to 08e443e083 .
2017-08-14 13:00:23 +02:00
Matthew Honnibal
d4308d2363
Initialize State offset to 0
2017-08-12 17:14:39 -05:00
Matthew Honnibal
53a3824334
Fix mistake in ner feature
2017-05-31 03:01:02 +02:00
Matthew Honnibal
cc911feab2
Fix bug in NER state
2017-05-30 22:12:19 +02:00
Matthew Honnibal
7996d21717
Fixes for new StringStore
2017-05-28 11:09:27 -05:00
Matthew Honnibal
655ca58c16
Clarifying change to StateC.clone
2017-05-27 15:49:37 -05:00
Matthew Honnibal
3d5a536eaa
Improve efficiency of parser batching
2017-05-26 11:31:23 -05:00
Matthew Honnibal
8a9e318deb
Put the parsing loop in a nogil prange block
2017-05-22 17:58:12 -05:00
Matthew Honnibal
a9edb3aa1d
Improve integration of NN parser, to support unified training API
2017-05-15 21:53:27 +02:00
Matthew Honnibal
c79b3129e3
Fix setting of empty lexeme in initial parse state
2017-03-15 09:26:53 -05:00
Matthew Honnibal
d59c6926c1
I think this fixes the segfault
2017-03-11 06:58:34 -06:00
Matthew Honnibal
318b9e32ff
WIP on beam parser. Currently segfaults.
2017-03-11 06:19:52 -06:00
Wolfgang Seeker
dbf8f5f3ec
fix bug in StateC.set_break()
2016-05-05 15:15:34 +02:00
Wolfgang Seeker
289b10f441
remove some comments
2016-04-14 15:37:51 +02:00
Wolfgang Seeker
d99a9cbce9
different handling of space tokens
...
space tokens are now always attached to the previous non-space token
there are two exceptions:
leading space tokens are attached to the first following non-space token
in input that consists exclusively of space tokens, the last space token
is the head of all others.
2016-04-13 15:28:28 +02:00
Matthew Honnibal
1b83cb9dfa
* Fix Issue #251 : Incorrect right edge calculation on left-clobber low in the tree
2016-02-07 00:00:42 +01:00
Matthew Honnibal
4412a70dc5
* Initialize StateC._empty_token to 0, to avoid undefined behaviour.
2016-02-06 13:34:38 +01:00
Matthew Honnibal
e3db39dd21
* Fix compiler warning about signed/unsigned comparison
2016-02-01 09:08:07 +01:00
Matthew Honnibal
a47f00901b
* Pass a StateC pointer into the transition and validation methods in the parser, so that the GIL can be released over a batch of documents
2016-02-01 02:58:14 +01:00
Matthew Honnibal
7a0e3bb9c1
* Continue proxying. Some problem currently
2016-02-01 02:22:21 +01:00
Matthew Honnibal
2fa228458e
* Add _state file, which StateClass will proxy to
2016-02-01 01:09:21 +01:00
Matthew Honnibal
7f9384f53c
* Remove deprecated _state module
2015-06-23 17:28:24 +02:00
Matthew Honnibal
77d7e79c7e
* Fix r/l and distance features.
2015-06-12 13:06:15 +02:00
Matthew Honnibal
dd0867645d
* Remove stray const from State header
2015-06-03 00:10:04 +02:00
Matthew Honnibal
e09a08bd00
* Add copy_state function
2015-06-01 23:06:30 +02:00
Matthew Honnibal
9dfc9c039c
* Work on constituency parsing.
2015-05-20 16:02:51 +02:00
Matthew Honnibal
03a6626545
* Tmp commit
2015-05-12 20:27:56 +02:00
Matthew Honnibal
d2ac8d8007
* Add ctnt field to State, in preparation for constituency parsing
2015-05-12 20:27:56 +02:00
Matthew Honnibal
a4e2af54f9
* Add support for l/r edge to add_dep, and move inlined methods into _state.pyx where possible
2015-05-12 20:26:41 +02:00
Matthew Honnibal
a3af6b7c3d
* Left-Arc from Root, to allow non-monotonic reduce to compete with left-arc when the stack is not empty.
2015-03-27 17:39:16 +01:00