Matthew Honnibal
|
5623242b3e
|
* Adjust NER rules, so that U entries in gazetteer don't become B moves to the model
|
2015-11-12 04:48:23 +11:00 |
|
Matthew Honnibal
|
44fbdc7260
|
* Fix bug in NER transition system that sometimes left no valid moves
|
2015-11-08 16:19:12 +01:00 |
|
Matthew Honnibal
|
e92371bb54
|
* Fix rule that made Last action invalid if there was a preset of O, since if the entity is already open, that ship has sailed.
|
2015-11-08 22:17:51 +11:00 |
|
Matthew Honnibal
|
6f47074214
|
* Make constructor of ParserModel and TaggerModel the same as AveragedPerceptron, for easier pickling.
|
2015-11-07 18:25:17 +11:00 |
|
Matthew Honnibal
|
1cfa20fb17
|
* Fix sentence-final whitespace issue
|
2015-11-07 17:34:46 +11:00 |
|
Matthew Honnibal
|
888c05a7fa
|
* Fix variable naming in StepwiseState, for thinc 4.0
|
2015-11-07 11:02:44 +11:00 |
|
Matthew Honnibal
|
fc2185bfe3
|
* Fix variable naming in StepwiseState, for thinc 4.0
|
2015-11-07 10:48:31 +11:00 |
|
Matthew Honnibal
|
954442a807
|
* Fix variable naming in StepwiseState, for thinc 4.0
|
2015-11-07 10:30:45 +11:00 |
|
Matthew Honnibal
|
af70dc166a
|
* Fix Last restriction, which was supposed to prevent conflicts with presets but was incorrect.
|
2015-11-07 09:52:00 +11:00 |
|
Matthew Honnibal
|
a06e3c8963
|
* Fix bone-headed mistake in StateClass.E
|
2015-11-07 07:35:28 +11:00 |
|
Matthew Honnibal
|
d24b8509e4
|
* Correct screw-ups from the previous commits
|
2015-11-07 06:51:41 +11:00 |
|
Matthew Honnibal
|
5efad178b5
|
* Set ent tag when closing an entity
|
2015-11-07 06:09:25 +11:00 |
|
Matthew Honnibal
|
9285f01d26
|
* Fix broken StateClass.E tracking
|
2015-11-07 06:06:39 +11:00 |
|
Matthew Honnibal
|
19136b0e7d
|
* Add better debug message for illegal move
|
2015-11-07 05:34:37 +11:00 |
|
Matthew Honnibal
|
2733816b7b
|
* Fix whitespace
|
2015-11-07 05:31:06 +11:00 |
|
Matthew Honnibal
|
01ab464383
|
* Prevent Begin and In moves from applying in NER if we're at the last token of a sentence, as this would mean the entity would span over a sentence boundary. Re Issue #169
|
2015-11-07 05:30:44 +11:00 |
|
Matthew Honnibal
|
b65633f270
|
* Fix function that returns nth entity in StateClass. Was only returning the first.
|
2015-11-07 05:29:11 +11:00 |
|
Matthew Honnibal
|
3c162dcac3
|
* Refactor away from the _ml module, to use thinc 4.0. Still some work needs to be done, e.g. to add __reduce__ to the models, more testing, etc.
|
2015-11-07 03:24:30 +11:00 |
|
Matthew Honnibal
|
b9991fbd20
|
* Update to use thinc 3.0
|
2015-11-06 00:25:59 +11:00 |
|
Matthew Honnibal
|
68f479e821
|
* Rename Doc.data to Doc.c
|
2015-11-04 00:15:14 +11:00 |
|
Matthew Honnibal
|
329ae57520
|
* Fix whitespace attachment thing
|
2015-10-13 09:46:38 +02:00 |
|
Matthew Honnibal
|
37919eac82
|
* Fix whitespace attachment in simpler way. Leaves problem with setting left/right children.
|
2015-10-13 18:23:24 +11:00 |
|
Matthew Honnibal
|
c70eb776ae
|
* Fix whitespace attachment, so that left/right children are consistent with head.
|
2015-10-13 15:58:22 +11:00 |
|
Matthew Honnibal
|
20fd36a0f7
|
* Very scrappy, likely buggy first-cut pickle implementation, to work on Issue #125: allow pickle for Apache Spark. The current implementation sends stuff to temp files, and does almost nothing to ensure all modifiable state is actually preserved. The Language() instance is a deep tree of extension objects, and if pickling during training, some of the C-data state is hard to preserve.
|
2015-10-13 13:44:41 +11:00 |
|
Matthew Honnibal
|
9dd2f25c74
|
* Fix Issue #131: Force whitespace characters to attach syntactically to previous token, and ensure they cannot serve as stand-alone 'sentence' units.
|
2015-10-10 15:53:30 +11:00 |
|
Matthew Honnibal
|
8b39feefbe
|
* Add dependency post-process rule to ensure spaces are attached to neighbouring tokens, so that they can't be sentence boundaries
|
2015-10-10 15:32:13 +11:00 |
|
Matthew Honnibal
|
0e24d099a1
|
* Fix L/R edge bug, by ensuring l_edge and r_edge are preset, and fixing the way the edges are updated in del_arc. Bugs keep arising here because the edges are absolute positions, where everything else is relative. I'm also not 100% convinced that del_arc is handled correctly. Do we need to update the parents?
|
2015-09-09 03:40:44 +02:00 |
|
Matthew Honnibal
|
86c888667f
|
* Merge in changes from de branch
|
2015-09-06 19:49:28 +02:00 |
|
Matthew Honnibal
|
5edac11225
|
* Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag.
|
2015-09-06 04:15:00 +02:00 |
|
Matthew Honnibal
|
a3d5e6c0dd
|
* Reform constructor and save/load workflow in parser model
|
2015-08-26 19:19:01 +02:00 |
|
Matthew Honnibal
|
bf38b3b883
|
* Hack on l/r reversal bug
|
2015-08-10 05:58:43 +02:00 |
|
Matthew Honnibal
|
6116413b47
|
* Fix label prediction in StepwiseState
|
2015-08-10 05:05:31 +02:00 |
|
Matthew Honnibal
|
2c9753eff2
|
* Whitespace
|
2015-08-10 00:09:02 +02:00 |
|
Matthew Honnibal
|
9de98f5a6f
|
* Add Parser.stepthrough method, with context manager
|
2015-08-10 00:08:46 +02:00 |
|
Matthew Honnibal
|
fe43f8cf39
|
* Whitespace
|
2015-08-09 02:31:53 +02:00 |
|
Matthew Honnibal
|
9c090945e0
|
* Add Parser.predict method, and clean up Parser.get_state
|
2015-08-09 02:29:58 +02:00 |
|
Matthew Honnibal
|
04fccfb984
|
* Fix get_state for parser prediction
|
2015-08-09 02:11:22 +02:00 |
|
Matthew Honnibal
|
55fde0e240
|
* Fix get_state
|
2015-08-09 01:45:30 +02:00 |
|
Matthew Honnibal
|
f0f4fa9838
|
* Fix Parser.get_state
|
2015-08-09 01:40:13 +02:00 |
|
Matthew Honnibal
|
18331dca89
|
* Add continue_for argument to parser 'partial' function, which is now renamed to get_state
|
2015-08-09 01:31:54 +02:00 |
|
Matthew Honnibal
|
0653288fa5
|
* Fix stateclass.queue
|
2015-08-09 00:39:02 +02:00 |
|
Matthew Honnibal
|
9de218b7ba
|
* Fix Parser.partial function
|
2015-08-08 23:45:18 +02:00 |
|
Matthew Honnibal
|
cc9deae960
|
* Add is_valid method to transition_system
|
2015-08-08 23:36:18 +02:00 |
|
Matthew Honnibal
|
2a46c77324
|
* Whitespace
|
2015-08-08 23:35:59 +02:00 |
|
Matthew Honnibal
|
7bafc789e7
|
* Add stack and queue properties to stateclass, for python access
|
2015-08-08 23:32:42 +02:00 |
|
Matthew Honnibal
|
3af938365f
|
* Add function partial to Parser
|
2015-08-08 23:32:15 +02:00 |
|
Matthew Honnibal
|
76a1f0481a
|
* Whitespace
|
2015-08-08 23:31:54 +02:00 |
|
Matthew Honnibal
|
59c3bf60a6
|
* Ensure entity recognizer doesn't over-write preset types
|
2015-08-06 16:09:08 +02:00 |
|
Matthew Honnibal
|
9c1724ecae
|
* Gazetteer stuff working, now need to wire up to API
|
2015-08-06 00:35:40 +02:00 |
|
Matthew Honnibal
|
a8bbd7312c
|
* Hackishly patch long dependencies problem
|
2015-07-28 00:14:29 +02:00 |
|