Matthew Honnibal
|
761a19113a
|
* Fix /tmp moving thing in download.py
|
2015-04-12 07:04:10 +02:00 |
|
Matthew Honnibal
|
ed1907b4df
|
* Add pragmatic sentence boundary detection tests, from that Ruby gem. Not automatically run, as they can arbitrarily fail based on model changes. Currently 8/15 fail.
|
2015-04-12 04:46:40 +02:00 |
|
Matthew Honnibal
|
0c25001325
|
* Fix specials.json
|
2015-04-12 04:45:41 +02:00 |
|
Matthew Honnibal
|
1629b33082
|
* Fix copying of tokenizer data in init_model
|
2015-04-12 04:45:31 +02:00 |
|
Matthew Honnibal
|
248a2b4b0f
|
* Remove Spans class
|
2015-04-12 04:07:29 +02:00 |
|
Matthew Honnibal
|
1d05e6da00
|
* Add ne_iob and ne_type features to NER
|
2015-04-10 19:07:08 +02:00 |
|
Matthew Honnibal
|
4df8a3d90f
|
* Add ne_iob and ne_type attributes to context vector
|
2015-04-10 05:02:15 +02:00 |
|
Matthew Honnibal
|
8c354c432b
|
* Add ValueError condition to ner_tag reading
|
2015-04-10 04:59:59 +02:00 |
|
Matthew Honnibal
|
435cccf098
|
* Add read_conll03_file function to conll.pyx
|
2015-04-10 04:59:11 +02:00 |
|
Matthew Honnibal
|
99c9ecfc18
|
* Fix bug in prefix, suffix and word shape features in parser and NER
|
2015-04-10 03:53:33 +02:00 |
|
Matthew Honnibal
|
a6ac92f077
|
* Respect the model_dir input parameter to train.py
|
2015-04-08 22:48:26 +02:00 |
|
Matthew Honnibal
|
ed8942a096
|
* Add train function to fabfile
|
2015-04-08 22:47:59 +02:00 |
|
Matthew Honnibal
|
baff0f8ad8
|
* Add docstring explaining script a bit, and add handling of word vectors
|
2015-04-08 08:20:15 +02:00 |
|
Matthew Honnibal
|
c0a3e25b43
|
* Update .gitignore
|
2015-04-08 07:48:04 +02:00 |
|
Matthew Honnibal
|
156b70ed82
|
* Add new script to replace make_lexicon, that does full setup of data
|
2015-04-08 07:46:53 +02:00 |
|
Matthew Honnibal
|
e775e05313
|
* Use merge_mwe=False in evaluation in train.py
|
2015-04-08 00:35:19 +02:00 |
|
Matthew Honnibal
|
cff2b13fef
|
* Fix Issue #44: Broken Token.string attribute for single-word sentences
|
2015-04-07 06:08:25 +02:00 |
|
Matthew Honnibal
|
085574ccc1
|
* Add test for Issue #44
|
2015-04-07 06:05:18 +02:00 |
|
Matthew Honnibal
|
6640386b25
|
* Fix Issue #43: TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.
|
2015-04-07 06:00:57 +02:00 |
|
Matthew Honnibal
|
b64b2bd910
|
* Fix Issue #43: TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.
|
2015-04-07 06:00:30 +02:00 |
|
Matthew Honnibal
|
6674d719a5
|
* Test for Issue #43: TAG attribute not working in array export
|
2015-04-07 05:53:50 +02:00 |
|
Matthew Honnibal
|
f9e510a893
|
* Whitespace
|
2015-04-07 04:53:59 +02:00 |
|
Matthew Honnibal
|
66c7ccf6cc
|
* Fix Spans.orth_
|
2015-04-07 04:53:40 +02:00 |
|
Matthew Honnibal
|
3b5ea3731a
|
* Add tests for Span stuff
|
2015-04-07 04:52:25 +02:00 |
|
Matthew Honnibal
|
c2b9a61ee2
|
* Update merge test
|
2015-04-07 04:51:31 +02:00 |
|
Matthew Honnibal
|
b8d34531c4
|
* Add support for units to English.__init__, by loading and applying regular expressions
|
2015-04-07 04:02:32 +02:00 |
|
Matthew Honnibal
|
0ea5af88b6
|
* Add multi-word expression RegexMatcher
|
2015-04-07 03:45:40 +02:00 |
|
Matthew Honnibal
|
2fee67cfa3
|
* Add regular expressions for English multi-word expressions
|
2015-04-07 03:45:18 +02:00 |
|
Matthew Honnibal
|
5a075ea3fc
|
* Ensure NER moves are available for single-word tokens
|
2015-04-05 22:30:58 +02:00 |
|
Matthew Honnibal
|
a60a366b2c
|
* Support 'punct' dep label in conll.pyx
|
2015-04-05 22:30:19 +02:00 |
|
Matthew Honnibal
|
021c972137
|
* Print parse if verbose in scorer
|
2015-04-05 22:29:30 +02:00 |
|
Matthew Honnibal
|
f26f381b0e
|
* Add simple ner_tag script
|
2015-04-03 17:26:58 +02:00 |
|
Matthew Honnibal
|
bb27979352
|
* Add prepare_vecs script
|
2015-04-02 06:19:39 +02:00 |
|
Matthew Honnibal
|
fbf19049cf
|
* Add ent_type_ property
|
2015-03-31 02:01:29 +02:00 |
|
Matthew Honnibal
|
3f1e17bd3c
|
* Add tests for new merge() method
|
2015-03-30 01:37:57 +02:00 |
|
Matthew Honnibal
|
e70b87efeb
|
* Add merge() method to Tokens, with fairly brittle/hacky implementation, but quite easy to test. Passing minimal tests. Still need to fix left/right deps in C data
|
2015-03-30 01:37:41 +02:00 |
|
Matthew Honnibal
|
557856e84c
|
* Allow regular expressions to specify labels for merged spans
|
2015-03-27 17:40:52 +01:00 |
|
Matthew Honnibal
|
a3af6b7c3d
|
* Left-Arc from Root, to allow non-monotonic reduce to compete with left-arc when the stack is not empty.
|
2015-03-27 17:39:16 +01:00 |
|
Matthew Honnibal
|
db5a43318c
|
* Improve print_state debug printer
|
2015-03-27 17:29:58 +01:00 |
|
Matthew Honnibal
|
1705eccbbe
|
* Remove whitespace
|
2015-03-27 15:22:39 +01:00 |
|
Matthew Honnibal
|
3feb52374c
|
* Break apart a condition, for ease of debug printing
|
2015-03-27 15:21:38 +01:00 |
|
Matthew Honnibal
|
b32f581acb
|
* Fix bug in ArcEager.get_labels
|
2015-03-27 15:21:06 +01:00 |
|
Matthew Honnibal
|
cd054c6c9f
|
* Remove stray print statement
|
2015-03-27 15:20:42 +01:00 |
|
Matthew Honnibal
|
5f2a4ff36d
|
* Fix spans.lemma_
|
2015-03-26 16:45:38 +01:00 |
|
Matthew Honnibal
|
f4cc222ec3
|
* Fix NER scoring
|
2015-03-26 16:45:38 +01:00 |
|
Matthew Honnibal
|
1320bd19db
|
* Move Span class to own file
|
2015-03-26 16:45:38 +01:00 |
|
Matthew Honnibal
|
6f47a667cf
|
* Move Span class to own file
|
2015-03-26 16:45:38 +01:00 |
|
Matthew Honnibal
|
f02c39dfaf
|
* Compare to 'is not None', for more robustness
|
2015-03-26 16:44:48 +01:00 |
|
Matthew Honnibal
|
8f68b864c4
|
* Move Span/Spans to separate files. Currently duplicates lots of Tokens functionality. Should probably be integrated into Tokens
|
2015-03-26 16:44:48 +01:00 |
|
Matthew Honnibal
|
056c672caf
|
* Bug fixes to tokenization, and support for times
|
2015-03-26 16:44:48 +01:00 |
|