Matthew Honnibal | 4d8d490547 | * Exclude empty sentences in prepare_treebank | 2015-05-31 01:12:46 +02:00
Matthew Honnibal | 2d11739f28 | * Change data format of JSON corpus, putting sentences into lists with the paragraph | 2015-05-30 01:25:00 +02:00
Matthew Honnibal | 784e577f45 | * Check NER length matches conll length in prepare_treebank | 2015-05-29 03:54:06 +02:00
Matthew Honnibal | 5eb64eeb11 | * Print json treebank by genre, instead of by large file | 2015-05-28 22:40:01 +02:00
Matthew Honnibal | ef1333cf89 | * Have prepare_treebank read train/dev/test IDs. | 2015-05-27 17:35:05 +02:00
Matthew Honnibal | e140e03516 | * Read in OntoNotes. Doesn't support train/test/dev split yet | 2015-05-27 17:04:29 +02:00
Matthew Honnibal | 32ae2cdabe | * In prepare_treebank, move ner into the token descriptions | 2015-05-26 19:52:39 +02:00
Matthew Honnibal | 61885aee76 | * Work on prepare_treebank script, adding NER to it | 2015-05-26 19:28:29 +02:00
Matthew Honnibal | bfeb29ebd1 | * Tmp commit | 2015-05-24 02:50:14 +02:00
Matthew Honnibal | 983d954ef4 | * Tmp commit, while switch to new format that assumes alignment happens during training | 2015-05-23 17:39:04 +02:00
Matthew Honnibal | e0ef6b6992 | * Fix alignment in prepare_treebank | 2015-05-12 20:27:56 +02:00
Matthew Honnibal | 0ad72a77ce | * Write JSON files, with both dependency and PSG parses | 2015-05-12 20:27:55 +02:00
Matthew Honnibal | 5078a32213 | * Work on script to format training data as a JSON file. | 2015-05-12 20:27:55 +02:00