1419 Commits

Author SHA1 Message Date
Matthew Honnibal
be5affe390 * Fix import of sense tagger 2015-07-06 09:33:58 +02:00
Matthew Honnibal
a916f6a109 * Compile spacy.wsd module 2015-07-06 09:33:41 +02:00
Matthew Honnibal
5ec2ce4dcb * Fix spacy.wsd module 2015-07-06 09:33:26 +02:00
Matthew Honnibal
eb3057d806 * Add updated unsupervised_train script, from the wsd directory 2015-07-06 09:33:00 +02:00
Matthew Honnibal
1d21eebda4 Update gitignore for new wsd module 2015-07-06 09:32:10 +02:00
Matthew Honnibal
300eb44848 * Add corpus.py, with DocsDB class 2015-07-06 09:31:40 +02:00
Matthew Honnibal
2e4cfe5255 * Add script to train the dictionary-supervised supersense tagger 2015-07-06 09:06:22 +02:00
Matthew Honnibal
88a4e53fcb * Begin refactoring sense tagger 2015-07-06 09:01:21 +02:00
Matthew Honnibal
2133c2d299 * Don't expect WSD in gold tuples 2015-07-06 08:45:05 +02:00
Matthew Honnibal
0be251776e * Supply templates as an argument to the parser Config object 2015-07-06 08:44:39 +02:00
Matthew Honnibal
316a0772b2 * Remove WSD from gold.pyx 2015-07-06 08:43:59 +02:00
Matthew Honnibal
b61b495024 * Start adding parse features to sense_tagger 2015-07-06 08:43:24 +02:00
Matthew Honnibal
cb628ba352 * Add document features to sense_tagger. 2015-07-05 21:05:38 +02:00
Matthew Honnibal
8f0fe1a4ea * Note broken sense data in prepare_treebank 2015-07-05 21:04:57 +02:00
Matthew Honnibal
96442d9c3e * Put supersenses.json in the wordnet directory, not in a wsd directory 2015-07-05 21:03:59 +02:00
Matthew Honnibal
3eff39ff63 * Prevent supersenses from being assigned to CONJ, DET, NUM and PRON words. 2015-07-05 14:20:07 +02:00
Matthew Honnibal
9534d336ed * Ensure word senses are loaded, even if not in probabilities file 2015-07-05 11:31:07 +02:00
Matthew Honnibal
149a901ea7 * Don't use POS tags in supersense dict 2015-07-05 10:50:22 +02:00
Matthew Honnibal
4e0cd8def8 * Remove score_senses method from Scorer 2015-07-05 09:15:17 +02:00
Matthew Honnibal
211058f7a6 * Load adverb senses 2015-07-05 09:13:22 +02:00
Matthew Honnibal
427ea16b27 * Use tagdict in sense_tagger 2015-07-05 09:12:53 +02:00
Matthew Honnibal
5e0545be5c * Fix 32bit/64bit int problem when setting flags 2015-07-05 09:11:55 +02:00
Matthew Honnibal
4c6533a019 * Write a supersenses.json file into a wsd directory in init_model 2015-07-04 17:24:32 +02:00
Matthew Honnibal
00c9acbf42 * Add hacky distribution over supersenses, using a half-assed thing like a stick-breaking process 2015-07-04 16:45:04 +02:00
Matthew Honnibal
153758bf65 * Hack on index.rst 2015-07-04 12:26:45 +02:00
Matthew Honnibal
893b5fd42c * Hack on sense tagger 2015-07-04 12:26:16 +02:00
Matthew Honnibal
389dcd3fb2 * Fix setting of supersense bits in lexeme.pyx 2015-07-04 12:25:21 +02:00
Matthew Honnibal
948ea9333a * Fix alignment of supersenses in init_model 2015-07-04 12:24:40 +02:00
Matthew Honnibal
fb68df91b8 * Work on sense tagger 2015-07-03 15:25:41 +02:00
Matthew Honnibal
2fbcdd0ea8 * Refactor sense tagger to get rid of intermediary layers 2015-07-03 13:31:11 +02:00
Matthew Honnibal
6735439abf * Fix the way supersenses are loaded from the json file 2015-07-03 13:29:22 +02:00
Matthew Honnibal
ff1f9fe246 * Fix init_model to read supersenses from wordnet, not pre-computed supersenses file 2015-07-03 13:28:39 +02:00
Matthew Honnibal
b977d60bf4 * Hack in WSD scoring 2015-07-03 09:25:52 +02:00
Matthew Honnibal
68f174b235 * Remove adjectives from supersense list. This seems to be associated with current memory errors 2015-07-03 09:24:45 +02:00
Matthew Honnibal
12dd4f745a * Add validation for argmaxing in _ml.pyx 2015-07-03 09:18:33 +02:00
Matthew Honnibal
5d933eec8e * Use the gold sense labels for training 2015-07-03 05:45:42 +02:00
Matthew Honnibal
4a60b68a24 * Add encode_sense_strs function 2015-07-03 05:45:16 +02:00
Matthew Honnibal
1be5ab200f * Add some of the sensetagger changes 2015-07-03 05:18:15 +02:00
Matthew Honnibal
b7e9c1da85 * Begin writing score_senses method 2015-07-03 05:10:52 +02:00
Matthew Honnibal
8464378a85 * Initialize Lexeme.senses to zero 2015-07-03 05:03:16 +02:00
Matthew Honnibal
e99e15574e * Add sense and sense_ properties to Token objects 2015-07-03 04:59:20 +02:00
Matthew Honnibal
8f068dc6fe * Set scores to 0 before prediction 2015-07-03 04:55:30 +02:00
Matthew Honnibal
2be517ba6d * Read in gold wsd data, as supersenses 2015-07-03 04:47:23 +02:00
Matthew Honnibal
c60cc22390 * Ignore adjective supersenses 2015-07-03 04:46:11 +02:00
Matthew Honnibal
dbcef2b76e * Read in new WSD gold data 2015-07-03 04:43:23 +02:00
Matthew Honnibal
333e414e9f * Hack prepare_treebank script to load wordnet supersenses 2015-07-02 08:31:12 +02:00
Matthew Honnibal
05146a4578 * Add script to read wordnet data for supersense stuff 2015-07-02 08:30:43 +02:00
Matthew Honnibal
2256ba7590 * Integrate sense tagger module 2015-07-02 00:54:46 +02:00
Matthew Honnibal
9c74f82d20 * Add rough sense tagger 2015-07-02 00:54:26 +02:00
Matthew Honnibal
4e830b9d41 * Add N_SENSES in senses.pxd 2015-07-02 00:54:06 +02:00