Commit Graph

800 Commits

Author SHA1 Message Date
Matthew Honnibal
b61b495024 * Start adding parse features to sense_tagger 2015-07-06 08:43:24 +02:00
Matthew Honnibal
cb628ba352 * Add document features to sense_tagger. 2015-07-05 21:05:38 +02:00
Matthew Honnibal
3eff39ff63 * Prevent supersenses from being assigned to CONJ, DET, NUM and PRON words. 2015-07-05 14:20:07 +02:00
Matthew Honnibal
149a901ea7 * Don't use POS tags in supersense dict 2015-07-05 10:50:22 +02:00
Matthew Honnibal
4e0cd8def8 * Remove score_senses method from Scorer 2015-07-05 09:15:17 +02:00
Matthew Honnibal
427ea16b27 * Use tagdict in sense_tagger 2015-07-05 09:12:53 +02:00
Matthew Honnibal
5e0545be5c * Fix 32bit/64bit int problem when setting flags 2015-07-05 09:11:55 +02:00
Matthew Honnibal
00c9acbf42 * Add hacky distribution over supersenses, using a half-assed thing like a stick-breaking process 2015-07-04 16:45:04 +02:00
Matthew Honnibal
893b5fd42c * Hack on sense tagger 2015-07-04 12:26:16 +02:00
Matthew Honnibal
389dcd3fb2 * Fix setting of supersense bits in lexeme.pyx 2015-07-04 12:25:21 +02:00
Matthew Honnibal
fb68df91b8 * Work on sense tagger 2015-07-03 15:25:41 +02:00
Matthew Honnibal
2fbcdd0ea8 * Refactor sense tagger to get rid of intermediary layers 2015-07-03 13:31:11 +02:00
Matthew Honnibal
6735439abf * Fix the way supersenses are loaded from the json file 2015-07-03 13:29:22 +02:00
Matthew Honnibal
b977d60bf4 * Hack in WSD scoring 2015-07-03 09:25:52 +02:00
Matthew Honnibal
68f174b235 * Remove adjectives from supersense list. This seems to be associated with current memory errors 2015-07-03 09:24:45 +02:00
Matthew Honnibal
12dd4f745a * Add validation for argmaxing in _ml.pyx 2015-07-03 09:18:33 +02:00
Matthew Honnibal
5d933eec8e * Use the gold sense labels for training 2015-07-03 05:45:42 +02:00
Matthew Honnibal
4a60b68a24 * Add encode_sense_strs function 2015-07-03 05:45:16 +02:00
Matthew Honnibal
1be5ab200f * Add some of the sensetagger changes 2015-07-03 05:18:15 +02:00
Matthew Honnibal
b7e9c1da85 * Begin writing score_senses method 2015-07-03 05:10:52 +02:00
Matthew Honnibal
8464378a85 * Initialize Lexeme.senses to zero 2015-07-03 05:03:16 +02:00
Matthew Honnibal
e99e15574e * Add sense and sense_ properties to Token objects 2015-07-03 04:59:20 +02:00
Matthew Honnibal
8f068dc6fe * Set scores to 0 before prediction 2015-07-03 04:55:30 +02:00
Matthew Honnibal
2be517ba6d * Read in gold wsd data, as supersenses 2015-07-03 04:47:23 +02:00
Matthew Honnibal
dbcef2b76e * Read in new WSD gold data 2015-07-03 04:43:23 +02:00
Matthew Honnibal
05146a4578 * Add script to read wordnet data for supersense stuff 2015-07-02 08:30:43 +02:00
Matthew Honnibal
2256ba7590 * Integrate sense tagger module 2015-07-02 00:54:46 +02:00
Matthew Honnibal
9c74f82d20 * Add rough sense tagger 2015-07-02 00:54:26 +02:00
Matthew Honnibal
4e830b9d41 * Add N_SENSES in senses.pxd 2015-07-02 00:54:06 +02:00
Matthew Honnibal
041908a272 * Merge neuralnet branch into sense-tagger 2015-07-01 22:38:22 +02:00
Matthew Honnibal
52fd80c6c6 * Add experimental supersense features for parsing, based on lookup into wordnet. 2015-07-01 20:12:44 +02:00
Matthew Honnibal
e6d828a9af * Set up an array POS_SENSES that denotes the set of valid senses for each POS tag. This way, we can do bitwise & between a lexeme's senses and the ones available for its POS tag, to get the allowable senses for the token. 2015-07-01 20:12:13 +02:00
Matthew Honnibal
2b8459d9a8 * Add senses flag to Lexeme 2015-07-01 20:10:41 +02:00
Matthew Honnibal
e23d1582a2 * Add supersense data to Lexeme objects. Add simple has_sense method to check the flag. 2015-07-01 18:50:37 +02:00
Matthew Honnibal
64fafa98be * Add senses.pyx and senses.pxd 2015-07-01 18:49:44 +02:00
Matthew Honnibal
94dab94e5f Merge branch 'master' of https://github.com/honnibal/spaCy 2015-06-30 18:16:26 +02:00
Matthew Honnibal
9af86b0b0b * Fix attrs.pxd 2015-06-30 18:16:30 +02:00
Matthew Honnibal
af9c82f7a6 Merge branch 'master' of https://github.com/honnibal/spaCy 2015-06-30 18:11:37 +02:00
Matthew Honnibal
5d595b5a8c * Inc versions 2015-06-30 18:11:06 +02:00
Matthew Honnibal
d2eeba6667 * Start wiring up color and emotion lexicons. Hopefully we get to use them. 2015-06-30 16:22:23 +02:00
Matthew Honnibal
e20106fdff * Begin reorganizing neuralnet work 2015-06-30 14:26:32 +02:00
Matthew Honnibal
5cd3ed42d4 * Reenable averaging 2015-06-29 16:44:42 +02:00
Matthew Honnibal
894cbef8ba * Wire eta and mu parameters up for neural net 2015-06-29 07:10:33 +02:00
Matthew Honnibal
3bb5876c5a * Inline methods in StateClass 2015-06-29 01:10:14 +02:00
Matthew Honnibal
313a7f87b3 * Inline methods in StateClass 2015-06-29 01:06:28 +02:00
Matthew Honnibal
a02fd3af5d * Check valency in L and R feature methods, to make feature calculation faster 2015-06-29 00:27:56 +02:00
Matthew Honnibal
5d870720bc * Check valency in L and R feature methods, to make feature calculation faster 2015-06-29 00:17:29 +02:00
Matthew Honnibal
f4986d5d3c * Use new Example class 2015-06-28 22:36:03 +02:00
Matthew Honnibal
735f1af91f * Fix neural net stuff 2015-06-28 11:44:58 +02:00
Matthew Honnibal
e7003f1cf3 * Remove hard-coding of vector lengths 2015-06-28 11:37:17 +02:00
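
Note on commit e6d828a9af: its message describes keeping a POS_SENSES array of valid senses per POS tag, so that a bitwise & between a lexeme's senses and the entry for its POS tag yields the senses allowed for a token. A minimal Python sketch of that idea follows. The sense names, POS constants, the value of N_SENSES, and the helpers mask, allowed_senses and has_sense are all hypothetical illustrations, not the identifiers or layout used in spaCy's senses.pxd or lexeme.pyx.

```python
# Hypothetical sense identifiers; the real list is defined in the repository.
N_SENSES = 4
NOUN_ANIMAL, NOUN_FOOD, VERB_MOTION, VERB_CONSUMPTION = range(N_SENSES)

# Hypothetical POS tag identifiers, used as indices into POS_SENSES.
NOUN, VERB = 0, 1


def mask(*senses):
    """Pack sense identifiers into one integer, one bit per sense."""
    bits = 0
    for sense in senses:
        bits |= 1 << sense
    return bits


# For each POS tag, the set of senses that tag may take (assumed values).
POS_SENSES = [
    mask(NOUN_ANIMAL, NOUN_FOOD),          # NOUN
    mask(VERB_MOTION, VERB_CONSUMPTION),   # VERB
]


def allowed_senses(lexeme_senses, pos):
    """Intersect a lexeme's senses with the senses valid for its POS tag."""
    return lexeme_senses & POS_SENSES[pos]


def has_sense(bits, sense):
    """Check whether a single sense bit is set."""
    return bool(bits & (1 << sense))


# A word type that can be a noun (animal, food) or a verb (motion).
duck = mask(NOUN_ANIMAL, NOUN_FOOD, VERB_MOTION)
as_verb = allowed_senses(duck, VERB)
as_noun = allowed_senses(duck, NOUN)
```

Here as_verb keeps only the VERB_MOTION bit and as_noun keeps the two noun bits, so a tagger built this way would only need to score the senses that survive the intersection.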