Commit Graph

32 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Matthew Honnibal | e77940565d | * Add length cap to distance feature | 2015-05-31 05:25:30 +02:00 |
| Matthew Honnibal | fd596351ba | * Fix valency features | 2015-05-31 05:24:33 +02:00 |
| Matthew Honnibal | fb8d50b3d5 | Merge branch 'master' of ssh://github.com/honnibal/spaCy | 2015-04-30 12:45:15 +02:00 |
| Matthew Honnibal | ed8e8c3bd0 | * Whitespace | 2015-04-29 14:22:47 +02:00 |
| Matthew Honnibal | 763ef01575 | * Fix two bugs in feature calculation | 2015-04-28 23:25:09 +02:00 |
| Jordan Suchow | 3a8d9b37a6 | Remove trailing whitespace | 2015-04-19 13:01:38 -07:00 |
| Matthew Honnibal | 9f16848b60 | * Add (N0w, N1w) unigram pair to NER features, prompted by failure to detect 'this weekend' | 2015-04-15 06:01:18 +02:00 |
| Matthew Honnibal | 1d05e6da00 | * Add ne_iob and ne_type features to NER | 2015-04-10 19:07:08 +02:00 |
| Matthew Honnibal | 4df8a3d90f | * Add ne_iob and ne_type attributes to context vector | 2015-04-10 05:02:15 +02:00 |
| Matthew Honnibal | 99c9ecfc18 | * Fix bug in prefix, suffix and word shape features in parser and NER | 2015-04-10 03:53:33 +02:00 |
| Matthew Honnibal | 1320bd19db | * Move Span class to own file | 2015-03-26 16:45:38 +01:00 |
| Matthew Honnibal | b3157927e6 | * Clean up unused feature templates | 2015-03-26 16:44:47 +01:00 |
| Matthew Honnibal | 01c892f583 | * Add comment to fill_context | 2015-03-26 16:44:47 +01:00 |
| Matthew Honnibal | 2741179aff | * Important bug fix: Fill token N2w, which was being unfilled, after a bad edit while writing the NER features. | 2015-03-26 16:44:47 +01:00 |
| Matthew Honnibal | e181c051d5 | * Improve features for NER | 2015-03-26 16:44:44 +01:00 |
| Matthew Honnibal | 8057a95f20 | * NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring. | 2015-03-26 16:44:44 +01:00 |
| Matthew Honnibal | d81b7be6a2 | * Merge train.py | 2015-03-26 16:44:41 +01:00 |
| Matthew Honnibal | 5ed8b2b98f | * Rename sic to orth | 2015-01-23 02:08:25 +11:00 |
| Matthew Honnibal | 6c7e44140b | * Work on word vectors, and other stuff | 2015-01-17 16:21:17 +11:00 |
| Matthew Honnibal | 3f1944d688 | * Make PyPy work | 2015-01-05 17:54:38 +11:00 |
| Matthew Honnibal | aafaf58cbe | * Refactor _ml.Model, and finish implementing HastyModel so far not worthwhile. | 2014-12-31 19:40:59 +11:00 |
| Matthew Honnibal | bb80937544 | * Upd docstrings | 2014-12-27 18:45:16 +11:00 |
| Matthew Honnibal | b8b65903fc | * Tmp | 2014-12-24 17:42:00 +11:00 |
| Matthew Honnibal | bed680c632 | * Remove commented-out features | 2014-12-20 03:47:32 +11:00 |
| Matthew Honnibal | 3d178c03ae | * Prune the features a bit | 2014-12-20 02:46:14 +11:00 |
| Matthew Honnibal | a2f2a48da9 | * Add some extra features | 2014-12-20 01:42:24 +11:00 |
| Matthew Honnibal | 6ab7e40590 | * Add non-monotonic parsing with cost-sensitive update. 92.26 on Y&M set | 2014-12-18 11:33:25 +11:00 |
| Matthew Honnibal | 61142a8eff | * Tweak features | 2014-12-18 09:15:03 +11:00 |
| Matthew Honnibal | 8446ebfbbb | * Work on parser. Up to 92 UAS on YM labels | 2014-12-18 09:05:31 +11:00 |
| Matthew Honnibal | 9d7d97978d | * Work on greedy parser | 2014-12-17 21:09:29 +11:00 |
| Matthew Honnibal | d524dd306a | * Work on greedy parser | 2014-12-17 03:19:43 +11:00 |
| Matthew Honnibal | 95ccea03b2 | * Work on greedy parser | 2014-12-16 22:46:55 +11:00 |