Commit Graph

18 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Matthew Honnibal | 706305ee26 | * Upd tests for new meaning of 'string' | 2015-01-24 07:22:30 +11:00 |
| Matthew Honnibal | 5ed8b2b98f | * Rename sic to orth | 2015-01-23 02:08:25 +11:00 |
| Matthew Honnibal | 802867e96a | * Revise interface to Token. Strings now have attribute names like norm1_ | 2015-01-15 03:51:47 +11:00 |
| Matthew Honnibal | 0aa9860c2d | * Fix string-typing in test_contractions. API is inconsistent, must fix... | 2015-01-05 20:10:03 +11:00 |
| Matthew Honnibal | ee3a71862e | * Fix unicode bugs in tests | 2015-01-05 17:54:54 +11:00 |
| Matthew Honnibal | 166c09832f | * Upd test for Python3 | 2015-01-05 13:15:46 +11:00 |
| Matthew Honnibal | 81d878beb2 | * Upd tests | 2014-12-30 21:34:09 +11:00 |
| Matthew Honnibal | 73f200436f | * Tests passing except for morphology/lemmatization stuff | 2014-12-23 11:40:32 +11:00 |
| Matthew Honnibal | 199025609f | * Upd contractions test | 2014-12-21 20:41:13 +11:00 |
| Matthew Honnibal | 302e09018b | * Work on fixing special-cases, reading them in as JSON objects so that they can specify lemmas | 2014-12-09 14:48:01 +11:00 |
| Matthew Honnibal | 8f2f319c57 | * Add a couple more contractions tests | 2014-12-07 22:08:04 +11:00 |
| Matthew Honnibal | 13909a2e24 | * Rewriting Lexeme serialization. | 2014-10-29 23:19:38 +11:00 |
| Matthew Honnibal | 08ce602243 | * Large refactor, particularly to Python API | 2014-10-24 00:59:17 +11:00 |
| Matthew Honnibal | fd4e61e58b | * Fixed contraction tests. Need to correct problem with the way case stats and tag stats are supposed to work. | 2014-08-27 20:22:33 +02:00 |
| Matthew Honnibal | 9815c7649e | * Refactor around Word objects, adapting tests. Tests passing, except for string views. | 2014-08-23 19:55:06 +02:00 |
| Matthew Honnibal | 01469b0888 | * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. | 2014-08-18 19:14:00 +02:00 |
| Matthew Honnibal | 25849fc926 | * Generalize tokenization rules to capitals | 2014-07-07 05:07:21 +02:00 |
| Matthew Honnibal | e4263a241a | * Tests passing for reorganized version | 2014-07-07 04:23:46 +02:00 |