96 Commits

Author SHA1 Message Date
Matthew Honnibal
793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
ines
9d85cda8e4 Fix models error message and use about.__docs_models__ (see #1051) 2017-05-13 13:05:47 +02:00
ines
564939391a Remove spacy.orth 2017-05-09 01:21:47 +02:00
ines
d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines
561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
Matthew Honnibal
b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal
bea44bd3c4 Fix vector_norm when vector is assigned to Lexeme. 2016-10-23 14:23:56 +02:00
Matthew Honnibal
ed5e178817 Add sentiment property on lexeme object 2016-10-19 20:52:52 +02:00
Matthew Honnibal
e233328d38 Fix Issue #371: Lexeme objects were unhashable. 2016-09-27 13:22:30 +02:00
Matthew Honnibal
17137f5c0c * Fix issue #372: mistake in Lexeme rich comparison 2016-05-12 12:58:57 +02:00
Matthew Honnibal
e31df66d26 * Fix Issue #361: Lexemes didn't have rich comparison. 2016-05-05 01:32:26 +02:00
Wolfgang Seeker
d65ef41d08 make error messages language independent 2016-03-24 11:47:09 +01:00
Wolfgang Seeker
03fb498dbe introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Matthew Honnibal
419edfab50 * Use generic flags for the new attributes until they're added 2016-02-04 15:50:54 +01:00
Matthew Honnibal
11810be33e * Add Python hooks for is_bracket/is_quote/is_left_punct/is_right_punct 2016-02-04 13:04:16 +01:00
Matthew Honnibal
ab5aac5b2f * Add .rank property to Token and Lexeme, for frequency rank 2015-11-08 16:18:25 +01:00
Matthew Honnibal
1e99fcd413 * Rename .repvec to .vector in C API 2015-11-03 23:47:59 +11:00
Matthew Honnibal
f7283a5067 * Fix vectors bugs for OOV words 2015-09-22 02:10:25 +02:00
Matthew Honnibal
44aecba701 * Fix Token.has_vector and Lexeme.has_vector 2015-09-22 01:43:16 +02:00
Matthew Honnibal
596fde8daa * Add has_vector attribute to Token and Lexeme 2015-09-21 19:52:43 +10:00
Matthew Honnibal
f32927efbf * Raise exceptions if attempt to access parse, but data is not installed. This partly but not fully addresses Issue #97. Still need exceptions on the various Token attributes that access the parse tree, e.g. token.head, token.lefts, token.rights, etc. Exceptions should be centralized, too. 2015-09-21 18:35:40 +10:00
Matthew Honnibal
191d593e03 * Fix vectors bug in lexeme 2015-09-15 19:05:11 +10:00
Matthew Honnibal
dd4d64b235 * Support setting of word vectors on Lexeme object. 2015-09-15 14:42:27 +10:00
Matthew Honnibal
193f127f81 * Fix ugly py_check_flag and py_set_flag functions in Lexeme 2015-09-15 13:06:18 +10:00
Matthew Honnibal
9561d88529 * Add is_stop to Python API 2015-09-14 18:25:40 +10:00
Matthew Honnibal
65dc0d1dfb * Extend word vectors support, with .similarity() function, vector_norm property, and rename repvec to vector. Keep repvec name as well for now for backwards compatibility. 2015-09-14 17:49:58 +10:00
Matthew Honnibal
07c09a0e1b * Fix attribute getters and setters in Lexeme 2015-09-09 14:29:22 +02:00
Matthew Honnibal
86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal
d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal
7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal
3acf60df06 * Add missing properties in Lexeme class 2015-08-26 19:16:28 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
cad0cca4e3 * Tmp 2015-08-22 22:04:34 +02:00
Matthew Honnibal
8e4c69ee8c * Add is_oov property, and fix up handling of attributes 2015-07-27 01:50:06 +02:00
Matthew Honnibal
6bb96c122d * Host IS_ flags in attrs.pxd, and add properties for them on Token and Lexeme objects 2015-07-26 16:37:16 +02:00
Matthew Honnibal
3c270fc8ff * Remove has_sense method from Lexeme 2015-07-08 19:28:29 +02:00
Matthew Honnibal
b64c843861 * Remove senses attr 2015-07-08 19:26:24 +02:00
Matthew Honnibal
e23d1582a2 * Add supersense data to Lexeme objects. Add simple has_sense method to check the flag. 2015-07-01 18:50:37 +02:00
Jordan Suchow
3a8d9b37a6 Remove trailing whitespace 2015-04-19 13:01:38 -07:00
Matthew Honnibal
51b618d646 * Add a has_repvec property to Lexeme, and a check function to check flags 2015-02-07 08:42:44 -05:00
Matthew Honnibal
6d1c08dafd * Add docstring to Lexeme 2015-01-24 20:48:34 +11:00
Matthew Honnibal
fda94271af * Rename NORM1 and NORM2 attrs to lower and norm 2015-01-24 06:17:03 +11:00
Matthew Honnibal
5ed8b2b98f * Rename sic to orth 2015-01-23 02:08:25 +11:00
Matthew Honnibal
5e63c606ad * Rename vec to repvec 2015-01-22 02:03:54 +11:00
Matthew Honnibal
6c7e44140b * Work on word vectors, and other stuff 2015-01-17 16:21:17 +11:00
Matthew Honnibal
7d3c40de7d * Tests passing after refactor. API has obvious warts, particularly in Token and Lexeme 2015-01-15 00:33:16 +11:00
Matthew Honnibal
0930892fc1 * Tmp. Working on refactor. Compiles, must hook up lexical feats. 2015-01-14 00:03:48 +11:00
Matthew Honnibal
46da3d74d2 * Tmp. Refactoring, introducing a Lexeme PyObject. 2015-01-12 11:23:44 +11:00