ines | 9e83513004 | Add position of invalid token to error message | 2018-03-27 23:56:59 +02:00
ines | 693971dd8f | Improve error message if token text is empty string (see #2101) | 2018-03-27 22:25:40 +02:00
ines | 0c829e6605 | Fix whitespace | 2018-03-27 22:20:59 +02:00
Matthew Honnibal | 63a267b34d | Fix #2073: Token.set_extension not working | 2018-03-27 13:36:20 +02:00
Thomas Opsomer | fbf48b3f9f | lemma property to return hash instead of unicode | 2018-03-14 17:03:00 +01:00
4altinok | ca8728035d | added new lex feat to token | 2018-02-11 18:55:48 +01:00
Thomas Opsomer | 515e25910e | fix sent_start in serialization | 2018-01-28 19:50:42 +01:00
Matthew Honnibal | 56164ab688 | Set l_edge and r_edge correctly for non-projective parses. Fixes #1799 | 2018-01-22 20:18:04 +01:00
Matthew Honnibal | ccb51a9f36 | Make .similarity() return 1.0 if all orth attrs match | 2018-01-15 16:29:48 +01:00
Matthew Honnibal | b904d81e9a | Fix rich comparison against None objects. Closes #1757 | 2018-01-15 15:51:25 +01:00
Matthew Honnibal | ab7c45b12d | Fix error message and handling of doc.sents | 2018-01-15 15:21:11 +01:00
Matthew Honnibal | 465a6f6452 | Add missing Span.vocab property. Closes #1633 | 2018-01-14 15:06:30 +01:00
Matthew Honnibal | 0cb090e526 | Fix infinite recursion in token.sent_start. Closes #1640 | 2018-01-14 15:02:15 +01:00
Matthew Honnibal | 5cbe913b6f | Don't raise deprecation warning in property. Closes #1813, #1712 | 2018-01-14 14:55:58 +01:00
Matthew Honnibal | e10e9ad2c5 | Improve efficiency of Doc.to_array | 2017-11-23 12:33:27 +00:00
Matthew Honnibal | fa62427300 | Remove lookup-based lemmatization | 2017-11-23 12:32:22 +00:00
Matthew Honnibal | fb26b2cb12 | Use lookup lemmatizer if lemma unset | 2017-11-23 12:31:58 +00:00
Burton DeWilde | a5c6869b2d | Fix bug where span.orth_ != span.text (see #1612) | 2017-11-20 12:05:43 -06:00
Motoki Wu | a52e195a0a | Fixes Issue #1207 where noun_chunks of Span gives an error. Make sure to reference `self.doc` when getting the noun chunks. Same fix as 9750a0128c | 2017-11-17 17:16:20 -08:00
ines | 1c218397f6 | Ensure path in Doc.to_disk/from_disk (resolves ##1521). Also add Doc serialization tests with both Path and string path options | 2017-11-09 02:29:03 +01:00
Matthew Honnibal | 144a93c2a5 | Back-off to tensor for similarity if no vectors | 2017-11-03 20:56:33 +01:00
Matthew Honnibal | 62ed58935a | Add Doc.extend_tensor() method | 2017-11-03 11:20:31 +01:00
ines | 9659391944 | Update deprecated methods and add warnings | 2017-11-01 16:49:42 +01:00
ines | 705a4e3e4a | Fix formatting | 2017-11-01 16:44:08 +01:00
Matthew Honnibal | 9e0ebee81c | Add Token.is_sent_start property, so can deprecate Token.sent_start | 2017-11-01 13:27:14 +01:00
Matthew Honnibal | 7e7116cdf7 | Fix Doc.to_array when only one string attr provided | 2017-11-01 13:26:43 +01:00
Matthew Honnibal | 301fb2bb60 | Implement Span.n_lefts and Span.n_rights | 2017-11-01 13:25:12 +01:00
Matthew Honnibal | 86eba61fae | Fix token.vector when vectors are missing | 2017-11-01 00:47:35 +01:00
ines | d96e72f656 | Tidy up rest | 2017-10-27 21:07:59 +02:00
ines | d2df81d907 | Fix not implemented Span getters | 2017-10-27 18:09:28 +02:00
ines | 544a407b93 | Tidy up Doc, Token and Span and add missing docs | 2017-10-27 17:07:26 +02:00
ines | 6a0483b7aa | Tidy up and document Doc, Token and Span | 2017-10-27 15:41:45 +02:00
ines | 1a559d4c95 | Remove old, unused file | 2017-10-27 15:34:35 +02:00
ines | ea4a41c8fb | Tidy up util and helpers | 2017-10-27 14:39:09 +02:00
Matthew Honnibal | b66b8f028b | Fix #1375 -- out-of-bounds on token.nbor() | 2017-10-24 12:10:39 +02:00
Matthew Honnibal | ccd2ab1a62 | Merge pull request #1443 from ramananbalakrishnan/develop-get-lca-matrix: Add LCA matrix for spans and docs | 2017-10-24 11:22:46 +02:00
Matthew Honnibal | fdf25d10ba | Merge pull request #1440 from ramananbalakrishnan/develop: Support single value for attribute list in doc.to_array | 2017-10-24 10:23:12 +02:00
ines | a31f048b4d | Fix formatting | 2017-10-23 10:38:06 +02:00
Ramanan Balakrishnan | d2fe56a577 | Add LCA matrix for spans and docs | 2017-10-20 23:58:00 +05:30
Ramanan Balakrishnan | 0726946563 | cleanup to_array implementation using fixes on master | 2017-10-20 17:09:37 +05:30
Ramanan Balakrishnan | b3ab124fc5 | Support strings for attribute list in doc.to_array | 2017-10-20 11:46:57 +05:30
Ramanan Balakrishnan | 7b9b1be44c | Support single value for attribute list in doc.to_array | 2017-10-19 17:00:41 +05:30
Matthew Honnibal | 394633efce | Make doc pickling support hooks | 2017-10-17 19:44:09 +02:00
Matthew Honnibal | cdb0c426d8 | Improve deserialization of user_data, esp. for Underscore | 2017-10-17 19:29:20 +02:00
Matthew Honnibal | 32a8564c79 | Fix doc pickling | 2017-10-17 18:20:24 +02:00
Matthew Honnibal | 92c1eb2d6f | Fix Doc pickling. This also removes need for Binder class | 2017-10-17 16:11:13 +02:00
Matthew Honnibal | a002264fec | Remove caching of Token in Doc, as caused cycle. | 2017-10-16 19:34:21 +02:00
Matthew Honnibal | 59c216196c | Allow weakrefs on Doc objects | 2017-10-16 19:22:11 +02:00
ines | e0ff145a8b | Merge branch 'develop' into feature/dot-underscore | 2017-10-11 11:57:05 +02:00
Matthew Honnibal | 3b527fa52b | Call morphology.assign_untagged when pushing token to Doc | 2017-10-11 03:23:57 +02:00