227 Commits

Author SHA1 Message Date
Eric Zhao
d61c117081 Lowest common ancestor matrix for spans and docs
Added functionality for spans and docs to get lowest common ancestor
matrix by simply calling: doc.get_lca_matrix() or
doc[:3].get_lca_matrix().
Corresponding unit tests were also added under spacy/tests/doc and
spacy/tests/spans.
Designed to address: https://github.com/explosion/spaCy/issues/969.
2017-09-03 12:22:19 -07:00
Matthew Honnibal
9750a0128c Fix Span.noun_chunks. Closes #1207 2017-07-22 14:14:57 +02:00
Matthew Honnibal
94267ec50f Fix merge conflict in printer 2017-07-22 13:35:15 +02:00
Matthew Honnibal
5916d46ba8 Avoid use of deepcopy in printer 2017-07-22 13:34:01 +02:00
Raphaël Bournhonesque
6381ebfb14 Use yield from syntax 2017-05-18 10:42:35 +02:00
Raphaël Bournhonesque
f37d078d6a Fix issue #1069 with custom hook Doc.sents definition 2017-05-18 09:59:38 +02:00
ines
9003fd25e5 Fix error messages if model is required (resolves #1051)
Rename about.__docs__ to about.__docs_models__.
2017-05-13 13:14:02 +02:00
ines
573f0ba867 Replace deepcopy 2017-05-13 12:34:14 +02:00
ines
bd428c0a70 Set defaults for light and flat kwargs 2017-05-13 12:34:05 +02:00
ines
c5669450a0 Fix formatting 2017-05-13 12:33:57 +02:00
Matthew Honnibal
b2540d2379 Merge Kengz's tree_print patch 2017-05-13 03:18:49 +02:00
Matthew Honnibal
4d98511db7 Make Span hashable. Closes #1019 2017-04-26 19:01:05 +02:00
Matthew Honnibal
6a4221a6de Allow lemma to be set from Python. Re #973 2017-04-16 18:07:53 +02:00
ines
0739ae7b76 Tidy up and fix formatting and imports 2017-04-15 13:05:15 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
Matthew Honnibal
51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Matthew Honnibal
fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00
Matthew Honnibal
9720103428 Improve attribute handling in doc.merge(). Still unsatisfying 2017-03-31 13:59:58 +02:00
Matthew Honnibal
0fefdfcbda Merge pull request #935 from ericzhao28/master
Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862)
2017-03-30 02:51:24 +02:00
Eric Zhao
aafdf6ffb8 Add option to use label karg to determine ent_type in doc.merge 2017-03-28 23:35:03 -07:00
Matthew Honnibal
28bb546939 Merge pull request #883 from ericzhao28/master
Add `lower_` and `upper_` properties to `Span` class
2017-03-16 23:35:47 +01:00
ines
66c1f194f9 Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
Em
9c809efc25 Removed mapStr 2017-03-11 16:23:26 -08:00
Em
426d17167f Added string manipulation for spans 2017-03-10 16:50:02 -08:00
Roman Inflianskas
66e1109b53 Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
Matvey Ezhov
32a22291bc Small Doc.count_by documentation update
Current example doesn't work
2017-01-31 19:18:45 +03:00
Matthew Honnibal
6c665b81df Fix redundant == TAG in from_array conditional 2017-01-31 00:46:21 +11:00
Matthew Honnibal
e7f8e13cf3 Make Token hashable. Fixes #743 2017-01-16 13:27:57 +01:00
Matthew Honnibal
12cd27b821 Amend 8ae8b443f: Handle comparison with None tokens. 2017-01-11 13:03:32 +01:00
Matthew Honnibal
44e2b0100d Support TAG attribute in doc.from_array 2017-01-10 22:47:07 +01:00
Matthew Honnibal
8ae8b443f1 Add richcmp method to Token. Closes #631 2017-01-09 19:30:31 +01:00
kengz
73a38bd4d1 Merge remote-tracking branch 'upstream/master' 2016-12-30 12:19:59 -05:00
kengz
da44183ae1 move parse_tree logic to a new tokens/printers.py file 2016-12-30 12:19:18 -05:00
Matthew Honnibal
404019ad2f Fix issue #672: ent_iob_ was a string, not unicode, due to missing unicode_literals statement. 2016-12-18 22:33:53 +01:00
Matthew Honnibal
f6e356aada Add (and test) Span.sentiment attribute. By default we average token.span, but can override with custom hook. Re Issue #667 2016-12-02 11:05:50 +01:00
Matthew Honnibal
87613edf8f Add set_struct_attr staticmethod to token 2016-11-25 12:41:47 +01:00
Matthew Honnibal
fb69aa648f Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-25 11:35:44 +01:00
Matthew Honnibal
9a03a3f85e Add get_struct_attr staticmethod to Token, to match Lexeme.get_struct_attr. 2016-11-25 11:35:17 +01:00
Pokey Rule
3e3bda142d Add noun_chunks to Span 2016-11-24 10:47:20 +00:00
tiago
b38cfd0ef9 now span.merge returns token like it says on documentation 2016-11-09 14:58:19 +00:00
Matthew Honnibal
1fb09c3dc1 Fix morphology tagger 2016-11-04 19:19:09 +01:00
Matthew Honnibal
293c79c09a Fix #595: Lemmatization was incorrect for base forms, because morphological analyser wasn't adding morphology properly. 2016-11-04 00:29:07 +01:00
Matthew Honnibal
f292f7f0e6 Fix Issue #599, by considering empty documents to be parsed and tagged. Implementation is a bit dodgy. 2016-11-02 23:48:43 +01:00
Matthew Honnibal
05a8b752a2 Fix Issue #600: Missing setters for Token attribute. 2016-11-02 23:28:59 +01:00
Matthew Honnibal
11664b9f20 Fix variable error in token 2016-11-01 13:28:00 +01:00
Matthew Honnibal
8c4d1b46ce Fix variable error in Span 2016-11-01 13:27:44 +01:00
Matthew Honnibal
e7af6b937f Fix syntax error while fixing doc strings 2016-11-01 13:27:32 +01:00
Matthew Honnibal
b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal
4ca31b4d87 Fix clobbering of 'missing' named ent values after assigning ents. 2016-10-26 13:13:56 +02:00