Commit Graph

3436 Commits

Author SHA1 Message Date
Matthew Honnibal
9258db788a Revert "Have the matcher return character offsets, to handle the match better."
This reverts commit 049c937540.
2016-10-17 16:49:51 +02:00
Matthew Honnibal
7d446e5094 Revert "Update matcher test, to reflect character offset return instead of token offset."
This reverts commit f8d3e3bcfe.
2016-10-17 16:49:49 +02:00
Matthew Honnibal
4bf2c53c13 Revert "Hack on matcher tests, for new implementation."
This reverts commit dbe60644ab.
2016-10-17 16:49:48 +02:00
Matthew Honnibal
2fd97c71cc Revert "Don't try to pickle matcher."
This reverts commit 97bd0c9d00.
2016-10-17 16:49:43 +02:00
Matthew Honnibal
97bd0c9d00 Don't try to pickle matcher. 2016-10-17 16:38:40 +02:00
Matthew Honnibal
dbe60644ab Hack on matcher tests, for new implementation. 2016-10-17 16:12:22 +02:00
Matthew Honnibal
f8d3e3bcfe Update matcher test, to reflect character offset return instead of token offset. 2016-10-17 16:00:10 +02:00
Matthew Honnibal
049c937540 Have the matcher return character offsets, to handle the match better. 2016-10-17 15:58:57 +02:00
Matthew Honnibal
9b60186266 Fix doc class 2016-10-17 15:23:47 +02:00
Matthew Honnibal
6cbdc94959 Lots of updates to Matcher, to make entity handling sane. 2016-10-17 15:23:31 +02:00
Matthew Honnibal
7fd98fc91c Remove deprecation shim around str/bytes in Token. 2016-10-17 14:02:47 +02:00
Matthew Honnibal
b67697a97b Improve API for doc.merge() and span.merge(), to use keyword arguments. 2016-10-17 14:02:13 +02:00
Matthew Honnibal
fbb7f3f15c Add user_data attribute to Doc object. 2016-10-17 11:43:22 +02:00
Matthew Honnibal
c1abc8f6ed Fix deprecation stuff in Token: Remove the shim for the str/unicode semantics, and raise for has_repvec and repvec 2016-10-17 11:18:41 +02:00
Matthew Honnibal
4ba9eadf3d Merge branch 'v1.0.0-rc1' of ssh://github.com/explosion/spaCy into v1.0.0-rc1 2016-10-17 02:45:44 +02:00
Matthew Honnibal
09ab447a18 Remove tensor property from token. 2016-10-17 02:45:09 +02:00
Matthew Honnibal
5d10e2005c Defer some attributes to Doc, via getters_for_tokens attribute. 2016-10-17 02:44:49 +02:00
Matthew Honnibal
8829984efb Remove tensor attribute from Span and Token. 2016-10-17 02:44:04 +02:00
Matthew Honnibal
d15a88c66a Defer some attributes to Doc via getters_for_spans 2016-10-17 02:43:35 +02:00
Matthew Honnibal
62230dd13a Add getters_for_spans and getters_for_tokens attributes to Doc. Fix docstring 2016-10-17 02:42:51 +02:00
Matthew Honnibal
ae11ea8240 Add getters_for_tokens and getters_for_spans attributes to Doc object. 2016-10-17 02:42:05 +02:00
Matthew Honnibal
be48a7b4f3 Fix conftest for website tests. 2016-10-17 01:54:26 +02:00
Matthew Honnibal
8951bf6989 Update matcher tests 2016-10-17 01:53:24 +02:00
Matthew Honnibal
0cf4aff470 Set default path in EN/DE tests. 2016-10-17 01:52:49 +02:00
Matthew Honnibal
cd71b6b0a9 Remove test of parser pickle 2016-10-17 01:52:10 +02:00
Matthew Honnibal
5bc101006e Add cfg field to Tagger 2016-10-17 01:03:41 +02:00
Matthew Honnibal
517f090cbf Use GoldParse in tagger.update 2016-10-17 00:55:15 +02:00
Matthew Honnibal
59038f7efa Restore support for prior data format -- specifically, the labels field of the config. 2016-10-17 00:53:26 +02:00
Matthew Honnibal
c36e8676aa Move old examples 2016-10-16 21:56:32 +02:00
Ines Montani
96bbcaf730 Update Gitter links 2016-10-16 21:53:24 +02:00
Ines Montani
7d5d2c81b2 Merge pull request #526 from crawfordcomeaux/badge-alts-and-gitter
Added Gitter badge, alt text for all badges to README.rst
2016-10-16 21:51:27 +02:00
Matthew Honnibal
7887ab3b36 Fix default use of feature_templates in parser 2016-10-16 21:41:56 +02:00
Matthew Honnibal
3fba897e0f Update train_parser example 2016-10-16 21:41:14 +02:00
Matthew Honnibal
f787cd29fe Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor. 2016-10-16 21:34:57 +02:00
Matthew Honnibal
311a985fe0 Add input error handling in Doc 2016-10-16 18:16:42 +02:00
Matthew Honnibal
4e9727b474 Use new words keyword argument in Doc. 2016-10-16 18:16:25 +02:00
Matthew Honnibal
06322ba99d Add words and spaces keyword arguments to Doc. 2016-10-16 18:13:03 +02:00
Matthew Honnibal
2508117553 Make train_parser example a bit simpler. 2016-10-16 17:58:37 +02:00
Matthew Honnibal
ca51f3b77e Use DependencyParser and EntityRecognizer in the Language class. 2016-10-16 17:58:12 +02:00
Matthew Honnibal
4574fe87c6 Add example for training parser 2016-10-16 17:05:55 +02:00
Matthew Honnibal
195d998a12 Fix GoldParse argument to tagger.update 2016-10-16 17:05:09 +02:00
Matthew Honnibal
274a4d4272 Fix queue Python property in StateClass 2016-10-16 17:04:41 +02:00
Matthew Honnibal
e8c8aa08ce Make action_name optional in StepwiseState 2016-10-16 17:04:16 +02:00
Matthew Honnibal
4bb73b1a93 Fix parser labels in pipeline 2016-10-16 17:03:22 +02:00
Matthew Honnibal
01b42c531f Update train_tagger script 2016-10-16 16:10:23 +02:00
Matthew Honnibal
a81c5a7abf Fix name of labels keyword to 'actions'. 2016-10-16 12:00:27 +02:00
Matthew Honnibal
a079677984 Fix omission of O action when creating blank entity recognizer 2016-10-16 11:43:25 +02:00
Matthew Honnibal
5444d38cc6 Update test for biluo tags 2016-10-16 11:42:45 +02:00
Matthew Honnibal
4fc56d4a31 Rename 'labels' to 'actions' in parser options 2016-10-16 11:42:26 +02:00
Matthew Honnibal
8a6b35d266 Delay binding in MakeDoc 2016-10-16 11:41:55 +02:00