9566 Commits

Author SHA1 Message Date
Matthew Honnibal
893aa40d73 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-24 16:43:01 +01:00
Matthew Honnibal
5882d82915 Set version to v2.1.0a9.dev2 2019-02-24 16:42:06 +01:00
Matthew Honnibal
0367f864fe Fix handling of added labels. Resolves #3189 2019-02-24 16:41:41 +01:00
Matthew Honnibal
4dc57d9e15 Update train_new_entity_type example 2019-02-24 16:41:03 +01:00
Matthew Honnibal
d74dbde828 Fix order of actions when labels added to parser
When labels were added to the parser or NER, we weren't loading back the
classes in the correct order. Re issue #3189
2019-02-24 16:36:29 +01:00
Matthew Honnibal
7ac0f9626c Update rehearsal example 2019-02-24 16:17:41 +01:00
Ines Montani
6de81ae310 Fix formatting of errors 2019-02-24 15:11:28 +01:00
Ines Montani
d8f69d592f Tidy up retokenizer tests 2019-02-24 14:14:11 +01:00
Ines Montani
723e27cb8c Tidy up tests 2019-02-24 14:11:23 +01:00
Ines Montani
2982f82934 Auto-format 2019-02-24 14:09:15 +01:00
Ines Montani
09bf08b3c3 Update redirects [ci skip] 2019-02-24 13:37:50 +01:00
Ines Montani
dceca3264d Tidy up package.json [ci skip] 2019-02-24 13:37:41 +01:00
Ines Montani
3ef4da3503 Update and auto-format README [ci skip] 2019-02-24 13:12:13 +01:00
Ines Montani
46ec5cdccc Update TextCategorizer docs 2019-02-24 13:11:57 +01:00
Ines Montani
c03cb1cc63 Improve built-in component API docs 2019-02-24 13:11:49 +01:00
Ines Montani
235a0e948e Tidy up CI config 2019-02-24 12:07:33 +01:00
Ines Montani
b570a1e203 Exclude website branch from CI 2019-02-24 11:52:16 +01:00
Ines Montani
383e2e1f12 Update Python versions [ci skip] 2019-02-24 11:49:45 +01:00
Ines Montani
b624cb4b89 Update v2-1.md 2019-02-24 11:49:27 +01:00
Matthew Honnibal
909a9d9932 Set version to v2.1.0a9.dev1 2019-02-23 13:10:42 +01:00
Matthew Honnibal
55bb3cc482 Require thinc 7.0.2 2019-02-23 13:10:09 +01:00
Matthew Honnibal
981cb89194 Fix f-score calculation if zero 2019-02-23 12:45:41 +01:00
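The zero case named in the commit above can be illustrated with a minimal sketch. This is not spaCy's scorer code: the function name `f_score` and its arguments are hypothetical, and the sketch assumes the fix amounts to returning 0.0 when precision and recall are both zero instead of dividing by zero.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper).

    Assumption: when both inputs are zero the score is defined as 0.0,
    which avoids a ZeroDivisionError.
    """
    denominator = precision + recall
    if denominator == 0:
        return 0.0
    return 2 * precision * recall / denominator
```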
Matthew Honnibal
6b0008afc6 Clean up TextCategorizer slightly 2019-02-23 12:28:06 +01:00
Matthew Honnibal
d13b9373bf Improve initialization for mutually textcat 2019-02-23 12:27:45 +01:00
Matthew Honnibal
5063d999e5 Set architecture in textcat example 2019-02-23 11:57:59 +01:00
Matthew Honnibal
e9dd5943b9 Support exclusive_classes setting for textcat models 2019-02-23 11:57:16 +01:00
Matthew Honnibal
ce1e4eace2 Default to former TextCategorizer model
* Keep TextCategorizer default model same as v2.0
* Add option 'architecture' that allows "simple_cnn" to switch to
simpler model.
* Add option exclusive_classes, defaulting to False. If set to True,
the model treats classes as mutually exclusive, i.e. only one class can
be true per instance.
2019-02-23 11:55:16 +01:00
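The difference between mutually exclusive and independent classes described in the commit above can be sketched in plain Python. The commit does not say how the model implements either mode; this sketch assumes the common choice of a softmax for exclusive classes and a per-class sigmoid otherwise. The function names are hypothetical and are not part of spaCy.

```python
import math
from typing import List


def independent_scores(logits: List[float]) -> List[float]:
    """Score each class on its own with a sigmoid; several classes can be high at once."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def exclusive_scores(logits: List[float]) -> List[float]:
    """Score classes jointly with a softmax; the scores sum to 1 across classes."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With exclusive scoring the classes compete for a fixed total of 1, which matches "only one class can be true per instance"; with independent scoring each class is judged separately.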
Matthew Honnibal
829c9091a4 Set version to v2.1.0a9.dev0 2019-02-21 17:13:34 +01:00
Matthew Honnibal
d396a69c7b More fixes for issue #3112 2019-02-21 17:12:23 +01:00
Ines Montani
80bdcb99c5 Fix escaping of HTML in displacy ENT (closes #2728) 2019-02-21 14:30:39 +01:00
Ines Montani
250e88ef55 Fix docs example (see #2728) 2019-02-21 14:22:06 +01:00
Ines Montani
0fc908d7a5 Add note on merging speed in v2.1 (see #3300) [ci skip] 2019-02-21 12:34:18 +01:00
Ines Montani
236aa94ded Update v2-1.md 2019-02-21 12:33:56 +01:00
Matthew Honnibal
7d529ebdfb Set version to v2.1.0a8 2019-02-21 12:09:34 +01:00
Matthew Honnibal
7cbdcaddf3 Ensure new setuptools before building sdist 2019-02-21 12:08:41 +01:00
Matthew Honnibal
f75be6e7be Set version to v2.1.0a8.dev1 2019-02-21 11:57:06 +01:00
Matthew Honnibal
c5f947f194 Fix regex deprecation warnings 2019-02-21 11:56:47 +01:00
Matthew Honnibal
7f02464494 Set version to v2.1.0a8.dev0 2019-02-21 11:42:23 +01:00
Matthew Honnibal
f31dbec528 More fixes for #3112 2019-02-21 11:10:10 +01:00
Matthew Honnibal
e485241003 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-21 10:33:35 +01:00
Matthew Honnibal
582be8746c Update multi_processing example 2019-02-21 10:33:16 +01:00
Matthew Honnibal
80195bc2d1 Fix issue #3288 (#3308) 2019-02-21 09:48:53 +01:00
Matthew Honnibal
a137e8b418 Fix Pipe.to_bytes() when model uninitialized
Closes #3289
2019-02-21 09:42:02 +01:00
Matthew Honnibal
6574e4f2d3 Fix issue #3112 part 1 2019-02-21 09:27:38 +01:00
Matthew Honnibal
b21481eeca Load token_match regex with .match, not .search 2019-02-21 09:09:03 +01:00
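The distinction behind the commit above is standard behaviour of Python's `re` module: `.match` only succeeds if the pattern matches at the start of the string, while `.search` succeeds if it matches anywhere. The pattern and strings below are made up for illustration and are not spaCy's `token_match` expression.

```python
import re

# Illustrative pattern only; not the tokenizer's actual token_match regex.
pattern = re.compile(r"\d+")

text = "abc123"
search_hit = pattern.search(text)  # scans the whole string and finds "123"
match_hit = pattern.match(text)    # anchored at position 0, so no match here

anchored_hit = pattern.match("123abc")  # digits at the start, so this matches
```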
Sofie
9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)
* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* remove duplicate

* remove xfail for Issue #2179 fixed by Matt

* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00
Ines Montani
9696cf16c1 Merge branch 'master' into develop 2019-02-20 21:31:27 +01:00
Matthew Honnibal
0d1ca15b13 💫 Fix bugs in matcher extensions. Closes #1971 (#3301)
* Fix matching on extension attrs and predicates

* Fix detection of match_id when using extension attributes. The match
ID is stored as the last entry in the pattern. We were checking for this
with nr_attr == 0, which didn't account for extension attributes.

* Fix handling of predicates. The wrong count was being passed through,
so even patterns that didn't have a predicate were being checked.

* Fix regex pattern

* Fix matcher set value test
2019-02-20 21:30:39 +01:00
Ines Montani
f73d01aa32 Update netlify.toml [ci skip] 2019-02-20 14:33:32 +01:00
Ines Montani
da5edbe434 Tidy up 2019-02-20 14:33:23 +01:00