Ines Montani
403b9cd58b
Add docs on adding to existing tokenizer rules [ci skip]
2019-02-24 18:35:19 +01:00
Ines Montani
1ea1bc98e7
Document regex utilities [ci skip]
2019-02-24 18:34:10 +01:00
Ines Montani
cd4bc6757b
Update README.md [ci skip]
2019-02-24 17:40:01 +01:00
Matthew Honnibal
1f7c56cd93
Fix parser.add_label()
2019-02-24 16:53:22 +01:00
Matthew Honnibal
893aa40d73
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2019-02-24 16:43:01 +01:00
Matthew Honnibal
5882d82915
Set version to v2.1.0a9.dev2
2019-02-24 16:42:06 +01:00
Matthew Honnibal
0367f864fe
Fix handling of added labels. Resolves #3189
2019-02-24 16:41:41 +01:00
Matthew Honnibal
4dc57d9e15
Update train_new_entity_type example
2019-02-24 16:41:03 +01:00
Matthew Honnibal
d74dbde828
Fix order of actions when labels added to parser
...
When labels were added to the parser or NER, we weren't loading back the
classes in the correct order. Re issue #3189
2019-02-24 16:36:29 +01:00
Matthew Honnibal
7ac0f9626c
Update rehearsal example
2019-02-24 16:17:41 +01:00
Ines Montani
6de81ae310
Fix formatting of errors
2019-02-24 15:11:28 +01:00
Ines Montani
d8f69d592f
Tidy up retokenizer tests
2019-02-24 14:14:11 +01:00
Ines Montani
723e27cb8c
Tidy up tests
2019-02-24 14:11:23 +01:00
Ines Montani
2982f82934
Auto-format
2019-02-24 14:09:15 +01:00
Ines Montani
09bf08b3c3
Update redirects [ci skip]
2019-02-24 13:37:50 +01:00
Ines Montani
dceca3264d
Tidy up package.json [ci skip]
2019-02-24 13:37:41 +01:00
Ines Montani
3ef4da3503
Update and auto-format README [ci skip]
2019-02-24 13:12:13 +01:00
Ines Montani
46ec5cdccc
Update TextCategorizer docs
2019-02-24 13:11:57 +01:00
Ines Montani
c03cb1cc63
Improve built-in component API docs
2019-02-24 13:11:49 +01:00
Ines Montani
235a0e948e
Tidy up CI config
2019-02-24 12:07:33 +01:00
Ines Montani
b570a1e203
Exclude website branch from CI
2019-02-24 11:52:16 +01:00
Ines Montani
383e2e1f12
Update Python versions [ci skip]
2019-02-24 11:49:45 +01:00
Ines Montani
b624cb4b89
Update v2-1.md
2019-02-24 11:49:27 +01:00
Matthew Honnibal
909a9d9932
Set version to v2.1.0a9.dev1
2019-02-23 13:10:42 +01:00
Matthew Honnibal
55bb3cc482
Require thinc 7.0.2
2019-02-23 13:10:09 +01:00
Matthew Honnibal
981cb89194
Fix f-score calculation if zero
2019-02-23 12:45:41 +01:00
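The log does not show what the f-score fix changed. As a rough sketch of the kind of zero case such a fix usually guards against, assuming the standard F1 formula and a hypothetical helper named `f_score` (not taken from the spaCy source):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1).

    Hypothetical helper for illustration only; not spaCy's scorer.
    """
    # When both values are zero the denominator is zero, so return 0.0
    # instead of raising ZeroDivisionError.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


zero_case = f_score(0.0, 0.0)
balanced_case = f_score(0.5, 0.5)
```

Returning 0.0 for the degenerate case is one common convention; the actual commit may handle it differently.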
Matthew Honnibal
6b0008afc6
Clean up TextCategorizer slightly
2019-02-23 12:28:06 +01:00
Matthew Honnibal
d13b9373bf
Improve initialization for mutually exclusive textcat
2019-02-23 12:27:45 +01:00
Matthew Honnibal
5063d999e5
Set architecture in textcat example
2019-02-23 11:57:59 +01:00
Matthew Honnibal
e9dd5943b9
Support exclusive_classes setting for textcat models
2019-02-23 11:57:16 +01:00
Matthew Honnibal
ce1e4eace2
Default to former TextCategorizer model
...
* Keep the TextCategorizer default model the same as in v2.0
* Add option 'architecture'; setting it to "simple_cnn" switches to a
  simpler model.
* Add option exclusive_classes, defaulting to False. If set to True,
  the model treats classes as mutually exclusive, i.e. only one class can
  be true per instance.
2019-02-23 11:55:16 +01:00
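The difference between mutually exclusive and independent classes described in the commit above can be illustrated with a generic sketch. This is not spaCy's implementation. It assumes the usual approach of a softmax over classes for the exclusive case and a per-class sigmoid otherwise, and the function names and logits are made up for the example.

```python
import math


def sigmoid_scores(logits):
    # Independent classes: each score is computed on its own,
    # so several classes can be "true" at once.
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def softmax_scores(logits):
    # Mutually exclusive classes: scores compete and sum to 1,
    # so only one class can dominate per instance.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, -1.0]  # made-up raw scores for three labels
independent = sigmoid_scores(logits)
exclusive = softmax_scores(logits)
```

With the sigmoid, the first two labels both score above 0.5. With the softmax, the scores form a single distribution over the labels.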
Matthew Honnibal
829c9091a4
Set version to v2.1.0a9.dev0
2019-02-21 17:13:34 +01:00
Matthew Honnibal
d396a69c7b
More fixes for issue #3112
2019-02-21 17:12:23 +01:00
Ines Montani
80bdcb99c5
Fix escaping of HTML in displacy ENT (closes #2728)
2019-02-21 14:30:39 +01:00
Ines Montani
250e88ef55
Fix docs example (see #2728)
2019-02-21 14:22:06 +01:00
Ines Montani
0fc908d7a5
Add note on merging speed in v2.1 (see #3300) [ci skip]
2019-02-21 12:34:18 +01:00
Ines Montani
236aa94ded
Update v2-1.md
2019-02-21 12:33:56 +01:00
Matthew Honnibal
7d529ebdfb
Set version to v2.1.0a8
2019-02-21 12:09:34 +01:00
Matthew Honnibal
7cbdcaddf3
Ensure new setuptools before building sdist
2019-02-21 12:08:41 +01:00
Matthew Honnibal
f75be6e7be
Set version to v2.1.0a8.dev1
2019-02-21 11:57:06 +01:00
Matthew Honnibal
c5f947f194
Fix regex deprecation warnings
2019-02-21 11:56:47 +01:00
Matthew Honnibal
7f02464494
Set version to v2.1.0a8.dev0
2019-02-21 11:42:23 +01:00
Matthew Honnibal
f31dbec528
More fixes for #3112
2019-02-21 11:10:10 +01:00
Matthew Honnibal
e485241003
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2019-02-21 10:33:35 +01:00
Matthew Honnibal
582be8746c
Update multi_processing example
2019-02-21 10:33:16 +01:00
Matthew Honnibal
80195bc2d1
Fix issue #3288 (#3308)
2019-02-21 09:48:53 +01:00
Matthew Honnibal
a137e8b418
Fix Pipe.to_bytes() when model uninitialized
...
Closes #3289
2019-02-21 09:42:02 +01:00
Matthew Honnibal
6574e4f2d3
Fix issue #3112 part 1
2019-02-21 09:27:38 +01:00
Matthew Honnibal
b21481eeca
Load token_match regex with .match, not .search
2019-02-21 09:09:03 +01:00
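The distinction behind the `.match` versus `.search` change can be seen with Python's standard `re` module. The pattern below is a hypothetical URL-like regex chosen for illustration. It is not spaCy's actual `token_match` expression.

```python
import re

# Hypothetical pattern, for illustration only.
token_match = re.compile(r"https?://\S+")

embedded = "see https://example.com"
standalone = "https://example.com"

# .search scans the whole string and succeeds if the pattern occurs anywhere.
found_anywhere = token_match.search(embedded) is not None

# .match only succeeds if the pattern matches at the start of the string.
matches_embedded = token_match.match(embedded) is not None
matches_standalone = token_match.match(standalone) is not None
```

Anchoring at the start means a string is only treated as a match when it begins with the pattern, rather than merely containing it.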
Sofie
9a478b6db8
Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)
...
* splitting up latin unicode interval
* removing hyphen as infix for French
* adding failing test for issue 1235
* test for issue #3002 which now works
* partial fix for issue #2070
* keep the hyphen as infix for French (as it was)
* restore french expressions with hyphen as infix (as it was)
* added succeeding unit test for Issue #2656
* Fix issue #2822 with custom Italian exception
* Fix issue #2926 by allowing numbers right before infix /
* remove duplicate
* remove xfail for Issue #2179 fixed by Matt
* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00