Commit Graph

11081 Commits

Author SHA1 Message Date
Matthew Honnibal
d13b9373bf Improve initialization for mutually textcat 2019-02-23 12:27:45 +01:00
Matthew Honnibal
5063d999e5 Set architecture in textcat example 2019-02-23 11:57:59 +01:00
Matthew Honnibal
e9dd5943b9 Support exclusive_classes setting for textcat models 2019-02-23 11:57:16 +01:00
Matthew Honnibal
ce1e4eace2 Default to former TextCategorizer model
* Keep TextCategorizer default model same as v2.0
* Add option 'architecture'; setting it to "simple_cnn" switches to a
simpler model.
* Add option exclusive_classes, defaulting to False. If set to True,
the model treats classes as mutually exclusive, i.e. only one class can
be true per instance.
2019-02-23 11:55:16 +01:00
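Note on exclusive_classes: the commit message above does not say how the setting is implemented. A common way to express the difference is to score labels with independent sigmoids when several can be true at once, and with a softmax when exactly one can be true. The sketch below illustrates only that general idea in plain Python; the function names are hypothetical and this is not spaCy's code.

```python
import math


def sigmoid_scores(logits):
    """Independent per-label scores (assumed non-exclusive case).

    Each score lies in (0, 1) and the scores need not sum to 1,
    so several labels can be "on" at the same time.
    """
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def softmax_scores(logits):
    """Competing per-label scores (assumed mutually exclusive case).

    The scores sum to 1, so raising one label's score lowers the others.
    """
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With two equally strong logits, the sigmoid version can score both labels above 0.5, while the softmax version splits the probability mass between them.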
Matthew Honnibal
829c9091a4 Set version to v2.1.0a9.dev0 2019-02-21 17:13:34 +01:00
Matthew Honnibal
d396a69c7b More fixes for issue #3112 2019-02-21 17:12:23 +01:00
Ines Montani
80bdcb99c5 Fix escaping of HTML in displacy ENT (closes #2728) 2019-02-21 14:30:39 +01:00
Ines Montani
250e88ef55 Fix docs example (see #2728) 2019-02-21 14:22:06 +01:00
Ines Montani
ab8392eda3 Merge branch 'develop' into spacy.io 2019-02-21 12:34:51 +01:00
Ines Montani
0fc908d7a5 Add note on merging speed in v2.1 (see #3300) [ci skip] 2019-02-21 12:34:18 +01:00
Ines Montani
236aa94ded Update v2-1.md 2019-02-21 12:33:56 +01:00
Matthew Honnibal
7d529ebdfb Set version to v2.1.0a8 2019-02-21 12:09:34 +01:00
Matthew Honnibal
7cbdcaddf3 Ensure new setuptools before building sdist 2019-02-21 12:08:41 +01:00
Matthew Honnibal
f75be6e7be Set version to v2.1.0a8.dev1 2019-02-21 11:57:06 +01:00
Matthew Honnibal
c5f947f194 Fix regex deprecation warnings 2019-02-21 11:56:47 +01:00
Matthew Honnibal
7f02464494 Set version to v2.1.0a8.dev0 2019-02-21 11:42:23 +01:00
Matthew Honnibal
f31dbec528 More fixes for #3112 2019-02-21 11:10:10 +01:00
Matthew Honnibal
e485241003 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2019-02-21 10:33:35 +01:00
Matthew Honnibal
582be8746c Update multi_processing example 2019-02-21 10:33:16 +01:00
Matthew Honnibal
80195bc2d1 Fix issue #3288 (#3308) 2019-02-21 09:48:53 +01:00
Matthew Honnibal
a137e8b418 Fix Pipe.to_bytes() when model uninitialized
Closes #3289
2019-02-21 09:42:02 +01:00
Matthew Honnibal
6574e4f2d3 Fix issue #3112 part 1 2019-02-21 09:27:38 +01:00
Matthew Honnibal
b21481eeca Load token_match regex with .match, not .search 2019-02-21 09:09:03 +01:00
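Note on .match versus .search: in Python's re module, a compiled pattern's .match only succeeds if the pattern matches at the start of the string, while .search succeeds if it matches anywhere. The sketch below shows the difference with a deliberately simplified, hypothetical URL pattern; it is not the tokenizer's actual token_match expression.

```python
import re

# Hypothetical, simplified pattern standing in for a token_match regex.
URL_PATTERN = re.compile(r"https?://\S+")


def matches_from_start(text):
    """True only if the pattern matches at the beginning of the string."""
    return URL_PATTERN.match(text) is not None


def matches_anywhere(text):
    """True if the pattern matches at any position in the string."""
    return URL_PATTERN.search(text) is not None
```

For a string such as "(https://example.com", matches_anywhere returns True but matches_from_start returns False. Presumably this is why .match is the safer choice for a check meant to decide whether a whole substring should stay a single token.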
Sofie
9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)
* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue #3002 which now works

* partial fix for issue #2070

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue #2656

* Fix issue #2822 with custom Italian exception

* Fix issue #2926 by allowing numbers right before infix /

* remove duplicate

* remove xfail for Issue #2179 fixed by Matt

* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00
Ines Montani
9696cf16c1 Merge branch 'master' into develop 2019-02-20 21:31:27 +01:00
Matthew Honnibal
0d1ca15b13 💫 Fix bugs in matcher extensions. Closes #1971 (#3301)
* Fix matching on extension attrs and predicates

* Fix detection of match_id when using extension attributes. The match
ID is stored as the last entry in the pattern. We were checking for this
with nr_attr == 0, which didn't account for extension attributes.

* Fix handling of predicates. The wrong count was being passed through,
so even patterns that didn't have a predicate were being checked.

* Fix regex pattern

* Fix matcher set value test
2019-02-20 21:30:39 +01:00
Ines Montani
f73d01aa32 Update netlify.toml [ci skip] 2019-02-20 14:33:32 +01:00
Ines Montani
da5edbe434 Tidy up 2019-02-20 14:33:23 +01:00
Michael Liberman
386cec1979 - Json fix in comment (#3294) 2019-02-19 18:01:35 +01:00
Ines Montani
417e86a77f Merge branch 'develop' into spacy.io 2019-02-18 21:50:16 +01:00
Ines Montani
3b667787a9 Add xfailing test for #3289 2019-02-18 16:45:04 +01:00
Ines Montani
57ae71ea95 Add docs on serializing the pipeline (see #3289) [ci skip] 2019-02-18 14:13:29 +01:00
Ines Montani
91f260f2c4 Add another test for #1971 2019-02-18 13:36:20 +01:00
Ines Montani
f30aac324c Update test_issue1971.py 2019-02-18 13:36:15 +01:00
Ines Montani
38e4422c0d Improve matcher example (resolves #3287) 2019-02-18 13:26:37 +01:00
Ines Montani
660cfe44c5 Fix formatting 2019-02-18 13:26:22 +01:00
Ines Montani
8fa26ca97e Fix tensor shape in test for #3288 2019-02-18 11:01:54 +01:00
Ines Montani
c32290557f Add xfailing test for #3288 2019-02-18 10:59:31 +01:00
Ines Montani
c5476bd75b Update languages.json 2019-02-18 10:03:35 +01:00
Ines Montani
3fdcdec6a0 Merge branch 'master' into develop 2019-02-18 10:03:32 +01:00
Roshni Biswas
e09f1347fa updates for Bengali language (#3286)
* Update morph_rules.py

* contributor agreement for roshni-b

* created example sentences
2019-02-18 10:02:28 +01:00
Ines Montani
d47bbbe2a1 Update layout.sass 2019-02-17 22:27:45 +01:00
Ines Montani
212ff359ef Fix links [ci skip] 2019-02-17 22:25:50 +01:00
Ines Montani
04b4df0ec9 Remove n_threads 2019-02-17 22:25:42 +01:00
Ines Montani
4c7ab7620a Update README.md 2019-02-17 22:16:17 +01:00
Ines Montani
8a8523d8c1 Update README.md 2019-02-17 21:59:52 +01:00
Ines Montani
e597110d31 💫 Update website (#3285)
## Description

The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in **straightforward Markdown** without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs. Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do – including writing front-end tests, service workers, offline support, implementing a search and so on.

This PR also includes various new docs pages and content.
Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837.


### Types of change
enhancement

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-02-17 19:31:19 +01:00
Ines Montani
043e8186f3 Merge branch 'master' into develop 2019-02-17 17:51:17 +01:00
Marc Puig
51268e9f21 Typo error fixed (#3284) 2019-02-17 17:51:02 +01:00
Ines Montani
3af0b2dd1c Add xfailing test for #1971 [ci skip] 2019-02-17 13:04:47 +01:00