mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-12 18:26:30 +03:00
Merge remote-tracking branch 'origin/master' into develop
commit
9d147e12c4
106
.github/contributors/5hirish.md
vendored
Normal file
|
@ -0,0 +1,106 @@
|
|||
# spaCy contributor agreement
|
||||
|
||||
This spaCy Contributor Agreement (**"SCA"**) is based on the
|
||||
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
|
||||
The SCA applies to any contribution that you make to any product or project
|
||||
managed by us (the **"project"**), and sets out the intellectual property rights
|
||||
you grant to us in the contributed materials. The term **"us"** shall mean
|
||||
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
|
||||
**"you"** shall mean the person or entity identified below.
|
||||
|
||||
If you agree to be bound by these terms, fill in the information requested
|
||||
below and include the filled-in version with your first pull request, under the
|
||||
folder [`.github/contributors/`](/.github/contributors/). The name of the file
|
||||
should be your GitHub username, with the extension `.md`. For example, the user
|
||||
example_user would create the file `.github/contributors/example_user.md`.
|
||||
|
||||
Read this agreement carefully before signing. These terms and conditions
|
||||
constitute a binding legal agreement.
|
||||
|
||||
## Contributor Agreement
|
||||
|
||||
1. The term "contribution" or "contributed materials" means any source code,
|
||||
object code, patch, tool, sample, graphic, specification, manual,
|
||||
documentation, or any other material posted or submitted by you to the project.
|
||||
|
||||
2. With respect to any worldwide copyrights, or copyright applications and
|
||||
registrations, in your contribution:
|
||||
|
||||
* you hereby assign to us joint ownership, and to the extent that such
|
||||
assignment is or becomes invalid, ineffective or unenforceable, you hereby
|
||||
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
|
||||
royalty-free, unrestricted license to exercise all rights under those
|
||||
copyrights. This includes, at our option, the right to sublicense these same
|
||||
rights to third parties through multiple levels of sublicensees or other
|
||||
licensing arrangements;
|
||||
|
||||
* you agree that each of us can do all things in relation to your
|
||||
contribution as if each of us were the sole owners, and if one of us makes
|
||||
a derivative work of your contribution, the one who makes the derivative
|
||||
work (or has it made) will be the sole owner of that derivative work;
|
||||
|
||||
* you agree that you will not assert any moral rights in your contribution
|
||||
against us, our licensees or transferees;
|
||||
|
||||
* you agree that we may register a copyright in your contribution and
|
||||
exercise all ownership rights associated with it; and
|
||||
|
||||
* you agree that neither of us has any duty to consult with, obtain the
|
||||
consent of, pay or render an accounting to the other for any use or
|
||||
distribution of your contribution.
|
||||
|
||||
3. With respect to any patents you own, or that you can license without payment
|
||||
to any third party, you hereby grant to us a perpetual, irrevocable,
|
||||
non-exclusive, worldwide, no-charge, royalty-free license to:
|
||||
|
||||
* make, have made, use, sell, offer to sell, import, and otherwise transfer
|
||||
your contribution in whole or in part, alone or in combination with or
|
||||
included in any product, work or materials arising out of the project to
|
||||
which your contribution was submitted, and
|
||||
|
||||
* at our option, to sublicense these same rights to third parties through
|
||||
multiple levels of sublicensees or other licensing arrangements.
|
||||
|
||||
4. Except as set out above, you keep all right, title, and interest in your
|
||||
contribution. The rights that you grant to us under these terms are effective
|
||||
on the date you first submitted a contribution to us, even if your submission
|
||||
took place before the date you sign these terms.
|
||||
|
||||
5. You covenant, represent, warrant and agree that:
|
||||
|
||||
* Each contribution that you submit is and shall be an original work of
|
||||
authorship and you can legally grant the rights set out in this SCA;
|
||||
|
||||
* to the best of your knowledge, each contribution will not violate any
|
||||
third party's copyrights, trademarks, patents, or other intellectual
|
||||
property rights; and
|
||||
|
||||
* each contribution shall be in compliance with U.S. export control laws and
|
||||
other applicable export and import laws. You agree to notify us if you
|
||||
become aware of any circumstance which would make any of the foregoing
|
||||
representations inaccurate in any respect. We may publicly disclose your
|
||||
participation in the project, including the fact that you have signed the SCA.
|
||||
|
||||
6. This SCA is governed by the laws of the State of California and applicable
|
||||
U.S. Federal law. Any choice of law rules will not apply.
|
||||
|
||||
7. Please place an “x” on one of the applicable statements below. Please do NOT
|
||||
mark both statements:
|
||||
|
||||
* [x] I am signing on behalf of myself as an individual and no other person
|
||||
or entity, including my employer, has or will have rights with respect to my
|
||||
contributions.
|
||||
|
||||
* [ ] I am signing on behalf of my employer or a legal entity and I have the
|
||||
actual authority to contractually bind that entity.
|
||||
|
||||
## Contributor Details
|
||||
|
||||
| Field | Entry |
|
||||
|------------------------------- | ------------------------ |
|
||||
| Name | Shirish Kadam |
|
||||
| Company name (if applicable) | SlicePay |
|
||||
| Title or role (if applicable) | Android Developer |
|
||||
| Date | 2017-11-13 |
|
||||
| GitHub username | 5hirish |
|
||||
| Website (optional) | https://shirishkadam.com |
|
106
.github/contributors/therealronnie.md
vendored
Normal file
|
@ -0,0 +1,106 @@
|
|||
# spaCy contributor agreement
|
||||
|
||||
This spaCy Contributor Agreement (**"SCA"**) is based on the
|
||||
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
|
||||
The SCA applies to any contribution that you make to any product or project
|
||||
managed by us (the **"project"**), and sets out the intellectual property rights
|
||||
you grant to us in the contributed materials. The term **"us"** shall mean
|
||||
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
|
||||
**"you"** shall mean the person or entity identified below.
|
||||
|
||||
If you agree to be bound by these terms, fill in the information requested
|
||||
below and include the filled-in version with your first pull request, under the
|
||||
folder [`.github/contributors/`](/.github/contributors/). The name of the file
|
||||
should be your GitHub username, with the extension `.md`. For example, the user
|
||||
example_user would create the file `.github/contributors/example_user.md`.
|
||||
|
||||
Read this agreement carefully before signing. These terms and conditions
|
||||
constitute a binding legal agreement.
|
||||
|
||||
## Contributor Agreement
|
||||
|
||||
1. The term "contribution" or "contributed materials" means any source code,
|
||||
object code, patch, tool, sample, graphic, specification, manual,
|
||||
documentation, or any other material posted or submitted by you to the project.
|
||||
|
||||
2. With respect to any worldwide copyrights, or copyright applications and
|
||||
registrations, in your contribution:
|
||||
|
||||
* you hereby assign to us joint ownership, and to the extent that such
|
||||
assignment is or becomes invalid, ineffective or unenforceable, you hereby
|
||||
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
|
||||
royalty-free, unrestricted license to exercise all rights under those
|
||||
copyrights. This includes, at our option, the right to sublicense these same
|
||||
rights to third parties through multiple levels of sublicensees or other
|
||||
licensing arrangements;
|
||||
|
||||
* you agree that each of us can do all things in relation to your
|
||||
contribution as if each of us were the sole owners, and if one of us makes
|
||||
a derivative work of your contribution, the one who makes the derivative
|
||||
work (or has it made) will be the sole owner of that derivative work;
|
||||
|
||||
* you agree that you will not assert any moral rights in your contribution
|
||||
against us, our licensees or transferees;
|
||||
|
||||
* you agree that we may register a copyright in your contribution and
|
||||
exercise all ownership rights associated with it; and
|
||||
|
||||
* you agree that neither of us has any duty to consult with, obtain the
|
||||
consent of, pay or render an accounting to the other for any use or
|
||||
distribution of your contribution.
|
||||
|
||||
3. With respect to any patents you own, or that you can license without payment
|
||||
to any third party, you hereby grant to us a perpetual, irrevocable,
|
||||
non-exclusive, worldwide, no-charge, royalty-free license to:
|
||||
|
||||
* make, have made, use, sell, offer to sell, import, and otherwise transfer
|
||||
your contribution in whole or in part, alone or in combination with or
|
||||
included in any product, work or materials arising out of the project to
|
||||
which your contribution was submitted, and
|
||||
|
||||
* at our option, to sublicense these same rights to third parties through
|
||||
multiple levels of sublicensees or other licensing arrangements.
|
||||
|
||||
4. Except as set out above, you keep all right, title, and interest in your
|
||||
contribution. The rights that you grant to us under these terms are effective
|
||||
on the date you first submitted a contribution to us, even if your submission
|
||||
took place before the date you sign these terms.
|
||||
|
||||
5. You covenant, represent, warrant and agree that:
|
||||
|
||||
* Each contribution that you submit is and shall be an original work of
|
||||
authorship and you can legally grant the rights set out in this SCA;
|
||||
|
||||
* to the best of your knowledge, each contribution will not violate any
|
||||
third party's copyrights, trademarks, patents, or other intellectual
|
||||
property rights; and
|
||||
|
||||
* each contribution shall be in compliance with U.S. export control laws and
|
||||
other applicable export and import laws. You agree to notify us if you
|
||||
become aware of any circumstance which would make any of the foregoing
|
||||
representations inaccurate in any respect. We may publicly disclose your
|
||||
participation in the project, including the fact that you have signed the SCA.
|
||||
|
||||
6. This SCA is governed by the laws of the State of California and applicable
|
||||
U.S. Federal law. Any choice of law rules will not apply.
|
||||
|
||||
7. Please place an “x” on one of the applicable statements below. Please do NOT
|
||||
mark both statements:
|
||||
|
||||
* [x] I am signing on behalf of myself as an individual and no other person
|
||||
or entity, including my employer, has or will have rights with respect to my
|
||||
contributions.
|
||||
|
||||
* [ ] I am signing on behalf of my employer or a legal entity and I have the
|
||||
actual authority to contractually bind that entity.
|
||||
|
||||
## Contributor Details
|
||||
|
||||
| Field | Entry |
|
||||
|------------------------------- | -------------------- |
|
||||
| Name | Ronnie Gonzalez |
|
||||
| Company name (if applicable) | |
|
||||
| Title or role (if applicable) | |
|
||||
| Date | 17.04.2018 |
|
||||
| GitHub username | therealronnie |
|
||||
| Website (optional) | |
|
|
@ -108,6 +108,18 @@ def test_doc_api_serialize(en_tokenizer, text):
|
|||
assert [t.text for t in tokens] == [t.text for t in new_tokens]
|
||||
assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
|
||||
|
||||
new_tokens = get_doc(tokens.vocab).from_bytes(
|
||||
tokens.to_bytes(tensor=False), tensor=False)
|
||||
assert tokens.text == new_tokens.text
|
||||
assert [t.text for t in tokens] == [t.text for t in new_tokens]
|
||||
assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
|
||||
|
||||
new_tokens = get_doc(tokens.vocab).from_bytes(
|
||||
tokens.to_bytes(sentiment=False), sentiment=False)
|
||||
assert tokens.text == new_tokens.text
|
||||
assert [t.text for t in tokens] == [t.text for t in new_tokens]
|
||||
assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
|
||||
|
||||
|
||||
def test_doc_api_set_ents(en_tokenizer):
|
||||
text = "I use goggle chrone to surf the web"
|
||||
|
|
|
@ -835,7 +835,10 @@ cdef class Doc:
|
|||
|
||||
cdef attr_t[:, :] attrs
|
||||
cdef int i, start, end, has_space
|
||||
|
||||
if 'sentiment' not in exclude and 'sentiment' in msg:
|
||||
self.sentiment = msg['sentiment']
|
||||
if 'tensor' not in exclude and 'tensor' in msg:
|
||||
self.tensor = msg['tensor']
|
||||
|
||||
start = 0
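The guards in this hunk restore a field only when it is present in the message and not excluded by the caller. A minimal sketch of that pattern in plain Python, assuming a dict stands in for the `Doc` state; the helper names `to_msg` and `from_msg` are hypothetical and are not part of spaCy:

```python
def to_msg(state, exclude=()):
    """Build a message dict from state, leaving out excluded fields."""
    return {key: value for key, value in state.items() if key not in exclude}


def from_msg(state, msg, exclude=()):
    """Copy optional fields from msg into state, mirroring the guards in the hunk."""
    for field in ("sentiment", "tensor"):
        # Restore a field only if the caller did not exclude it
        # and the message actually carries it.
        if field not in exclude and field in msg:
            state[field] = msg[field]
    return state


original = {"sentiment": 0.5, "tensor": [1.0, 2.0]}
msg = to_msg(original, exclude=("tensor",))
restored = from_msg({"sentiment": 0.0, "tensor": []}, msg, exclude=("tensor",))
```

Here `restored` picks up the serialized sentiment but keeps its own empty tensor, which is the behaviour the new `tensor=False` and `sentiment=False` test cases rely on.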
|
||||
|
|
|
@ -86,8 +86,8 @@
|
|||
}
|
||||
],
|
||||
|
||||
"V_CSS": "2.1.2",
|
||||
"V_JS": "2.1.0",
|
||||
"V_CSS": "2.1.3",
|
||||
"V_JS": "2.1.1",
|
||||
"DEFAULT_SYNTAX": "python",
|
||||
"ANALYTICS": "UA-58931649-1",
|
||||
"MAILCHIMP": {
|
||||
|
|
|
@ -260,8 +260,8 @@ mixin code(label, language, prompt, height, icon, wrap)
|
|||
mixin code-exec(label, large)
|
||||
- label = (label || "Editable code example") + " (experimental)"
|
||||
+terminal-wrapper(label, !large)
|
||||
figure.thebelab-wrapper
|
||||
span.thebelab-wrapper__text.u-text-tiny v#{BINDER_VERSION} · Python 3 · via #[+a("https://mybinder.org/").u-hide-link Binder]
|
||||
figure.juniper-wrapper
|
||||
span.juniper-wrapper__text.u-text-tiny v#{BINDER_VERSION} · Python 3 · via #[+a("https://mybinder.org/").u-hide-link Binder]
|
||||
+code(data-executable="true")&attributes(attributes)
|
||||
block
|
||||
|
||||
|
|
|
@ -1,15 +1,10 @@
|
|||
//- 💫 INCLUDES > SCRIPTS
|
||||
|
||||
if IS_PAGE || SECTION == "index"
|
||||
script(type="text/x-thebe-config")
|
||||
| { bootstrap: true, binderOptions: { repo: "#{KERNEL_BINDER}"},
|
||||
| kernelOptions: { name: "#{KERNEL_PYTHON}" }}
|
||||
|
||||
- scripts = ["vendor/prism.min", "vendor/vue.min"]
|
||||
- if (SECTION == "universe") scripts.push("vendor/vue-markdown.min")
|
||||
- if (quickstart) scripts.push("vendor/quickstart.min")
|
||||
- if (IS_PAGE) scripts.push("vendor/in-view.min")
|
||||
- if (IS_PAGE || SECTION == "index") scripts.push("vendor/thebelab.custom.min")
|
||||
- if (IS_PAGE || SECTION == "index") scripts.push("vendor/juniper.min")
|
||||
|
||||
for script in scripts
|
||||
script(src="/assets/js/" + script + ".js")
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
//- Code block
|
||||
|
||||
.c-code-block,
|
||||
.thebelab-cell
|
||||
.juniper-cell
|
||||
background: $color-front
|
||||
color: darken($color-back, 20)
|
||||
padding: 0.75em 0
|
||||
|
@ -32,7 +32,7 @@
|
|||
//- Code block content
|
||||
|
||||
.c-code-block__content,
|
||||
.thebelab-input,
|
||||
.juniper-input,
|
||||
.jp-OutputArea
|
||||
display: block
|
||||
font: normal normal 1.1rem/#{1.9} $font-code
|
||||
|
@ -45,13 +45,21 @@
|
|||
vertical-align: middle
|
||||
opacity: 0.5
|
||||
|
||||
//- Thebelab
|
||||
//- Juniper
|
||||
|
||||
[data-executable]
|
||||
margin-bottom: 0
|
||||
|
||||
.thebelab-input.thebelab-input
|
||||
padding: 3em 2em 1em
|
||||
.juniper-cell
|
||||
border: 0
|
||||
|
||||
.juniper-input
|
||||
padding: 0
|
||||
|
||||
.juniper-output
|
||||
color: inherit
|
||||
background: inherit
|
||||
padding: 0
|
||||
|
||||
.jp-OutputArea
|
||||
&:not(:empty)
|
||||
|
@ -75,13 +83,14 @@
|
|||
font-family: inherit
|
||||
font-weight: bold
|
||||
|
||||
.thebelab-run-button
|
||||
.juniper-button
|
||||
@extend .u-text-label, .u-text-label--dark
|
||||
position: static
|
||||
|
||||
.thebelab-wrapper
|
||||
.juniper-wrapper
|
||||
position: relative
|
||||
|
||||
.thebelab-wrapper__text
|
||||
.juniper-wrapper__text
|
||||
@include position(absolute, top, right, 1.25rem, 1.25rem)
|
||||
color: $color-subtle-dark
|
||||
z-index: 10
|
||||
|
|
|
@ -36,21 +36,19 @@ import initUniverse from './universe.vue.js';
|
|||
/**
|
||||
* Initialise Quickstart
|
||||
*/
|
||||
if (document.querySelector('#qs') && window.Quickstart) {
|
||||
{
|
||||
if (document.querySelector('#qs') && window.Quickstart) {
|
||||
new Quickstart('#qs');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Thebelabs
|
||||
* Initialise Juniper
|
||||
*/
|
||||
if (window.thebelab) {
|
||||
window.thebelab.on('status', (ev, data) => {
|
||||
if (data.status == 'failed') {
|
||||
const msg = "Failed to connect to kernel :( This can happen if too many users are active at the same time. Please reload the page and try again!";
|
||||
const wrapper = `<span style="white-space: pre-wrap">${msg}</span>`;
|
||||
document.querySelector('.jp-OutputArea-output pre').innerHTML = wrapper;
|
||||
{
|
||||
if (window.Juniper) {
|
||||
new Juniper({ repo: 'ines/spacy-binder' });
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
|
|
|
@ -116,7 +116,8 @@ export default function(selector, dataPath) {
|
|||
$_updateUrl(params) {
|
||||
const loc = Object.keys(params)
|
||||
.map(param => `${param}=${encodeURIComponent(params[param])}`);
|
||||
const url = loc.length ? '?' + loc.join('&') : window.location.origin + window.location.pathname;
|
||||
const url = loc.length ? '?' + loc.join('&')
|
||||
: window.location.origin + window.location.pathname;
|
||||
window.history.pushState(params, null, url);
|
||||
}
|
||||
}
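The reformatted `$_updateUrl` builds a query string from the params, or falls back to the bare page URL when there are none. A rough Python equivalent for illustration only, assuming `quote` with the `safe` set below approximates `encodeURIComponent`; the function name `build_url` and its `origin` and `pathname` arguments are hypothetical stand-ins for `window.location`:

```python
from urllib.parse import quote

# Characters encodeURIComponent leaves unescaped, in addition to
# the letters, digits and "_.-~" that quote() never escapes.
SAFE = "!~*'()"


def build_url(params, origin, pathname):
    """Return '?key=value&...' for non-empty params, else origin + pathname."""
    pairs = [key + "=" + quote(str(value), safe=SAFE) for key, value in params.items()]
    return "?" + "&".join(pairs) if pairs else origin + pathname


with_params = build_url({"id": "adam_qas"}, "https://example.com", "/universe")
without_params = build_url({}, "https://example.com", "/universe")
```

As in the JavaScript, only the values are encoded, and an empty params object resets the URL to the page itself.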
|
||||
|
|
9
website/assets/js/vendor/juniper.min.js
vendored
Normal file
File diff suppressed because one or more lines are too long
26
website/assets/js/vendor/thebelab.custom.min.js
vendored
File diff suppressed because one or more lines are too long
|
@ -826,7 +826,31 @@
|
|||
"thumb": "https://i.imgur.com/9MIgMAc.jpg",
|
||||
"author": "Aaron Kramer",
|
||||
"category": ["courses"]
|
||||
},
|
||||
{
|
||||
"id": "adam_qas",
|
||||
"title": "ADAM: Question Answering System",
|
||||
"slogan": "A question answering system that extracts answers from Wikipedia to questions posed in natural language.",
|
||||
"github": "5hirish/adam_qas",
|
||||
"pip": "qas",
|
||||
"code_example": [
|
||||
"git clone https://github.com/5hirish/adam_qas.git",
|
||||
"cd adam_qas",
|
||||
"pip install -r requirements.txt",
|
||||
"python -m qas.adam 'When was linux kernel version 4.0 released ?'"
|
||||
],
|
||||
"code_language": "bash",
|
||||
"thumb": "https://shirishkadam.files.wordpress.com/2018/04/mini_alleviate.png",
|
||||
"author": "Shirish Kadam",
|
||||
"author_links": {
|
||||
"twitter": "5hirish",
|
||||
"github": "5hirish",
|
||||
"website": "https://shirishkadam.com/"
|
||||
},
|
||||
"category": ["standalone"],
|
||||
"tags": [ "question-answering", "elasticsearch"]
|
||||
}
|
||||
|
||||
],
|
||||
"projectCats": {
|
||||
"pipeline": {
|
||||
|
|
|
@ -276,7 +276,7 @@ p
|
|||
nlp = spacy.load('en_core_web_sm')
|
||||
matcher = Matcher(nlp.vocab)
|
||||
# register a new token extension to flag bad HTML
|
||||
Token.set_extension('bad_html', default=False, force=True)
|
||||
Token.set_extension('bad_html', default=False)
|
||||
|
||||
def merge_and_flag(matcher, doc, i, matches):
|
||||
match_id, start, end = matches[i]
|
||||
|
@ -650,7 +650,7 @@ p
|
|||
matcher.add('HASHTAG', None, [{'ORTH': '#'}, {'IS_ASCII': True}])
|
||||
|
||||
# register token extension
|
||||
Token.set_extension('is_hashtag', default=False, force=True)
|
||||
Token.set_extension('is_hashtag', default=False)
|
||||
|
||||
doc = nlp(u"Hello world 😀 #MondayMotivation")
|
||||
matches = matcher(doc)
|
||||
|
|