mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-26 17:24:41 +03:00
Merge remote-tracking branch 'origin/master' into develop
commit
9d147e12c4
106
.github/contributors/5hirish.md
vendored
Normal file
@@ -0,0 +1,106 @@
+# spaCy contributor agreement
+
+This spaCy Contributor Agreement (**"SCA"**) is based on the
+[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
+The SCA applies to any contribution that you make to any product or project
+managed by us (the **"project"**), and sets out the intellectual property rights
+you grant to us in the contributed materials. The term **"us"** shall mean
+[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
+**"you"** shall mean the person or entity identified below.
+
+If you agree to be bound by these terms, fill in the information requested
+below and include the filled-in version with your first pull request, under the
+folder [`.github/contributors/`](/.github/contributors/). The name of the file
+should be your GitHub username, with the extension `.md`. For example, the user
+example_user would create the file `.github/contributors/example_user.md`.
+
+Read this agreement carefully before signing. These terms and conditions
+constitute a binding legal agreement.
+
+## Contributor Agreement
+
+1. The term "contribution" or "contributed materials" means any source code,
+object code, patch, tool, sample, graphic, specification, manual,
+documentation, or any other material posted or submitted by you to the project.
+
+2. With respect to any worldwide copyrights, or copyright applications and
+registrations, in your contribution:
+
+    * you hereby assign to us joint ownership, and to the extent that such
+    assignment is or becomes invalid, ineffective or unenforceable, you hereby
+    grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
+    royalty-free, unrestricted license to exercise all rights under those
+    copyrights. This includes, at our option, the right to sublicense these same
+    rights to third parties through multiple levels of sublicensees or other
+    licensing arrangements;
+
+    * you agree that each of us can do all things in relation to your
+    contribution as if each of us were the sole owners, and if one of us makes
+    a derivative work of your contribution, the one who makes the derivative
+    work (or has it made will be the sole owner of that derivative work;
+
+    * you agree that you will not assert any moral rights in your contribution
+    against us, our licensees or transferees;
+
+    * you agree that we may register a copyright in your contribution and
+    exercise all ownership rights associated with it; and
+
+    * you agree that neither of us has any duty to consult with, obtain the
+    consent of, pay or render an accounting to the other for any use or
+    distribution of your contribution.
+
+3. With respect to any patents you own, or that you can license without payment
+to any third party, you hereby grant to us a perpetual, irrevocable,
+non-exclusive, worldwide, no-charge, royalty-free license to:
+
+    * make, have made, use, sell, offer to sell, import, and otherwise transfer
+    your contribution in whole or in part, alone or in combination with or
+    included in any product, work or materials arising out of the project to
+    which your contribution was submitted, and
+
+    * at our option, to sublicense these same rights to third parties through
+    multiple levels of sublicensees or other licensing arrangements.
+
+4. Except as set out above, you keep all right, title, and interest in your
+contribution. The rights that you grant to us under these terms are effective
+on the date you first submitted a contribution to us, even if your submission
+took place before the date you sign these terms.
+
+5. You covenant, represent, warrant and agree that:
+
+    * Each contribution that you submit is and shall be an original work of
+    authorship and you can legally grant the rights set out in this SCA;
+
+    * to the best of your knowledge, each contribution will not violate any
+    third party's copyrights, trademarks, patents, or other intellectual
+    property rights; and
+
+    * each contribution shall be in compliance with U.S. export control laws and
+    other applicable export and import laws. You agree to notify us if you
+    become aware of any circumstance which would make any of the foregoing
+    representations inaccurate in any respect. We may publicly disclose your
+    participation in the project, including the fact that you have signed the SCA.
+
+6. This SCA is governed by the laws of the State of California and applicable
+U.S. Federal law. Any choice of law rules will not apply.
+
+7. Please place an “x” on one of the applicable statement below. Please do NOT
+mark both statements:
+
+    * [x] I am signing on behalf of myself as an individual and no other person
+    or entity, including my employer, has or will have rights with respect to my
+    contributions.
+
+    * [ ] I am signing on behalf of my employer or a legal entity and I have the
+    actual authority to contractually bind that entity.
+
+## Contributor Details
+
+| Field | Entry |
+|------------------------------- | ------------------------ |
+| Name | Shirish Kadam |
+| Company name (if applicable) | SlicePay |
+| Title or role (if applicable) | Android Developer |
+| Date | 2017-11-13 |
+| GitHub username | 5hirish |
+| Website (optional) | https://shirishkadam.com |
106
.github/contributors/therealronnie.md
vendored
Normal file
@@ -0,0 +1,106 @@
+# spaCy contributor agreement
+
+This spaCy Contributor Agreement (**"SCA"**) is based on the
+[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
+The SCA applies to any contribution that you make to any product or project
+managed by us (the **"project"**), and sets out the intellectual property rights
+you grant to us in the contributed materials. The term **"us"** shall mean
+[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
+**"you"** shall mean the person or entity identified below.
+
+If you agree to be bound by these terms, fill in the information requested
+below and include the filled-in version with your first pull request, under the
+folder [`.github/contributors/`](/.github/contributors/). The name of the file
+should be your GitHub username, with the extension `.md`. For example, the user
+example_user would create the file `.github/contributors/example_user.md`.
+
+Read this agreement carefully before signing. These terms and conditions
+constitute a binding legal agreement.
+
+## Contributor Agreement
+
+1. The term "contribution" or "contributed materials" means any source code,
+object code, patch, tool, sample, graphic, specification, manual,
+documentation, or any other material posted or submitted by you to the project.
+
+2. With respect to any worldwide copyrights, or copyright applications and
+registrations, in your contribution:
+
+    * you hereby assign to us joint ownership, and to the extent that such
+    assignment is or becomes invalid, ineffective or unenforceable, you hereby
+    grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
+    royalty-free, unrestricted license to exercise all rights under those
+    copyrights. This includes, at our option, the right to sublicense these same
+    rights to third parties through multiple levels of sublicensees or other
+    licensing arrangements;
+
+    * you agree that each of us can do all things in relation to your
+    contribution as if each of us were the sole owners, and if one of us makes
+    a derivative work of your contribution, the one who makes the derivative
+    work (or has it made will be the sole owner of that derivative work;
+
+    * you agree that you will not assert any moral rights in your contribution
+    against us, our licensees or transferees;
+
+    * you agree that we may register a copyright in your contribution and
+    exercise all ownership rights associated with it; and
+
+    * you agree that neither of us has any duty to consult with, obtain the
+    consent of, pay or render an accounting to the other for any use or
+    distribution of your contribution.
+
+3. With respect to any patents you own, or that you can license without payment
+to any third party, you hereby grant to us a perpetual, irrevocable,
+non-exclusive, worldwide, no-charge, royalty-free license to:
+
+    * make, have made, use, sell, offer to sell, import, and otherwise transfer
+    your contribution in whole or in part, alone or in combination with or
+    included in any product, work or materials arising out of the project to
+    which your contribution was submitted, and
+
+    * at our option, to sublicense these same rights to third parties through
+    multiple levels of sublicensees or other licensing arrangements.
+
+4. Except as set out above, you keep all right, title, and interest in your
+contribution. The rights that you grant to us under these terms are effective
+on the date you first submitted a contribution to us, even if your submission
+took place before the date you sign these terms.
+
+5. You covenant, represent, warrant and agree that:
+
+    * Each contribution that you submit is and shall be an original work of
+    authorship and you can legally grant the rights set out in this SCA;
+
+    * to the best of your knowledge, each contribution will not violate any
+    third party's copyrights, trademarks, patents, or other intellectual
+    property rights; and
+
+    * each contribution shall be in compliance with U.S. export control laws and
+    other applicable export and import laws. You agree to notify us if you
+    become aware of any circumstance which would make any of the foregoing
+    representations inaccurate in any respect. We may publicly disclose your
+    participation in the project, including the fact that you have signed the SCA.
+
+6. This SCA is governed by the laws of the State of California and applicable
+U.S. Federal law. Any choice of law rules will not apply.
+
+7. Please place an “x” on one of the applicable statement below. Please do NOT
+mark both statements:
+
+    * [x] I am signing on behalf of myself as an individual and no other person
+    or entity, including my employer, has or will have rights with respect to my
+    contributions.
+
+    * [ ] I am signing on behalf of my employer or a legal entity and I have the
+    actual authority to contractually bind that entity.
+
+## Contributor Details
+
+| Field | Entry |
+|------------------------------- | -------------------- |
+| Name | Ronnie Gonzalez |
+| Company name (if applicable) | |
+| Title or role (if applicable) | |
+| Date | 17.04.2018 |
+| GitHub username | therealronnie |
+| Website (optional) | |
@@ -108,6 +108,18 @@ def test_doc_api_serialize(en_tokenizer, text):
     assert [t.text for t in tokens] == [t.text for t in new_tokens]
     assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
+
+    new_tokens = get_doc(tokens.vocab).from_bytes(
+        tokens.to_bytes(tensor=False), tensor=False)
+    assert tokens.text == new_tokens.text
+    assert [t.text for t in tokens] == [t.text for t in new_tokens]
+    assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
+
+    new_tokens = get_doc(tokens.vocab).from_bytes(
+        tokens.to_bytes(sentiment=False), sentiment=False)
+    assert tokens.text == new_tokens.text
+    assert [t.text for t in tokens] == [t.text for t in new_tokens]
+    assert [t.orth for t in tokens] == [t.orth for t in new_tokens]
 
 
 def test_doc_api_set_ents(en_tokenizer):
     text = "I use goggle chrone to surf the web"
@@ -835,7 +835,10 @@ cdef class Doc:
 
         cdef attr_t[:, :] attrs
         cdef int i, start, end, has_space
+
+        if 'sentiment' not in exclude and 'sentiment' in msg:
             self.sentiment = msg['sentiment']
+        if 'tensor' not in exclude and 'tensor' in msg:
             self.tensor = msg['tensor']
 
         start = 0
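The guards added in the hunk above restore a field only when the caller has not excluded it and the serialized message actually contains it. A minimal sketch of that pattern in plain Python, assuming a hypothetical `FakeDoc` stand-in and a hypothetical `restore_fields` helper (neither is part of spaCy):

```python
def restore_fields(target, msg, exclude=()):
    """Copy optional fields from a deserialized message onto target.

    A field is restored only if it was not excluded by the caller and is
    actually present in the message; otherwise the target keeps its default.
    """
    for key in ("sentiment", "tensor"):
        if key not in exclude and key in msg:
            setattr(target, key, msg[key])
    return target


class FakeDoc:
    """Hypothetical stand-in for a document object; not spaCy's Doc."""

    def __init__(self):
        self.sentiment = 0.0
        self.tensor = None


# The message carries both fields, but the caller excludes the tensor.
doc = restore_fields(
    FakeDoc(),
    {"sentiment": 0.5, "tensor": [1.0, 2.0]},
    exclude=("tensor",),
)
```

Checking `key in msg` as well as `key not in exclude` matters because bytes written with a field switched off will not contain that key at all, so an unconditional lookup would raise a `KeyError`.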
@@ -86,8 +86,8 @@
         }
     ],
 
-    "V_CSS": "2.1.2",
-    "V_JS": "2.1.0",
+    "V_CSS": "2.1.3",
+    "V_JS": "2.1.1",
     "DEFAULT_SYNTAX": "python",
     "ANALYTICS": "UA-58931649-1",
     "MAILCHIMP": {
@@ -260,8 +260,8 @@ mixin code(label, language, prompt, height, icon, wrap)
 mixin code-exec(label, large)
     - label = (label || "Editable code example") + " (experimental)"
     +terminal-wrapper(label, !large)
-        figure.thebelab-wrapper
-            span.thebelab-wrapper__text.u-text-tiny v#{BINDER_VERSION} · Python 3 · via #[+a("https://mybinder.org/").u-hide-link Binder]
+        figure.juniper-wrapper
+            span.juniper-wrapper__text.u-text-tiny v#{BINDER_VERSION} · Python 3 · via #[+a("https://mybinder.org/").u-hide-link Binder]
             +code(data-executable="true")&attributes(attributes)
                 block
 
@@ -1,15 +1,10 @@
 //- 💫 INCLUDES > SCRIPTS
 
-if IS_PAGE || SECTION == "index"
-    script(type="text/x-thebe-config")
-        | { bootstrap: true, binderOptions: { repo: "#{KERNEL_BINDER}"},
-        | kernelOptions: { name: "#{KERNEL_PYTHON}" }}
-
 - scripts = ["vendor/prism.min", "vendor/vue.min"]
 - if (SECTION == "universe") scripts.push("vendor/vue-markdown.min")
 - if (quickstart) scripts.push("vendor/quickstart.min")
 - if (IS_PAGE) scripts.push("vendor/in-view.min")
-- if (IS_PAGE || SECTION == "index") scripts.push("vendor/thebelab.custom.min")
+- if (IS_PAGE || SECTION == "index") scripts.push("vendor/juniper.min")
 
 for script in scripts
     script(src="/assets/js/" + script + ".js")
@@ -3,7 +3,7 @@
 //- Code block
 
 .c-code-block,
-.thebelab-cell
+.juniper-cell
     background: $color-front
     color: darken($color-back, 20)
     padding: 0.75em 0
@@ -32,7 +32,7 @@
 //- Code block content
 
 .c-code-block__content,
-.thebelab-input,
+.juniper-input,
 .jp-OutputArea
     display: block
     font: normal normal 1.1rem/#{1.9} $font-code
@@ -45,13 +45,21 @@
     vertical-align: middle
     opacity: 0.5
 
-//- Thebelab
+//- Juniper
 
 [data-executable]
     margin-bottom: 0
 
-.thebelab-input.thebelab-input
-    padding: 3em 2em 1em
+.juniper-cell
+    border: 0
+
+.juniper-input
+    padding: 0
+
+.juniper-output
+    color: inherit
+    background: inherit
+    padding: 0
 
 .jp-OutputArea
     &:not(:empty)
@@ -75,13 +83,14 @@
     font-family: inherit
     font-weight: bold
 
-.thebelab-run-button
+.juniper-button
     @extend .u-text-label, .u-text-label--dark
+    position: static
 
-.thebelab-wrapper
+.juniper-wrapper
     position: relative
 
-.thebelab-wrapper__text
+.juniper-wrapper__text
     @include position(absolute, top, right, 1.25rem, 1.25rem)
     color: $color-subtle-dark
     z-index: 10
@@ -36,21 +36,19 @@ import initUniverse from './universe.vue.js';
 /**
  * Initialise Quickstart
  */
-if (document.querySelector('#qs') && window.Quickstart) {
+{
+    if (document.querySelector('#qs') && window.Quickstart) {
         new Quickstart('#qs');
+    }
 }
 
 /**
- * Thebelabs
+ * Initialise Juniper
  */
-if (window.thebelab) {
-    window.thebelab.on('status', (ev, data) => {
-        if (data.status == 'failed') {
-            const msg = "Failed to connect to kernel :( This can happen if too many users are active at the same time. Please reload the page and try again!";
-            const wrapper = `<span style="white-space: pre-wrap">${msg}</span>`;
-            document.querySelector('.jp-OutputArea-output pre').innerHTML = wrapper;
+{
+    if (window.Juniper) {
+        new Juniper({ repo: 'ines/spacy-binder' });
     }
-    });
 }
 
 /**
@@ -116,7 +116,8 @@ export default function(selector, dataPath) {
         $_updateUrl(params) {
             const loc = Object.keys(params)
                 .map(param => `${param}=${encodeURIComponent(params[param])}`);
-            const url = loc.length ? '?' + loc.join('&') : window.location.origin + window.location.pathname;
+            const url = loc.length ? '?' + loc.join('&')
+                : window.location.origin + window.location.pathname;
             window.history.pushState(params, null, url);
         }
     }
9
website/assets/js/vendor/juniper.min.js
vendored
Normal file
File diff suppressed because one or more lines are too long
26
website/assets/js/vendor/thebelab.custom.min.js
vendored
File diff suppressed because one or more lines are too long
@@ -826,7 +826,31 @@
             "thumb": "https://i.imgur.com/9MIgMAc.jpg",
             "author": "Aaron Kramer",
             "category": ["courses"]
+        },
+        {
+            "id": "adam_qas",
+            "title": "ADAM: Question Answering System",
+            "slogan": "A question answering system that extracts answers from Wikipedia to questions posed in natural language.",
+            "github": "5hirish/adam_qas",
+            "pip": "qas",
+            "code_example": [
+                "git clone https://github.com/5hirish/adam_qas.git",
+                "cd adam_qas",
+                "pip install -r requirements.txt",
+                "python -m qas.adam 'When was linux kernel version 4.0 released ?'"
+            ],
+            "code_language": "bash",
+            "thumb": "https://shirishkadam.files.wordpress.com/2018/04/mini_alleviate.png",
+            "author": "Shirish Kadam",
+            "author_links": {
+                "twitter": "5hirish",
+                "github": "5hirish",
+                "website": "https://shirishkadam.com/"
+            },
+            "category": ["standalone"],
+            "tags": [ "question-answering", "elasticsearch"]
         }
+
     ],
     "projectCats": {
         "pipeline": {
@@ -276,7 +276,7 @@ p
         nlp = spacy.load('en_core_web_sm')
         matcher = Matcher(nlp.vocab)
         # register a new token extension to flag bad HTML
-        Token.set_extension('bad_html', default=False, force=True)
+        Token.set_extension('bad_html', default=False)
 
         def merge_and_flag(matcher, doc, i, matches):
             match_id, start, end = matches[i]
@@ -650,7 +650,7 @@ p
         matcher.add('HASHTAG', None, [{'ORTH': '#'}, {'IS_ASCII': True}])
 
         # register token extension
-        Token.set_extension('is_hashtag', default=False, force=True)
+        Token.set_extension('is_hashtag', default=False)
 
         doc = nlp(u"Hello world 😀 #MondayMotivation")
         matches = matcher(doc)