diff --git a/.github/contributors/alldefector.md b/.github/contributors/alldefector.md new file mode 100644 index 000000000..a32a6dede --- /dev/null +++ b/.github/contributors/alldefector.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. 
This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. 
You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [x] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. + +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Feng Niu | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | Feb 21, 2018 | +| GitHub username | alldefector | +| Website (optional) | | diff --git a/.github/contributors/calumcalder.md b/.github/contributors/calumcalder.md new file mode 100644 index 000000000..f2c4442af --- /dev/null +++ b/.github/contributors/calumcalder.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). 
+The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. 
This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. 
You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. + +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Calum Calder | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 22 March 2018 | +| GitHub username | calumcalder | +| Website (optional) | | diff --git a/.github/contributors/doug-descombaz.md b/.github/contributors/doug-descombaz.md new file mode 100644 index 000000000..210bb7296 --- /dev/null +++ b/.github/contributors/doug-descombaz.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). 
+The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. 
This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. 
You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Doug DesCombaz | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 2018-03-15 | +| GitHub username | doug-descombaz | +| Website (optional) | https://medium.com/@doug.descombaz | diff --git a/.github/contributors/howl-anderson.md b/.github/contributors/howl-anderson.md new file mode 100644 index 000000000..902d35426 --- /dev/null +++ b/.github/contributors/howl-anderson.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. 
Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Xiaoquan Kong | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 2018-03-23 | +| GitHub username | howl-anderson | +| Website (optional) | | diff --git a/.github/contributors/iann0036.md b/.github/contributors/iann0036.md new file mode 100644 index 000000000..969c9ae85 --- /dev/null +++ b/.github/contributors/iann0036.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. 
Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [x] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Ian Mckay | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 22/03/2018 | +| GitHub username | iann0036 | +| Website (optional) | | diff --git a/.github/contributors/justindujardin.md b/.github/contributors/justindujardin.md new file mode 100644 index 000000000..35d403acb --- /dev/null +++ b/.github/contributors/justindujardin.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. 
Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Justin DuJardin | +| Company name (if applicable) | DuJardin Consulting, LLC | +| Title or role (if applicable) | | +| Date | 2018-03-23 | +| GitHub username | justindujardin | +| Website (optional) | https://justindujardin.com | diff --git a/.github/contributors/ottosulin.md b/.github/contributors/ottosulin.md new file mode 100644 index 000000000..83975b74c --- /dev/null +++ b/.github/contributors/ottosulin.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. 
Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [ X ] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Otto Sulin | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 23/03/2018 | +| GitHub username | ottosulin | +| Website (optional) | | diff --git a/.github/contributors/willismonroe.md b/.github/contributors/willismonroe.md new file mode 100644 index 000000000..3a6f1054a --- /dev/null +++ b/.github/contributors/willismonroe.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. 
Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [x] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Willis Monroe | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 2018-3-5 | +| GitHub username | willismonroe | +| Website (optional) | | diff --git a/examples/vectors_tensorboard.py b/examples/vectors_tensorboard.py new file mode 100644 index 000000000..f29193345 --- /dev/null +++ b/examples/vectors_tensorboard.py @@ -0,0 +1,82 @@ +#!/usr/bin/env python +# coding: utf8 +"""Visualize spaCy word vectors in Tensorboard. + +Adapted from: https://gist.github.com/BrikerMan/7bd4e4bd0a00ac9076986148afc06507 +""" +from __future__ import unicode_literals + +from os import path + +import math +import numpy +import plac +import spacy +import tensorflow as tf +import tqdm +from tensorflow.contrib.tensorboard.plugins.projector import visualize_embeddings, ProjectorConfig + + +@plac.annotations( + vectors_loc=("Path to spaCy model that contains vectors", "positional", None, str), + out_loc=("Path to output folder for tensorboard session data", "positional", None, str), + name=("Human readable name for tsv file and vectors tensor", "positional", None, str), +) +def main(vectors_loc, out_loc, name="spaCy_vectors"): + meta_file = "{}.tsv".format(name) + out_meta_file = path.join(out_loc, meta_file) + + print('Loading spaCy vectors model: {}'.format(vectors_loc)) + model = spacy.load(vectors_loc) + print('Finding lexemes with vectors attached: {}'.format(vectors_loc)) + strings_stream = tqdm.tqdm(model.vocab.strings, total=len(model.vocab.strings), leave=False) + queries = [w for w in strings_stream if model.vocab.has_vector(w)] + vector_count = len(queries) + + print('Building Tensorboard Projector metadata for ({}) vectors: {}'.format(vector_count, out_meta_file)) + + # Store vector data in a tensorflow variable + tf_vectors_variable = numpy.zeros((vector_count, model.vocab.vectors.shape[1])) + + # Write a tab-separated file that 
contains information about the vectors for visualization + # + # Reference: https://www.tensorflow.org/programmers_guide/embedding#metadata + with open(out_meta_file, 'wb') as file_metadata: + # Define columns in the first row + file_metadata.write("Text\tFrequency\n".encode('utf-8')) + # Write out a row for each vector that we add to the tensorflow variable we created + vec_index = 0 + for text in tqdm.tqdm(queries, total=len(queries), leave=False): + # https://github.com/tensorflow/tensorflow/issues/9094 + text = '' if text.lstrip() == '' else text + lex = model.vocab[text] + + # Store vector data and metadata + tf_vectors_variable[vec_index] = model.vocab.get_vector(text) + file_metadata.write("{}\t{}\n".format(text, math.exp(lex.prob) * vector_count).encode('utf-8')) + vec_index += 1 + + print('Running Tensorflow Session...') + sess = tf.InteractiveSession() + tf.Variable(tf_vectors_variable, trainable=False, name=name) + tf.global_variables_initializer().run() + saver = tf.train.Saver() + writer = tf.summary.FileWriter(out_loc, sess.graph) + + # Link the embeddings into the config + config = ProjectorConfig() + embed = config.embeddings.add() + embed.tensor_name = name + embed.metadata_path = meta_file + + # Tell the projector about the configured embeddings and metadata file + visualize_embeddings(writer, config) + + # Save session and print run command to the output + print('Saving Tensorboard Session...') + saver.save(sess, path.join(out_loc, '{}.ckpt'.format(name))) + print('Done. Run `tensorboard --logdir={0}` to view in Tensorboard'.format(out_loc)) + + +if __name__ == '__main__': + plac.call(main) diff --git a/spacy/lang/tr/examples.py b/spacy/lang/tr/examples.py new file mode 100644 index 000000000..e17586a37 --- /dev/null +++ b/spacy/lang/tr/examples.py @@ -0,0 +1,22 @@ +# coding: utf8 +from __future__ import unicode_literals + + +""" +Example sentences to test spaCy and its language models. 
+>>> from spacy.lang.tr.examples import sentences +>>> docs = nlp.pipe(sentences) +""" + + +sentences = [ + "Neredesin?", + "Neredesiniz?", + "Bu bir cümledir.", + "Sürücüsüz araçlar sigorta yükümlülüğünü üreticilere kaydırıyor.", + "San Francisco kaldırımda kurye robotları yasaklayabilir.", + "Londra İngiltere'nin başkentidir.", + "Türkiye'nin başkenti neresi?", + "Bakanlar Kurulu 180 günlük eylem planını açıkladı.", + "Merkez Bankası, beklentiler doğrultusunda faizlerde değişikliğe gitmedi." +] diff --git a/spacy/lang/tr/lex_attrs.py b/spacy/lang/tr/lex_attrs.py new file mode 100644 index 000000000..862a64825 --- /dev/null +++ b/spacy/lang/tr/lex_attrs.py @@ -0,0 +1,31 @@ +# coding: utf8 +from __future__ import unicode_literals + +from ...attrs import LIKE_NUM + + +#Thirteen, fifteen etc. are written separate: on üç + +_num_words = ['bir', 'iki', 'üç', 'dört', 'beş', 'altı', 'yedi', 'sekiz', + 'dokuz', 'on', 'yirmi', 'otuz', 'kırk', 'elli', 'altmış', + 'yetmiş', 'seksen', 'doksan', 'yüz', 'bin', 'milyon', + 'milyar', 'katrilyon', 'kentilyon'] + + +def like_num(text): + text = text.replace(',', '').replace('.', '') + if text.isdigit(): + return True + if text.count('/') == 1: + num, denom = text.split('/') + if num.isdigit() and denom.isdigit(): + return True + if text.lower() in _num_words: + return True + return False + + +LEX_ATTRS = { + LIKE_NUM: like_num +} + diff --git a/spacy/lang/tr/stop_words.py b/spacy/lang/tr/stop_words.py index aaed02a3e..840fcc13e 100644 --- a/spacy/lang/tr/stop_words.py +++ b/spacy/lang/tr/stop_words.py @@ -10,16 +10,12 @@ acep adamakıllı adeta ait -altmýþ -altmış -altý -altı ama amma anca ancak arada -artýk +artık aslında aynen ayrıca @@ -29,46 +25,82 @@ açıkçası bana bari bazen -bazý bazı +bazısı +bazısına +bazısında +bazısından +bazısını +bazısının başkası -baţka +başkasına +başkasında +başkasından +başkasını +başkasının +başka belki ben +bende benden beni benim beri beriki -beþ -beş -beţ +berikinin +berikiyi +berisi bilcümle bile 
-bin binaen binaenaleyh -bir biraz birazdan birbiri +birbirine +birbirini +birbirinin +birbirinde +birbirinden birden birdenbire biri +birine +birini +birinin +birinde +birinden birice birileri +birilerinde +birilerinden +birilerine +birilerini +birilerinin birisi +birisine +birisini +birisinin +birisinde +birisinden birkaç birkaçı +birkaçına +birkaçını +birkaçının +birkaçında +birkaçından birkez birlikte birçok birçoğu -birþey -birþeyi +birçoğuna +birçoğunda +birçoğundan +birçoğunu +birçoğunun birşey birşeyi -birţey bitevi biteviye bittabi @@ -96,6 +128,11 @@ buracıkta burada buradan burası +burasına +burasını +burasının +burasında +burasından böyle böylece böylecene @@ -106,8 +143,34 @@ büsbütün bütün cuk cümlesi +cümlesine +cümlesini +cümlesinin +cümlesinden +cümlemize +cümlemizi +cümlemizden +çabuk +çabukça +çeşitli +çok +çokları +çoklarınca +çokluk +çoklukla +çokça +çoğu +çoğun +çoğunca +çoğunda +çoğundan +çoğunlukla +çoğunu +çoğunun +çünkü da daha +dahası dahi dahil dahilen @@ -124,19 +187,17 @@ denli derakap derhal derken -deđil değil değin diye -diđer diğer diğeri -doksan -dokuz +diğerine +diğerini +diğerinden dolayı dolayısıyla doğru -dört edecek eden ederek @@ -146,7 +207,6 @@ edilmesi ediyor elbet elbette -elli emme en enikonu @@ -168,10 +228,10 @@ evvelce evvelden evvelemirde evveli -eđer eğer fakat filanca +filancanın gah gayet gayetle @@ -197,6 +257,10 @@ haliyle handiyse hangi hangisi +hangisine +hangisine +hangisinde +hangisinden hani hariç hasebiyle @@ -207,17 +271,27 @@ hem henüz hep hepsi +hepsini +hepsinin +hepsinde +hepsinden her herhangi herkes +herkesi herkesin +herkesten hiç hiçbir hiçbiri +hiçbirine +hiçbirini +hiçbirinin +hiçbirinde +hiçbirinden hoş hulasaten iken -iki ila ile ilen @@ -240,43 +314,55 @@ iyicene için iş işte -iţte kadar kaffesi kah kala -kanýmca +kanımca karşın -katrilyon kaynak kaçı +kaçına +kaçında +kaçından +kaçını +kaçının kelli kendi +kendilerinde +kendilerinden kendilerine +kendilerini +kendilerinin kendini kendisi 
+kendisinde +kendisinden kendisine kendisini +kendisinin kere kez keza kezalik keşke -keţke ki kim kimden kime kimi +kiminin kimisi +kimisinde +kimisinden +kimisine +kimisinin kimse kimsecik kimsecikler külliyen -kýrk -kýsaca -kırk kısaca +kısacası lakin leh lütfen @@ -289,13 +375,10 @@ međer meğer meğerki meğerse -milyar -milyon mu mü -mý mı -nasýl +mi nasıl nasılsa nazaran @@ -304,6 +387,8 @@ ne neden nedeniyle nedenle +nedenler +nedenlerden nedense nerde nerden @@ -332,32 +417,27 @@ olduklarını oldukça olduğu olduğunu -olmadı -olmadığı olmak olması -olmayan -olmaz olsa olsun olup olur olursa oluyor -on ona onca onculayın onda ondan onlar +onlara onlardan -onlari -onlarýn onları onların onu onun +ora oracık oracıkta orada @@ -365,9 +445,26 @@ oradan oranca oranla oraya -otuz oysa oysaki +öbür +öbürkü +öbürü +öbüründe +öbüründen +öbürüne +öbürünü +önce +önceden +önceleri +öncelikle +öteki +ötekisi +öyle +öylece +öylelikle +öylemesine +öz pek pekala peki @@ -379,8 +476,6 @@ sahi sahiden sana sanki -sekiz -seksen sen senden seni @@ -393,6 +488,27 @@ sonra sonradan sonraları sonunda +şayet +şey +şeyden +şeyi +şeyler +şu +şuna +şuncacık +şunda +şundan +şunlar +şunları +şunların +şunu +şunun +şura +şuracık +şuracıkta +şurası +şöyle +şimdi tabii tam tamam @@ -400,8 +516,8 @@ tamamen tamamıyla tarafından tek -trilyon tüm +üzere var vardı vasıtasıyla @@ -429,84 +545,16 @@ yaptığını yapılan yapılması yapıyor -yedi yeniden yenilerde yerine -yetmiþ -yetmiş -yetmiţ yine -yirmi yok yoksa yoluyla -yüz yüzünden zarfında zaten zati zira -çabuk -çabukça -çeşitli -çok -çokları -çoklarınca -çokluk -çoklukla -çokça -çoğu -çoğun -çoğunca -çoğunlukla -çünkü -öbür -öbürkü -öbürü -önce -önceden -önceleri -öncelikle -öteki -ötekisi -öyle -öylece -öylelikle -öylemesine -öz -üzere -üç -þey -þeyden -þeyi -þeyler -þu -þuna -þunda -þundan -þunu -şayet -şey -şeyden -şeyi -şeyler -şu -şuna -şuncacık -şunda -şundan -şunlar -şunları -şunu -şunun -şura -şuracık -şuracıkta -şurası -şöyle -ţayet 
-ţimdi -ţu -ţöyle """.split()) diff --git a/spacy/lang/tr/tokenizer_exceptions.py b/spacy/lang/tr/tokenizer_exceptions.py index c945c0058..524873aa9 100644 --- a/spacy/lang/tr/tokenizer_exceptions.py +++ b/spacy/lang/tr/tokenizer_exceptions.py @@ -3,11 +3,6 @@ from __future__ import unicode_literals from ...symbols import ORTH, NORM - -# These exceptions are mostly for example purposes – hoping that Turkish -# speakers can contribute in the future! Source of copy-pasted examples: -# https://en.wiktionary.org/wiki/Category:Turkish_language - _exc = { "sağol": [ {ORTH: "sağ"}, @@ -16,11 +11,112 @@ _exc = { for exc_data in [ - {ORTH: "A.B.D.", NORM: "Amerika Birleşik Devletleri"}]: + {ORTH: "A.B.D.", NORM: "Amerika Birleşik Devletleri"}, + {ORTH: "Alb.", NORM: "Albay"}, + {ORTH: "Ar.Gör.", NORM: "Araştırma Görevlisi"}, + {ORTH: "Arş.Gör.", NORM: "Araştırma Görevlisi"}, + {ORTH: "Asb.", NORM: "Astsubay"}, + {ORTH: "Astsb.", NORM: "Astsubay"}, + {ORTH: "As.İz.", NORM: "Askeri İnzibat"}, + {ORTH: "Atğm", NORM: "Asteğmen"}, + {ORTH: "Av.", NORM: "Avukat"}, + {ORTH: "Apt.", NORM: "Apartmanı"}, + {ORTH: "Bçvş.", NORM: "Başçavuş"}, + {ORTH: "bk.", NORM: "bakınız"}, + {ORTH: "bknz.", NORM: "bakınız"}, + {ORTH: "Bnb.", NORM: "Binbaşı"}, + {ORTH: "bnb.", NORM: "binbaşı"}, + {ORTH: "Böl.", NORM: "Bölümü"}, + {ORTH: "Bşk.", NORM: "Başkanlığı"}, + {ORTH: "Bştbp.", NORM: "Baştabip"}, + {ORTH: "Bul.", NORM: "Bulvarı"}, + {ORTH: "Cad.", NORM: "Caddesi"}, + {ORTH: "çev.", NORM: "çeviren"}, + {ORTH: "Çvş.", NORM: "Çavuş"}, + {ORTH: "dak.", NORM: "dakika"}, + {ORTH: "dk.", NORM: "dakika"}, + {ORTH: "Doç.", NORM: "Doçent"}, + {ORTH: "doğ.", NORM: "doğum tarihi"}, + {ORTH: "drl.", NORM: "derleyen"}, + {ORTH: "Dz.", NORM: "Deniz"}, + {ORTH: "Dz.K.K.lığı", NORM: "Deniz Kuvvetleri Komutanlığı"}, + {ORTH: "Dz.Kuv.", NORM: "Deniz Kuvvetleri"}, + {ORTH: "Dz.Kuv.K.", NORM: "Deniz Kuvvetleri Komutanlığı"}, + {ORTH: "dzl.", NORM: "düzenleyen"}, + {ORTH: "Ecz.", NORM: "Eczanesi"}, + {ORTH: "ekon.", 
NORM: "ekonomi"}, + {ORTH: "Fak.", NORM: "Fakültesi"}, + {ORTH: "Gn.", NORM: "Genel"}, + {ORTH: "Gnkur.", NORM: "Genelkurmay"}, + {ORTH: "Gn.Kur.", NORM: "Genelkurmay"}, + {ORTH: "gr.", NORM: "gram"}, + {ORTH: "Hst.", NORM: "Hastanesi"}, + {ORTH: "Hs.Uzm.", NORM: "Hesap Uzmanı"}, + {ORTH: "huk.", NORM: "hukuk"}, + {ORTH: "Hv.", NORM: "Hava"}, + {ORTH: "Hv.K.K.lığı", NORM: "Hava Kuvvetleri Komutanlığı"}, + {ORTH: "Hv.Kuv.", NORM: "Hava Kuvvetleri"}, + {ORTH: "Hv.Kuv.K.", NORM: "Hava Kuvvetleri Komutanlığı"}, + {ORTH: "Hz.", NORM: "Hazreti"}, + {ORTH: "Hz.Öz.", NORM: "Hizmete Özel"}, + {ORTH: "İng.", NORM: "İngilizce"}, + {ORTH: "Jeol.", NORM: "Jeoloji"}, + {ORTH: "jeol.", NORM: "jeoloji"}, + {ORTH: "Korg.", NORM: "Korgeneral"}, + {ORTH: "Kur.", NORM: "Kurmay"}, + {ORTH: "Kur.Bşk.", NORM: "Kurmay Başkanı"}, + {ORTH: "Kuv.", NORM: "Kuvvetleri"}, + {ORTH: "Ltd.", NORM: "Limited"}, + {ORTH: "Mah.", NORM: "Mahallesi"}, + {ORTH: "mah.", NORM: "mahallesi"}, + {ORTH: "max.", NORM: "maksimum"}, + {ORTH: "min.", NORM: "minimum"}, + {ORTH: "Müh.", NORM: "Mühendisliği"}, + {ORTH: "müh.", NORM: "mühendisliği"}, + {ORTH: "MÖ.", NORM: "Milattan Önce"}, + {ORTH: "Onb.", NORM: "Onbaşı"}, + {ORTH: "Ord.", NORM: "Ordinaryüs"}, + {ORTH: "Org.", NORM: "Orgeneral"}, + {ORTH: "Ped.", NORM: "Pedagoji"}, + {ORTH: "Prof.", NORM: "Profesör"}, + {ORTH: "Sb.", NORM: "Subay"}, + {ORTH: "Sn.", NORM: "Sayın"}, + {ORTH: "sn.", NORM: "saniye"}, + {ORTH: "Sok.", NORM: "Sokak"}, + {ORTH: "Şb.", NORM: "Şube"}, + {ORTH: "Şti.", NORM: "Şirketi"}, + {ORTH: "Tbp.", NORM: "Tabip"}, + {ORTH: "T.C.", NORM: "Türkiye Cumhuriyeti"}, + {ORTH: "Tel.", NORM: "Telefon"}, + {ORTH: "tel.", NORM: "telefon"}, + {ORTH: "telg.", NORM: "telgraf"}, + {ORTH: "Tğm.", NORM: "Teğmen"}, + {ORTH: "tğm.", NORM: "teğmen"}, + {ORTH: "tic.", NORM: "ticaret"}, + {ORTH: "Tug.", NORM: "Tugay"}, + {ORTH: "Tuğg.", NORM: "Tuğgeneral"}, + {ORTH: "Tümg.", NORM: "Tümgeneral"}, + {ORTH: "Uzm.", NORM: "Uzman"}, + {ORTH: "Üçvş.", NORM: 
"Üstçavuş"}, + {ORTH: "Üni.", NORM: "Üniversitesi"}, + {ORTH: "Ütğm.", NORM: "Üsteğmen"}, + {ORTH: "vb.", NORM: "ve benzeri"}, + {ORTH: "vs.", NORM: "vesaire"}, + {ORTH: "Yard.", NORM: "Yardımcı"}, + {ORTH: "Yar.", NORM: "Yardımcı"}, + {ORTH: "Yd.Sb.", NORM: "Yedek Subay"}, + {ORTH: "Yard.Doç.", NORM: "Yardımcı Doçent"}, + {ORTH: "Yar.Doç.", NORM: "Yardımcı Doçent"}, + {ORTH: "Yb.", NORM: "Yarbay"}, + {ORTH: "Yrd.", NORM: "Yardımcı"}, + {ORTH: "Yrd.Doç.", NORM: "Yardımcı Doçent"}, + {ORTH: "Y.Müh.", NORM: "Yüksek mühendis"}, + {ORTH: "Y.Mim.", NORM: "Yüksek mimar"}]: _exc[exc_data[ORTH]] = [exc_data] -for orth in ["Dr."]: +for orth in [ + "Dr.", "yy."]: _exc[orth] = [{ORTH: orth}] diff --git a/website/api/_top-level/_displacy.jade b/website/api/_top-level/_displacy.jade index a3d7240d6..105bb0cc6 100644 --- a/website/api/_top-level/_displacy.jade +++ b/website/api/_top-level/_displacy.jade @@ -208,7 +208,7 @@ p +row +cell #[code word_spacing] +cell int - +cell Horizontal spacing between words and arcs in px. + +cell Vertical spacing between words and arcs in px. +cell #[code 45] +row diff --git a/website/api/doc.jade b/website/api/doc.jade index 7dc5e9842..9a0b3253d 100644 --- a/website/api/doc.jade +++ b/website/api/doc.jade @@ -674,7 +674,7 @@ p | token vectors. +aside-code("Example"). - apples = nlp(u'I like apples') + doc = nlp(u'I like apples') assert doc.vector.dtype == 'float32' assert doc.vector.shape == (300,) diff --git a/website/api/goldcorpus.jade b/website/api/goldcorpus.jade index 0f7105f65..5609c2530 100644 --- a/website/api/goldcorpus.jade +++ b/website/api/goldcorpus.jade @@ -12,11 +12,24 @@ p Create a #[code GoldCorpus]. +table(["Name", "Type", "Description"]) +row - +cell #[code train_path] - +cell unicode or #[code Path] - +cell File or directory of training data. + +cell #[code train] + +cell unicode or #[code Path] or iterable + +cell + | Training data, as a path (file or directory) or iterable. 
If an + | iterable, each item should be a #[code (text, paragraphs)] + | tuple, where each paragraph is a tuple + | #[code.u-break (sentences, brackets)], and each sentence is a + | tuple #[code.u-break (ids, words, tags, heads, ner)]. See the + | implementation of + | #[+src(gh("spacy", "spacy/gold.pyx")) #[code gold.read_json_file]] + | for further details. +row - +cell #[code dev_path] - +cell unicode or #[code Path] - +cell File or directory of development data. + +cell #[code dev] + +cell unicode or #[code Path] or iterable + +cell Development data, as a path (file or directory) or iterable. + + +row("foot") + +cell returns + +cell #[code GoldCorpus] + +cell The newly constructed object. diff --git a/website/api/lexeme.jade b/website/api/lexeme.jade index 86fa18730..b1e63d378 100644 --- a/website/api/lexeme.jade +++ b/website/api/lexeme.jade @@ -325,6 +325,12 @@ p The L2 norm of the lexeme's vector representation. +cell bool +cell Is the lexeme a quotation mark? + +row + +cell #[code is_currency] + +tag-new("2.0.8") + +cell bool + +cell Is the lexeme a currency symbol? + +row +cell #[code like_url] +cell bool diff --git a/website/api/matcher.jade b/website/api/matcher.jade index 2418dd2fa..00260a109 100644 --- a/website/api/matcher.jade +++ b/website/api/matcher.jade @@ -111,6 +111,25 @@ p Match a stream of documents, yielding them in turn. | parallel, if the #[code Matcher] implementation supports | multi-threading. + +row + +cell #[code return_matches] + +tag-new(2.1) + +cell bool + +cell + | Yield the match lists along with the docs, making results + | #[code (doc, matches)] tuples. + + +row + +cell #[code as_tuples] + +tag-new(2.1) + +cell bool + +cell + | Interpret the input stream as #[code (doc, context)] tuples, and + | yield #[code (result, context)] tuples out. If both + | #[code return_matches] and #[code as_tuples] are #[code True], + | the output will be a sequence of + | #[code ((doc, matches), context)] tuples. 
+ +row("foot") +cell yields +cell #[code Doc] diff --git a/website/api/pipe.jade b/website/api/pipe.jade index 3d4dc5563..c0ec86972 100644 --- a/website/api/pipe.jade +++ b/website/api/pipe.jade @@ -209,7 +209,7 @@ p +row +cell #[code drop] - +cell int + +cell float +cell The dropout rate. +row diff --git a/website/api/token.jade b/website/api/token.jade index 65e687cde..ca237acc6 100644 --- a/website/api/token.jade +++ b/website/api/token.jade @@ -740,6 +740,12 @@ p The L2 norm of the token's vector representation. +cell bool +cell Is the token a quotation mark? + +row + +cell #[code is_currency] + +tag-new("2.0.8") + +cell bool + +cell Is the token a currency symbol? + +row +cell #[code like_url] +cell bool diff --git a/website/assets/img/social/preview_alpha.jpg b/website/assets/img/social/preview_alpha.jpg deleted file mode 100644 index 821db408a..000000000 Binary files a/website/assets/img/social/preview_alpha.jpg and /dev/null differ diff --git a/website/models/_data.json b/website/models/_data.json index 5a494729c..5f59d5b78 100644 --- a/website/models/_data.json +++ b/website/models/_data.json @@ -76,13 +76,15 @@ }, "MODEL_LICENSES": { - "CC BY-SA": "https://creativecommons.org/licenses/by-sa/3.0/", - "CC BY-SA 3.0": "https://creativecommons.org/licenses/by-sa/3.0/", - "CC BY-SA 4.0": "https://creativecommons.org/licenses/by-sa/4.0/", - "CC BY-NC": "https://creativecommons.org/licenses/by-nc/3.0/", - "CC BY-NC 3.0": "https://creativecommons.org/licenses/by-nc/3.0/", - "GPL": "https://www.gnu.org/licenses/gpl.html", - "LGPL": "https://www.gnu.org/licenses/lgpl.html" + "CC BY 4.0": "https://creativecommons.org/licenses/by/4.0/", + "CC BY-SA": "https://creativecommons.org/licenses/by-sa/3.0/", + "CC BY-SA 3.0": "https://creativecommons.org/licenses/by-sa/3.0/", + "CC BY-SA 4.0": "https://creativecommons.org/licenses/by-sa/4.0/", + "CC BY-NC": "https://creativecommons.org/licenses/by-nc/3.0/", + "CC BY-NC 3.0": "https://creativecommons.org/licenses/by-nc/3.0/", 
+ "CC-BY-NC-SA 3.0": "https://creativecommons.org/licenses/by-nc-sa/3.0/", + "GPL": "https://www.gnu.org/licenses/gpl.html", + "LGPL": "https://www.gnu.org/licenses/lgpl.html" }, "MODEL_BENCHMARKS": { diff --git a/website/usage/_processing-pipelines/_pipelines.jade b/website/usage/_processing-pipelines/_pipelines.jade index e0df8babe..06f420fe8 100644 --- a/website/usage/_processing-pipelines/_pipelines.jade +++ b/website/usage/_processing-pipelines/_pipelines.jade @@ -40,7 +40,7 @@ p +item | Make the #[strong model data] available to the #[code Language] class | by calling #[+api("language#from_disk") #[code from_disk]] with the - | path to the model data ditectory. + | path to the model data directory. p | So when you call this... @@ -53,7 +53,7 @@ p | pipeline #[code.u-break ["tagger", "parser", "ner"]]. spaCy will then | initialise #[code spacy.lang.en.English], and create each pipeline | component and add it to the processing pipeline. It'll then load in the - | model's data from its data ditectory and return the modified + | model's data from its data directory and return the modified | #[code Language] class for you to use as the #[code nlp] object. p diff --git a/website/usage/_spacy-101/_similarity.jade b/website/usage/_spacy-101/_similarity.jade index 1cf761179..ad9ac21bd 100644 --- a/website/usage/_spacy-101/_similarity.jade +++ b/website/usage/_spacy-101/_similarity.jade @@ -37,7 +37,7 @@ p +cell.u-text-label.u-color-theme=label for cell in cells +cell.u-text-center - - var result = cell > 0.5 ? ["yes", "similar"] : cell != 1 ? ["no", "dissimilar"] : ["neutral", "identical"] + - var result = cell < 0.5 ? ["no", "dissimilar"] : cell != 1 ? 
["yes", "similar"] : ["neutral", "identical"] | #[code=cell.toFixed(2)] #[+procon(...result)] p diff --git a/website/usage/_v2/_features.jade b/website/usage/_v2/_features.jade index 2c172e437..fc10d7953 100644 --- a/website/usage/_v2/_features.jade +++ b/website/usage/_v2/_features.jade @@ -163,7 +163,7 @@ p nlp = English().from_disk('/path/to/nlp') p - | spay's serialization API has been made consistent across classes and + | spaCy's serialization API has been made consistent across classes and | objects. All container classes, i.e. #[code Language], #[code Doc], | #[code Vocab] and #[code StringStore] now have a #[code to_bytes()], | #[code from_bytes()], #[code to_disk()] and #[code from_disk()] method diff --git a/website/usage/resources.jade b/website/usage/resources.jade index 8766d3864..4b29a7831 100644 --- a/website/usage/resources.jade +++ b/website/usage/resources.jade @@ -120,6 +120,9 @@ include ../_includes/_mixins | A Practical Real-World Approach to Gaining Actionable Insights | from your Data + +card("Practical Machine Learning with Python", "", "Dipanjan Sarkar et al. (Apress, 2017)", "book") + | A Problem-Solver's Guide to Building Real-World Intelligent Systems + +section("notebooks") +h(2, "notebooks") Jupyter notebooks diff --git a/website/usage/spacy-101.jade b/website/usage/spacy-101.jade index 81ed7a133..d5f4881e8 100644 --- a/website/usage/spacy-101.jade +++ b/website/usage/spacy-101.jade @@ -68,7 +68,7 @@ p +item #[strong spaCy is not research software]. | It's built on the latest research, but it's designed to get | things done. This leads to fairly different design decisions than - | #[+a("https://github./nltk/nltk") NLTK] + | #[+a("https://github.com/nltk/nltk") NLTK] | or #[+a("https://stanfordnlp.github.io/CoreNLP/") CoreNLP], which were | created as platforms for teaching and research. The main difference | is that spaCy is integrated and opinionated. spaCy tries to avoid asking