2819 Commits

Author SHA1 Message Date
Bram Vanroy
1e217d71b8 Update to spacy_conll in universe (#10617)
* update to spacy_conll

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-04-04 18:02:07 +02:00
Adriane Boyd
1853606cc8 Add spancat, trainable_lemmatizer to quickstart (#10524)
* Add `SPACY` and `IS_SPACE` as default `tok2vec` features
2022-04-04 18:00:28 +02:00
Adriane Boyd
6923c5eff4 Add NORM to Matcher feature in docs (#10560) 2022-03-28 10:41:05 +02:00
Adriane Boyd
ccaba3c459 Remove now-built-in jinja2>=3.1.0 extensions 2022-03-25 14:26:08 +01:00
David Berenstein
b982aeb932 added Concise Concepts to spaCy universe (#10499)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

* updated information for Classy Classification

Made a more comprehensible and easy description for Classy Classification based on feedback of Philip Vollet to prepare for sharing.

* added note about examples

* corrected for wrong formatting changes

* Update website/meta/universe.json with small typo correction

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* resolved another typo

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* added Concise Concepts package to spaCy universe.

* updated example code Concise Concepts

* updated description for Concise Concepts

* updated PR with more visually appealing examples

SO to koaning for the suggestions.

* corrected small JSON typos in concise concepts

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-03-24 18:01:32 +01:00
Basile Dura
73c31b30b4 docs: add EDS-NLP to spaCy universe (#10489)
* docs: add EDS-NLP to spaCy universe

* fix: remove "standalone" tag for EDS-NLP

Co-authored-by: Basile Dura <basile.dura-ext@aphp.fr>
2022-03-21 11:04:59 +01:00
Lj Miranda
4996fc5c41 Fix mixed-up parameters for spacy-conll (#10516) 2022-03-18 08:57:01 +01:00
David Berenstein
8e5773b752 Updated explanation for classy classification (#10484)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

* updated information for Classy Classification

Made a more comprehensible and easy description for Classy Classification based on feedback of Philip Vollet to prepare for sharing.

* added note about examples

* corrected for wrong formatting changes

* Update website/meta/universe.json with small typo correction

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* resolved another typo

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-03-15 16:43:13 +01:00
Adriane Boyd
142f9ec89b Various install docs updates (#10487)
* Simplify quickstart source install to use only editable pip install

* Update pytorch install instructions to more recent versions
2022-03-15 11:14:58 +01:00
vincent d warmerdam
843a501312 Update universe.json (#10490)
The project moved away from Rasa and into my personal GitHub account.
2022-03-15 11:14:40 +01:00
Adriane Boyd
33e43d5b96 Update docs for Vocab.get_vector (#10486)
* Update docs for Vocab.get_vector

* Clarify description of 0-vector dimensions
2022-03-15 09:11:14 +01:00
Peter Baumgartner
489336171a Add path.mkdir to custom component examples of to_disk (#10348)
* add `path.mkdir` to examples

* add ensure_path + mkdir

* update highlights
2022-03-08 16:05:22 +01:00
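The change above concerns custom components whose `to_disk` method writes into a directory that may not exist yet. A minimal sketch of the pattern, using only the standard library: the component class, file name, and data are hypothetical, and `Path(path)` stands in for the `ensure_path` helper the commit mentions.

```python
import json
import tempfile
from pathlib import Path


class AcronymComponent:
    """Hypothetical custom component that stores a small dictionary."""

    def __init__(self, data=None):
        self.data = data or {}

    def to_disk(self, path, exclude=tuple()):
        # Accept str or Path (assumption: this mirrors what ensure_path does).
        path = Path(path)
        # Create the target directory before writing into it.
        if not path.exists():
            path.mkdir(parents=True)
        with (path / "data.json").open("w", encoding="utf8") as f:
            json.dump(self.data, f)

    def from_disk(self, path, exclude=tuple()):
        path = Path(path)
        with (path / "data.json").open("r", encoding="utf8") as f:
            self.data = json.load(f)
        return self


# Usage: the nested target directory does not exist before to_disk is called.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pipeline" / "acronyms"
    AcronymComponent({"lol": "laugh out loud"}).to_disk(target)
    restored = AcronymComponent().from_disk(target)
```

Without the `mkdir` call, opening `data.json` for writing would fail with `FileNotFoundError` whenever the component directory has not been created beforehand.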
Adriane Boyd
7cd2228703 Fix types in API docs for moves in parser and ner (#10464) 2022-03-08 13:52:10 +01:00
David Berenstein
98ebf50e84 added classy-classification package to spacy universe (#10393)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-03-07 12:48:27 +01:00
Sofie Van Landeghem
56cb167e84 Clean up loggers docs (#10351)
* update docs to point to spacy-loggers docs

* remove unused error code
2022-02-25 16:30:24 +01:00
Sam Edwardes
30cb5ab5ad Updated spaCy universe for spacytextblob (#10335)
* Updated spacytextblob in universe.json

* Fixed json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Added spacy_version tag to spacytextblob

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-02-24 14:18:58 +09:00
Paul O'Leary McCann
14fc2cc9e3 Add tmtoolkit setup steps 2022-02-14 15:18:31 +09:00
Markus Konrad
dd6a9e5141 add tmtoolkit package to spaCy universe (#10245) 2022-02-14 15:18:13 +09:00
John Boy
10c77af83d
add textnets to spaCy universe (#10216)
https://github.com/jboynyc/textnets/issues/38
2022-02-09 15:04:26 +09:00
Ines Montani
7b883da9fd
Merge pull request #10239 from explosion/docs/spacy-tailored-pipelines [ci skip] 2022-02-08 18:04:01 +01:00
Ines Montani
f2c2b97e56 Add spaCy Tailored Pipelines 2022-02-08 11:46:42 +01:00
Sofie Van Landeghem
deb143fa70
Token sent attributes more consistent (#10164)
* remove duplicate line

* add sent start/end token attributes to the docs

* let has_annotation work with IS_SENT_END

* elif instead of if

* add has_annotation test for sent attributes

* fix typo

* remove duplicate is_sent_start entry in docs
2022-02-08 08:35:37 +01:00
Peter Baumgartner
836f689cc7
YAML multiline tip for project.yml files (#10187)
* MultiHashEmbed vector docs correction

* add in multi-line tip

* convert to sidebar tip
2022-02-08 08:35:09 +01:00
Kenneth Enevoldsen
e4625d2fc3
Added Augmenty to universe (#10229)
* Added Augmenty to universe

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-02-08 08:32:11 +01:00
Lj Miranda
72fece712f
Add shuffle parameter to Corpus API docs (#10220)
* Add shuffle parameter to Corpus API docs

* Update website/docs/api/corpus.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-02-07 14:55:53 +01:00
Kenneth Enevoldsen
a2f27ff83a
Added spacy-wrap to universe (#10168)
* Added spacy-wrap to universe 

Added spacy-wrap to universe, a small package for wrapping fine-tuned Hugging Face transformers in a spaCy pipeline, following the same API as spacy-transformers. (Currently limited to classification models.)

* Update website/meta/universe.json

* Update website/meta/universe.json

* Update website/meta/universe.json

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-02-03 12:30:09 +01:00
Lj Miranda
345e7f6bc4
Clarify Span.ents documentation (#10154)
* Clarify Span.ents documentation

Ref: #10135

Retain current behaviour. Span.ents will only include entities within
said span. You can't get tokens outside of the original span.

* Reword docstrings

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update API docs in the website

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-01-31 08:41:42 +01:00
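The behaviour described in the commit above (only entities that lie within the span are returned) can be illustrated with a toy model. This is a sketch under stated assumptions, not spaCy's implementation: `ToySpan` and `ents_within` are hypothetical names, and offsets are token indices with an exclusive end.

```python
from typing import List, NamedTuple


class ToySpan(NamedTuple):
    """Hypothetical stand-in for a span: token offsets, end exclusive."""
    start: int
    end: int
    label: str = ""


def ents_within(span: ToySpan, doc_ents: List[ToySpan]) -> List[ToySpan]:
    # Keep only entities that fall entirely inside the span's boundaries;
    # entities that merely overlap the span's edge are dropped.
    return [
        ent for ent in doc_ents
        if ent.start >= span.start and ent.end <= span.end
    ]


doc_ents = [ToySpan(0, 2, "PERSON"), ToySpan(4, 6, "ORG"), ToySpan(7, 8, "GPE")]
inside = ents_within(ToySpan(3, 7), doc_ents)
partial = ents_within(ToySpan(5, 8), doc_ents)
```

Here `inside` contains only the `ORG` entity, while `partial` contains only the `GPE` entity because the `ORG` entity starts before the span does.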
Adriane Boyd
4f441dfa24
Fix infix as prefix in Tokenizer.explain (#10140)
* Fix infix as prefix in Tokenizer.explain

Update `Tokenizer.explain` to align with the `Tokenizer` algorithm:

* skip infix matches that are prefixes in the current substring

* Update tokenizer pseudocode in docs
2022-01-28 17:00:54 +01:00
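The rule named in the commit above, skipping infix matches that sit at the start of the current substring, can be sketched in isolation. This is a simplified illustration, not the actual tokenizer pseudocode from the docs: `split_on_infixes` is a hypothetical helper, and a single regular expression stands in for the tokenizer's infix patterns.

```python
import re
from typing import List


def split_on_infixes(substring: str, infix_pattern: str) -> List[str]:
    """Split a substring on infix matches, ignoring a match at offset 0."""
    tokens = []
    offset = 0
    for match in re.finditer(infix_pattern, substring):
        # A match at the very start of the substring behaves like a prefix,
        # not an infix, so it is skipped here.
        if offset == 0 and match.start() == 0:
            continue
        # Text between the previous split point and this infix.
        if substring[offset:match.start()]:
            tokens.append(substring[offset:match.start()])
        # The infix itself becomes its own token.
        if match.group():
            tokens.append(match.group())
        offset = match.end()
    # Whatever remains after the last infix.
    if substring[offset:]:
        tokens.append(substring[offset:])
    return tokens


plain = split_on_infixes("a-b", r"-")
leading = split_on_infixes("-a-b", r"-")
```

In this sketch the leading hyphen in `"-a-b"` is not split off as an infix; it stays attached to the first piece, leaving it to prefix handling.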
Ines Montani
34ed93ef68
Support version tags in universe and add note about reporting (#10093)
* Support version tags in universe and add note about reporting

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-01-20 23:21:26 +01:00
Peter Baumgartner
a69005037a
Docker Image for Website Dev (#10098)
* add docker instructions

* Update website/README.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/README.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* clarifying language on docker image

* fix markdown formatting

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-01-20 23:02:13 +01:00
Duygu Altinok
268ddf8a06
Add ENT_IOB key to Matcher (#9649)
* added new field

* added exception for IOB strings

* minor refinement to schema

* removed field

* fixed typo

* imported numerical val

* changed the code bit

* cosmetics

* added test for matcher

* set ents of mock docs

* added invalid pattern

* minor update to documentation

* blacked matcher

* added pattern validation

* add IOB vals to schema

* changed into test

* mypy compat

* cleaned left over

* added compat import

* changed type

* added compat import

* changed literal a bit

* went back to old

* made explicit type

* Update spacy/schemas.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update spacy/schemas.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update spacy/schemas.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-01-20 13:18:39 +01:00
Adriane Boyd
7d528e607c
Update quickstart install steps (#10092)
* For conda:
  * Use conda environment rather than venv
  * Install `spacy-transformers` as a conda package
* For pip:
  * Add quotes if extras are included
2022-01-20 10:53:40 +01:00
Paul O'Leary McCann
2ff53834bb
Add link to pattern file info in EntityRuler.initialize docs (#10091)
* Add link to pattern file info in EntityRuler.initialize docs

* Update website/docs/api/entityruler.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-01-19 10:45:11 +01:00
Daniël de Kok
50d2a2c930
Use fewer Vector internals (#9879)
* Use Vectors.shape rather than Vectors.data.shape

* Use Vectors.size rather than Vectors.data.size

* Add Vectors.to_ops to move data between different ops

* Add documentation for Vector.to_ops
2022-01-18 17:14:35 +01:00
Tuomo Hiippala
6a8619dd73
Update the entry for Applied Language Technology in spaCy Universe (#10068)
* add entry for Applied Language Technology under "Courses"

Added the following entry into `universe.json`:

```
        {
            "type": "education",
            "id": "applt-course",
            "title": "Applied Language Technology",
            "slogan": "NLP for newcomers using spaCy and Stanza",
            "description": "These learning materials provide an introduction to applied language technology for audiences who are unfamiliar with language technology and programming. The learning materials assume no previous knowledge of the Python programming language.",
            "url": "https://applied-language-technology.readthedocs.io/",
            "image": "https://www.mv.helsinki.fi/home/thiippal/images/applt-preview.jpg",
            "thumb": "https://applied-language-technology.readthedocs.io/en/latest/_static/logo.png",
            "author": "Tuomo Hiippala",
            "author_links": {
                "twitter": "tuomo_h",
                "github": "thiippal",
                "website": "https://www.mv.helsinki.fi/home/thiippal/"
            },
            "category": ["courses"]
        },
```

* Update the entry for "Applied Language Technology"
2022-01-17 08:28:51 +01:00
ColleterVi
a784b12eff
fix: new restcountries url (#10043)
The URL extension "eu" and path "rest" are no longer available. Replaced them with a working URL.
2022-01-13 20:25:06 +09:00
Ines Montani
a437ca6737 Update website to use new Algolia search API 2022-01-05 13:21:06 +01:00
Sofie Van Landeghem
56dcb39fb7
Fix references to config file in the docs & UX (#9961)
* doc fixes around config file

* fix typo

* clarify default
2022-01-04 14:31:26 +01:00
Sam Edwardes
6f65e2b544
Added spacypdfreader to universe.json (#9963) 2022-01-03 16:34:36 +09:00
Paul O'Leary McCann
f40e237c5a
Remove denomme from universe (#9952)
Package seems to have been deleted.
2021-12-29 11:41:29 +01:00
Yoav Vollansky
9d63dfacfc
Update UNIVERSE.md (#9941)
typo
2021-12-27 13:46:04 +01:00
Peter Baumgartner
72abf9e102
MultiHashEmbed vector docs correction (#9918) 2021-12-27 11:18:08 +01:00
Edward
018827e9fd Add healthsea to universe (#9838)
* Add healthsea to universe

* Update website/meta/universe.json

* Add thumbnail

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-15 17:57:19 +01:00
Ines Montani
ba0fa7a64e
Support Google Sheets embeds in docs (#9861) 2021-12-15 09:27:08 +01:00
Adriane Boyd
51a3b60027
Document Tagger neg_prefix, fix typo (#9821) 2021-12-07 09:42:40 +01:00
Duygu Altinok
b56b9e7f31
Entity ruler remove pattern (#9685)
* added ruler code

* added error for non-existing pattern

* changed error to warning

* changed error to warning

* added basic tests

* fixed place

* added test files

* went back to error

* went back to pattern error

* minor change to docs

* changed style

* changed doc

* changed error slightly

* added remove to phrase matcher api

* error key already existed

* phrase matcher match code to api

* blacked tests

* moved comments before expr

* corrected error no

* Update website/docs/api/entityruler.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-06 15:32:49 +01:00
Natalia Rodnova
472740d613
Added sents property to Span for Spans spanning over several sentences (#9699)
* Added sents property to Span class that returns a generator of sentences the Span belongs to

* Added description to Span.sents property

* Update test_span to clarify the difference between span.sent and span.sents

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/tests/doc/test_span.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix documentation typos in spacy/tokens/span.pyx

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update Span.sents doc string in spacy/tokens/span.pyx

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Parametrized test_span_spans

* Corrected Span.sents to check for span-level hook first. Also, made Span.sent respect doc-level sents hook if no span-level hook is provided

* Corrected Span documentation copy/paste issue

* Put back accidentally deleted lines

* Fixed formatting in span.pyx

* Moved check for SENT_START annotation after user hooks in Span.sents

* add version where the property was introduced

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-06 09:58:01 +01:00
Narayan Acharya
1be8a4dab3
Displacy serve entity linking support without manual=True support. (#9748)
* Add support for kb_id to be displayed via displacy.serve. Support is currently limited to the manual option in displacy.render

* Commit to check pre-commit hooks are run.

* Update spacy/displacy/__init__.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Changes as per suggestions on the PR.

* Update website/docs/api/top-level.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/api/top-level.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* tag option as new from 3.2.1 onwards

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2021-11-29 17:13:26 +01:00
Adriane Boyd
6763cbfdc0
Update Catalan acknowledgements for v3.2 (#9763) 2021-11-29 14:14:21 +01:00
Tuomo Hiippala
5c44533263
add entry for Applied Language Technology under "Courses" (#9755)
Added the following entry into `universe.json`:

```
        {
            "type": "education",
            "id": "applt-course",
            "title": "Applied Language Technology",
            "slogan": "NLP for newcomers using spaCy and Stanza",
            "description": "These learning materials provide an introduction to applied language technology for audiences who are unfamiliar with language technology and programming. The learning materials assume no previous knowledge of the Python programming language.",
            "url": "https://applied-language-technology.readthedocs.io/",
            "image": "https://www.mv.helsinki.fi/home/thiippal/images/applt-preview.jpg",
            "thumb": "https://applied-language-technology.readthedocs.io/en/latest/_static/logo.png",
            "author": "Tuomo Hiippala",
            "author_links": {
                "twitter": "tuomo_h",
                "github": "thiippal",
                "website": "https://www.mv.helsinki.fi/home/thiippal/"
            },
            "category": ["courses"]
        },
```
2021-11-28 19:33:16 +09:00