Commit Graph

441 Commits

Author SHA1 Message Date
schaeran
b94bcaa62f remove universe object: spacy-vis 2022-07-04 16:04:29 +02:00
schaeran
880e7db44e remove universe object: spacy_grammar 2022-07-04 16:04:06 +02:00
schaeran
6c036d1e25 remove universe object: spacy_hunspell 2022-07-04 16:03:30 +02:00
Dmytro Sadovnychyi
4cd8b4cc22
Fix some of the broken links on universe pages (#11011)
Currently some of the "AUTHOR INFO" links (e.g. here[0]) are broken:

```
https://github.com/https://github.com/explosion
```

[0] https://spacy.io/universe/project/spacy-experimental


Also one remains broken with `https://szegedai.github.io/`.
2022-06-23 17:53:00 +02:00
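The doubled prefix in the commit above suggests that a full URL was stored in a field whose value is later appended to `https://github.com/`. A minimal sketch of a link builder that tolerates both forms is shown below. The helper name `github_link` is hypothetical and is not taken from the spaCy website code.

```python
GITHUB_PREFIX = "https://github.com/"


def github_link(value: str) -> str:
    """Build a GitHub link from either a bare handle or a full URL.

    Hypothetical helper for illustration only.
    """
    value = value.strip()
    # A value that is already a full URL is returned unchanged,
    # so the prefix is never applied twice.
    if value.startswith(("http://", "https://")):
        return value
    # Otherwise treat the value as "user" or "user/repo".
    return GITHUB_PREFIX + value.lstrip("/")
```

With this shape, both `explosion` and `https://github.com/explosion` resolve to the same link.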
Adriane Boyd
f1197d9175
Add API docs for token attribute symbols (#10836)
* Add API docs for token attribute symbols

* Remove NBSP's

* Fix typo

* Rephrase

Co-authored-by: svlandeg <svlandeg@github.com>
2022-06-23 08:16:38 +02:00
Lucaterre
2820d7dd8d correct typo in universe.json for 'code_example' key : pipe name 'entityfishing' 2022-06-20 15:26:23 +02:00
Lucaterre
cdad815c68 updated spacy universe for spacyfishing 2022-06-20 14:28:49 +02:00
Gor Arakelyan
605f84938b
Add "Aim-spaCy" to spaCy Universe (#10943)
* Add Aim-spaCy to spaCy universe

* Update Aim thumbnail

* Fix author links

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-06-10 18:33:17 +09:00
vincent d warmerdam
e7d2b26966
Add spacy-report to universe (#10910)
* Add spacy-report to universe

* Remove extra comma

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-06-05 18:57:58 +09:00
Adriane Boyd
a322d6d5f2
Add SpanRuler component (#9880)
* Add SpanRuler component

Add a `SpanRuler` component similar to `EntityRuler` that saves a list
of matched spans to `Doc.spans[spans_key]`. The matches from the token
and phrase matchers are deduplicated and sorted before assignment but
are not otherwise filtered.

* Update spacy/pipeline/span_ruler.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix cast

* Add self.key property

* Use number of patterns as length

* Remove patterns kwarg from init

* Update spacy/tests/pipeline/test_span_ruler.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Add options for spans filter and setting to ents

* Add `spans_filter` option as a registered function
* Make `spans_key` optional and if `None`, set to `doc.ents` instead of
`doc.spans[spans_key]`.

* Update and generalize tests

* Add test for setting doc.ents, fix key property type

* Fix typing

* Allow independent doc.spans and doc.ents

* If `spans_key` is set, set `doc.spans` with `spans_filter`.
* If `annotate_ents` is set, set `doc.ents` with `ents_filter`.
  * Use `util.filter_spans` by default as `ents_filter`.
  * Use a custom warning if the filter does not work for `doc.ents`.

* Enable use of SpanC.id in Span

* Support id in SpanRuler as Span.id

* Update types

* `id` can only be provided as string (already by `PatternType`
definition)

* Update all uses of Span.id/ent_id in Doc

* Rename Span id kwarg to span_id

* Update types and docs

* Add ents filter to mimic EntityRuler overwrite_ents

* Refactor `ents_filter` to take `entities, spans` args for more
  filtering options
* Give registered filters more descriptive names
* Allow registered `filter_spans` filter
  (`spacy.first_longest_spans_filter.v1`) to take any number of
  `Iterable[Span]` objects as args so it can be used for spans filter
  or ents filter

* Implement future entity ruler as span ruler

Implement a compatible `entity_ruler` as `future_entity_ruler` using
`SpanRuler` as the underlying component:
* Add `sort_key` and `sort_reverse` to allow the sorting behavior to be
  customized. (Necessary for the same sorting/filtering as in
  `EntityRuler`.)
* Implement `overwrite_overlapping_ents_filter` and
  `preserve_existing_ents_filter` to support
  `EntityRuler.overwrite_ents` settings.
* Add `remove_by_id` to support `EntityRuler.remove` functionality.
* Refactor `entity_ruler` tests to parametrize all tests to test both
  `entity_ruler` and `future_entity_ruler`
* Implement `SpanRuler.token_patterns` and `SpanRuler.phrase_patterns`
  properties.

Additional changes:

* Move all config settings to top-level attributes to avoid duplicating
  settings in the config vs. `span_ruler/cfg`. (Also avoids a lot of
  casting.)

* Format

* Fix filter make method name

* Refactor to use same error for removing by label or ID

* Also provide existing spans to spans filter

* Support ids property

* Remove token_patterns and phrase_patterns

* Update docstrings

* Add span ruler docs

* Fix types

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Move sorting into filters

* Check for all tokens in seen tokens in entity ruler filters

* Remove registered sort key

* Set Token.ent_id in a backwards-compatible way in Doc.set_ents

* Remove sort options from API docs

* Update docstrings

* Rename entity ruler filters

* Fix and parameterize scoring

* Add id to Span API docs

* Fix typo in API docs

* Include explicit labeled=True for scorer

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-06-02 13:12:53 +02:00
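The "first longest" filtering that the commit above mentions (deduplicate, prefer longer spans, drop anything that overlaps an accepted span) can be sketched without spaCy. This is an illustration under stated assumptions, not the library's implementation: spans are modelled as plain `(start, end, label)` tuples with an exclusive end, and the function name `first_longest_filter` is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# A span is modelled as (start, end, label) with an exclusive end index.
SpanTuple = Tuple[int, int, str]


def first_longest_filter(*span_groups: Iterable[SpanTuple]) -> List[SpanTuple]:
    """Keep non-overlapping spans, preferring longer spans, then earlier ones."""
    # Deduplicate across all groups before filtering.
    candidates = {span for group in span_groups for span in group}
    # Longest first; ties go to the earlier start, then the label for determinism.
    ordered = sorted(candidates, key=lambda s: (-(s[1] - s[0]), s[0], s[2]))
    kept: List[SpanTuple] = []
    seen_tokens: Set[int] = set()
    for start, end, label in ordered:
        # Skip a span if any of its tokens is already covered.
        if any(i in seen_tokens for i in range(start, end)):
            continue
        kept.append((start, end, label))
        seen_tokens.update(range(start, end))
    # Return the survivors in document order.
    return sorted(kept)
```

Accepting any number of span groups mirrors the idea that one filter can serve both as a spans filter and as an ents filter, where existing entities and new matches arrive as separate collections.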
richardpaulhudson
d4218366c5 Update Holmes entry in universe.json 2022-05-30 18:05:26 +02:00
schaeran
f5952c0851 update spaCy Universe: spacytextblob (code example) 2022-05-12 18:23:00 +02:00
Richard Hudson
c32e1a0079
Updated Coreferee Universe entry (#10763) 2022-05-06 13:21:39 +02:00
vincent d warmerdam
f3de976513
Update universe.json to Include spaCy video #6 (#10723)
* Update universe.json

I noticed that episode 6 was missing, so I added it.

* Update universe.json

* Update universe.json
2022-05-02 13:35:14 +02:00
Adriane Boyd
497a708c71
Docs for v3.3 (#10628)
* Temporarily disable CI tests

* Start v3.3 website updates

* Add trainable lemmatizer to pipeline design

* Fix Vectors.most_similar

* Add floret vector info to pipeline design

* Add Lower and Upper Sorbian

* Add span to sidebar

* Work on release notes

* Copy from release notes

* Update pipeline design graphic

* Upgrading note about Doc.from_docs

* Add tables and details

* Update website/docs/models/index.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix da lemma acc

* Add minimal intro, various updates

* Round lemma acc

* Add section on floret / word lists

* Add new pipelines table, minor edits

* Fix displacy spans example title

* Clarify adding non-trainable lemmatizer

* Update adding-languages URLs

* Revert "Temporarily disable CI tests"

This reverts commit 1dee505920.

* Spell out words/sec

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-04-28 14:09:35 +02:00
Mike
3b208197c3
Fixed example for spacy_syllables (#10705)
There was a typo in the example for the spacy_syllables project.
2022-04-25 16:40:54 +02:00
Schero1994
d622883a42
Adding and updating content in the spacy universe (#10493)
* signing contributor agreement

* adding new content to the spaCy universe

* updating outdated example codes

* resolving issues for the PR

* resolve review for klayers

* remove contributor-agreement file from the PR

* Update code example of spaCySentiWS

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update spacy-sentiws code example

Co-authored-by: schaeran <schaeran1994@gmail.com>
Co-authored-by: schaeran <schaeran@explosion.ai>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-04-15 15:36:54 +02:00
Philip Vollet
e63a5d4888
Update newsletter id (#10655) 2022-04-14 13:34:01 +02:00
Schero1994
caf8528af7
Batch #1 | spaCy universe cleanup (#10642)
* delete universe object: wmd-relax

* delete universe object: spaCy.jl

* delete universe object: saber

* delete universe object: languagecrunch

* delete universe object: gracyql

* delete universe object: ExcelCy

* delete universe object: EpiTator

Co-authored-by: schaeran <schaeran1994@gmail.com>
2022-04-14 10:08:19 +02:00
David Berenstein
d4196a62f1
added crosslingual coreference to spacy universe without additional commits (#10580)
* added crosslingual coreference to spacy universe

* Updated example to introduce batching example.

Co-authored-by: David Berenstein <david.berenstein@pandoraintelligence.com>
2022-04-08 08:23:58 +02:00
Bram Vanroy
f966bf6a15
Update to spacy_conll in universe (#10617)
* update to spacy_conll

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-04-04 17:57:52 +02:00
Adriane Boyd
85778dfcf4
Add edit tree lemmatizer (#10231)
* Add edit tree lemmatizer

Co-authored-by: Daniël de Kok <me@danieldk.eu>

* Hide edit tree lemmatizer labels

* Use relative imports

* Switch to single quotes in error message

* Type annotation fixes

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Reformat edit_tree_lemmatizer with black

* EditTreeLemmatizer.predict: take Iterable

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Validate edit trees during deserialization

This change also changes the serialized representation. Rather than
mirroring the deep C structure, we use a simple flat union of the match
and substitution node types.

* Move edit_trees to _edit_tree_internals

* Fix invalid edit tree format error message

* edit_tree_lemmatizer: remove outdated TODO comment

* Rename factory name to trainable_lemmatizer

* Ignore type instead of casting truths to List[Union[Ints1d, Floats2d, List[int], List[str]]] for thinc v8.0.14

* Switch to Tagger.v2

* Add documentation for EditTreeLemmatizer

* docs: Fix 3.2 -> 3.3 somewhere

* trainable_lemmatizer documentation fixes

* docs: EditTreeLemmatizer is in edit_tree_lemmatizer.py

Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-03-28 11:13:50 +02:00
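The commit above describes the serialized edit trees as a flat union of match and substitution nodes that is validated on load. The sketch below shows what such a check could look like. The field names (`prefix_len`, `suffix_len`, `prefix_tree`, `suffix_tree`, `orig`, `subst`) and the function `validate_tree_nodes` are assumptions made for illustration and may not match spaCy's actual format.

```python
from typing import Any, Dict, List

# Assumed field names for the two node shapes (hypothetical).
MATCH_KEYS = {"prefix_len", "suffix_len", "prefix_tree", "suffix_tree"}
SUBST_KEYS = {"orig", "subst"}


def validate_tree_nodes(nodes: List[Dict[str, Any]]) -> None:
    """Raise ValueError unless every node is exactly one of the two shapes."""
    for i, node in enumerate(nodes):
        keys = set(node)
        if keys == MATCH_KEYS:
            # Lengths must be non-negative integers.
            for field in ("prefix_len", "suffix_len"):
                if not isinstance(node[field], int) or node[field] < 0:
                    raise ValueError(f"node {i}: invalid {field}")
            # Child references are None (no subtree) or an index into the list.
            for field in ("prefix_tree", "suffix_tree"):
                child = node[field]
                if child is None:
                    continue
                if not isinstance(child, int) or not 0 <= child < len(nodes):
                    raise ValueError(f"node {i}: invalid {field}")
        elif keys == SUBST_KEYS:
            # A substitution node replaces one string with another.
            if not all(isinstance(node[k], str) for k in SUBST_KEYS):
                raise ValueError(f"node {i}: orig and subst must be strings")
        else:
            raise ValueError(f"node {i}: neither a match nor a substitution node")
```

Storing nodes in a flat list with index references, rather than nesting them, keeps the serialized form simple to check node by node.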
David Berenstein
ed2ac34a8a
added Concise Concepts to spaCy universe (#10499)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

* updated information for Classy Classification

Made a more comprehensible and easy description for Classy Classification based on feedback of Philip Vollet to prepare for sharing.

* added note about examples

* corrected for wrong formatting changes

* Update website/meta/universe.json with small typo correction

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* resolved another typo

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* added Concise Concepts package to spaCy universe.

* updated example code Concise Concepts

* updated description for Concise Concepts

* updated PR with more visually appealing examples

SO to koaning for the suggestions.

* corrected small JSON typos in concise concepts

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-03-24 18:00:12 +01:00
Basile Dura
107bab56b5
docs: add EDS-NLP to spaCy universe (#10489)
* docs: add EDS-NLP to spaCy universe

* fix: remove "standalone" tag for EDS-NLP

Co-authored-by: Basile Dura <basile.dura-ext@aphp.fr>
2022-03-21 11:03:39 +01:00
Lj Miranda
0b02dc4c57
Fix mixed-up parameters for spacy-conll (#10516) 2022-03-18 08:56:21 +01:00
David Berenstein
e021dc6279
Updated explanation for classy classification (#10484)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

* updated information for Classy Classification

Made a more comprehensible and easy description for Classy Classification based on feedback of Philip Vollet to prepare for sharing.

* added note about examples

* corrected for wrong formatting changes

* Update website/meta/universe.json with small typo correction

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* resolved another typo

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-03-15 16:42:33 +01:00
vincent d warmerdam
610001e8c7
Update universe.json (#10490)
The project moved away from Rasa and into my personal GitHub account.
2022-03-15 11:12:04 +01:00
Adriane Boyd
b2bbefd0b5
Add Finnish, Korean, and Swedish models and Korean support notes (#10355)
* Add Finnish, Korean, and Swedish models to website

* Add Korean language support notes
2022-03-07 17:03:45 +01:00
David Berenstein
a6d5824e5f
added classy-classification package to spacy universe (#10393)
* Update universe.json

added classy-classification to Spacy universe

* Update universe.json

added classy-classification to the spacy universe resources

* Update universe.json

corrected a small typo in json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update universe.json

processed merge feedback

* Update universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-03-07 12:47:26 +01:00
Sam Edwardes
5f568f7e41
Updated spaCy universe for spacytextblob (#10335)
* Updated spacytextblob in universe.json

* Fixed json

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Added spacy_version tag to spacytextblob

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-02-24 14:18:10 +09:00
Paul O'Leary McCann
23bd103d89 Add tmtoolkit setup steps 2022-02-14 15:17:25 +09:00
Markus Konrad
8818a44a39
add tmtoolkit package to spaCy universe (#10245) 2022-02-14 15:16:43 +09:00
John Boy
10c77af83d
add textnets to spaCy universe (#10216)
https://github.com/jboynyc/textnets/issues/38
2022-02-09 15:04:26 +09:00
Ines Montani
7b883da9fd
Merge pull request #10239 from explosion/docs/spacy-tailored-pipelines [ci skip] 2022-02-08 18:04:01 +01:00
Ines Montani
f2c2b97e56 Add spaCy Tailored Pipelines 2022-02-08 11:46:42 +01:00
Kenneth Enevoldsen
e4625d2fc3
Added Augmenty to universe (#10229)
* Added Augmenty to universe

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-02-08 08:32:11 +01:00
Kenneth Enevoldsen
a2f27ff83a
Added spacy-wrap to universe (#10168)
* Added spacy-wrap to universe 

Added spacy-wrap to universe, a small package for wrapping fine-tuned Hugging Face transformers in a spaCy pipeline, following the same API as spacy-transformers. (Currently limited to classification models.)

* Update website/meta/universe.json

* Update website/meta/universe.json

* Update website/meta/universe.json

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-02-03 12:30:09 +01:00
Ines Montani
34ed93ef68
Support version tags in universe and add note about reporting (#10093)
* Support version tags in universe and add note about reporting

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-01-20 23:21:26 +01:00
Tuomo Hiippala
6a8619dd73
Update the entry for Applied Language Technology in spaCy Universe (#10068)
* add entry for Applied Language Technology under "Courses"

Added the following entry into `universe.json`:

```
        {
            "type": "education",
            "id": "applt-course",
            "title": "Applied Language Technology",
            "slogan": "NLP for newcomers using spaCy and Stanza",
            "description": "These learning materials provide an introduction to applied language technology for audiences who are unfamiliar with language technology and programming. The learning materials assume no previous knowledge of the Python programming language.",
            "url": "https://applied-language-technology.readthedocs.io/",
            "image": "https://www.mv.helsinki.fi/home/thiippal/images/applt-preview.jpg",
            "thumb": "https://applied-language-technology.readthedocs.io/en/latest/_static/logo.png",
            "author": "Tuomo Hiippala",
            "author_links": {
                "twitter": "tuomo_h",
                "github": "thiippal",
                "website": "https://www.mv.helsinki.fi/home/thiippal/"
            },
            "category": ["courses"]
        },
```

* Update the entry for "Applied Language Technology"
2022-01-17 08:28:51 +01:00
Ines Montani
a437ca6737 Update website to use new Algolia search API 2022-01-05 13:21:06 +01:00
Sam Edwardes
6f65e2b544
Added spacypdfreader to universe.json (#9963) 2022-01-03 16:34:36 +09:00
Paul O'Leary McCann
f40e237c5a
Remove denomme from universe (#9952)
Package seems to have been deleted.
2021-12-29 11:41:29 +01:00
Edward
018827e9fd Add healthsea to universe (#9838)
* Add healthsea to universe

* Update website/meta/universe.json

* Add thumbnail

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-15 17:57:19 +01:00
Tuomo Hiippala
5c44533263
add entry for Applied Language Technology under "Courses" (#9755)
Added the following entry into `universe.json`:

```
        {
            "type": "education",
            "id": "applt-course",
            "title": "Applied Language Technology",
            "slogan": "NLP for newcomers using spaCy and Stanza",
            "description": "These learning materials provide an introduction to applied language technology for audiences who are unfamiliar with language technology and programming. The learning materials assume no previous knowledge of the Python programming language.",
            "url": "https://applied-language-technology.readthedocs.io/",
            "image": "https://www.mv.helsinki.fi/home/thiippal/images/applt-preview.jpg",
            "thumb": "https://applied-language-technology.readthedocs.io/en/latest/_static/logo.png",
            "author": "Tuomo Hiippala",
            "author_links": {
                "twitter": "tuomo_h",
                "github": "thiippal",
                "website": "https://www.mv.helsinki.fi/home/thiippal/"
            },
            "category": ["courses"]
        },
```
2021-11-28 19:33:16 +09:00
Vishnu Nandakumar
86fa37e8ba
Update universe.json with new library eng_spacysentiment (#9679)
* Update universe.json

* Update universe.json

* Cleanup fields

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-11-16 14:06:19 +09:00
Adriane Boyd
216ed231a9 What's new in v3.2 (#9633)
* What's new in v3.2

* Fix formatting

* Fix typo

* Redo thanks

* Formatting

* Fix typo

* Fix project links

* Fix typo

* Minimal intro, floret python module

* Rephrase

* Rephrase, extend

* Rephrase

* Update links and formatting [ci skip]

* Minor correction

* Fix typo

Co-authored-by: Ines Montani <ines@ines.io>
2021-11-05 16:31:14 +01:00
Adriane Boyd
07dea324f6 Merge remote-tracking branch 'upstream/develop' into chore/switch-to-master-v3.2.0 2021-11-03 15:32:18 +01:00
xxyzz
90ec820f05
Add WordDumb to spaCy Universe (#9572)
* Add WordDumb to spaCy Universe

* Add standalone category

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-11-01 18:38:41 +09:00
Bruce W. Lee (이웅성)
a4dcb68cf6
Adding LingFeat Software to spaCy Universe. (#9574)
* add lingfeat in universe

* add lingfeat in universe

* Fix JSON

* Minor cleanup

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-11-01 18:38:14 +09:00
Adriane Boyd
2d430958e1 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-3 2021-10-29 12:18:15 +02:00
Philip Vollet
76173b0866
fixed typo and URL (#9560) 2021-10-29 13:57:44 +09:00
Adriane Boyd
a803af9dfa Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
Duygu Altinok
7b98aa4c16
Corrected broken (#9505) 2021-10-20 17:31:59 +02:00
Adriane Boyd
3f181b73d0
Add ja_core_news_trf to website (#9515) 2021-10-20 10:18:02 +02:00
Adriane Boyd
9b86209a4a
Update docs for spacy-transformers v1.1 data classes (#9361) 2021-10-18 14:16:58 +02:00
Edward
72711dc2c9
Update universe example codes (#9422)
* Update universe plugins

* Adjust azure trigger

* Add init to tests/universe

* deliberately trying to break the universe to see if the CI catches it

* revert

Co-authored-by: svlandeg <svlandeg@github.com>
2021-10-13 16:29:19 +02:00
Paul O'Leary McCann
78a88f7de7 Fix invalid json 2021-09-30 15:23:55 +09:00
Martin Vallone
a14ab7e882
Adding PhruzzMatcher to spaCy universe (#9321)
* Adding PhruzzMatcher to spaCy universe

* Fixes to make the package work properly
2021-09-30 13:46:53 +09:00
Philip Vollet
d2adfe1efa
Add projects to spaCy Universe (#9269)
* Added spaCy Universe projects

* Added user license agreement Philip Vollet

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-23 10:56:45 +02:00
Edward
8bda39f088
Update Hammurabi example code to v3 (#9218)
* Update Hammurabi example code

* Fix typo
2021-09-16 13:32:44 +02:00
Renat Shigapov
646f3a54db
added spaCyOpenTapioca (#9181)
* add spaCyOpenTapioca to universe

* add agreement

* fix misprint in tags
2021-09-11 13:16:51 +09:00
mylibrar
ee28aac68e
Update example code of forte (#9175)
Co-authored-by: Suqi Sun <suqi.sun@petuum.com>
2021-09-11 13:13:13 +09:00
Meenal Jhajharia
2613f0e98f
benepar usage example has deprecated imports 2021-08-28 16:35:58 +05:30
Ines Montani
f2b61b77a5 Fix universe.json [ci skip] 2021-08-20 11:26:29 +10:00
Baltazar
71e65fe943 added spacy api v3 docker 2021-08-19 21:29:25 +02:00
Lasse
839ea0f987 change tags formatting to match 2021-08-13 14:40:08 +02:00
Lasse
195e4e48c3 add textdescriptives to universe 2021-08-13 14:35:18 +02:00
Duygu Altinok
380b2817cf
updated unv json for new book 2021-08-09 12:39:22 +02:00
Ledenel
413f745c68 fix broken example in spaCy universe Chatterbot 2021-07-25 15:53:32 +00:00
Paul O'Leary McCann
d717593eb7
Merge pull request #8754 from KennethEnevoldsen/patch-1
[minor] removed outdated spacy version for spacymoji
2021-07-18 19:17:33 +09:00
Kenneth Enevoldsen
5d6aed0773
fixed GitHub link and thumbnail
Sorry, I seem to have misunderstood that the GitHub reference shouldn't be a link.
2021-07-18 10:22:00 +02:00
Ines Montani
313f55e560 Fix JSON [ci skip] 2021-07-18 13:21:33 +10:00
Ines Montani
51e5903d6f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip] 2021-07-18 13:18:42 +10:00
Kenneth Enevoldsen
8546948fba
removed outdated spacy version for spacymoji
From the documentation of spacymoji (and the requirements.txt) it seems like it is not only for version 2.
2021-07-17 15:19:43 +02:00
Kenneth Enevoldsen
a0e0ccdb46
Update website/meta/universe.json
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-17 07:14:46 +02:00
Mario Šaško
1ba2e8a646
Add TakeLab/spacy-udpipe to Universe (#8698)
* Add TakeLab/spacy-udpipe to universe

* Add SCA

* Sign SCA
2021-07-16 11:15:52 +02:00
thomashacker
aafb89df78 Update universe.json code_example 2021-07-13 10:22:49 +02:00
Kenneth Enevoldsen
94ce904e10
added missing comma 2021-07-13 09:59:34 +02:00
Kenneth Enevoldsen
a81fcc81b0
added dacy to universe 2021-07-13 09:54:08 +02:00
Adriane Boyd
1ee5bee29d
Add Macedonian models to website (#8637) 2021-07-08 09:32:14 +02:00
Paul O'Leary McCann
1d9209d43a
Merge pull request #8547 from mylibrar/update-universe
Add forte to universe.json
2021-07-08 14:59:49 +09:00
Ines Montani
04a9ade40f
Merge pull request #8466 from explosion/docs/new-in-v3-1 [ci skip] 2021-07-06 22:20:24 +10:00
Yoichiro Hasebe
596e04cbb4
Github repo info fixed for ruby-spacy 2021-07-04 18:55:17 +09:00
Yoichiro Hasebe
2bdfa42107
Update universe.json 2021-07-04 08:44:39 +09:00
Suqi Sun
3901507df8 Update pip 2021-06-30 16:44:43 -04:00
Suqi Sun
61c868ed75 Update pip and code example 2021-06-30 14:49:51 -04:00
Suqi Sun
4331c40b78 Add forte to universe.json 2021-06-29 16:17:22 -04:00
Nick Sorros
bb781ae7f7
Remove extra parenthesis from the example for spacy-streamlit (#8527) 2021-06-28 14:03:31 +02:00
Kevin
1a3e7cc5ef Updated PyATE syntax to fit spaCy V3 2021-06-26 17:52:41 -07:00
Matthew Honnibal
f9946154d9
Add SpanCategorizer component (#6747)
* Draft spancat model

* Add spancat model

* Add test for extract_spans

* Add extract_spans layer

* Upd extract_spans

* Add spancat model

* Add test for spancat model

* Upd spancat model

* Update spancat component

* Upd spancat

* Update spancat model

* Add quick spancat test

* Import SpanCategorizer

* Fix SpanCategorizer component

* Import SpanGroup

* Fix span extraction

* Fix import

* Fix import

* Upd model

* Update spancat models

* Add scoring, update defaults

* Update and add docs

* Fix type

* Update spacy/ml/extract_spans.py

* Auto-format and fix import

* Fix comment

* Fix type

* Fix type

* Update website/docs/api/spancategorizer.md

* Fix comment

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Better defense

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix labels list

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/ml/extract_spans.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/pipeline/spancat.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Set annotations during update

* Set annotations in spancat

* fix imports in test

* Update spacy/pipeline/spancat.py

* replace MaxoutLogistic with LinearLogistic

* fix config

* various small fixes

* remove set_annotations parameter in update

* use our beloved tupley format with recent support for doc.spans

* bugfix to allow renaming the default span_key (scores weren't showing up)

* use different key in docs example

* change defaults to better-working parameters from project (WIP)

* register spacy.extract_spans.v1 for legacy purposes

* Upd dev version so can build wheel

* layers instead of architectures for smaller building blocks

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Include additional scores from overrides in combined score weights

* Parameterize spans key in scoring

Parameterize the `SpanCategorizer` `spans_key` for scoring purposes so
that it's possible to evaluate multiple `spancat` components in the same
pipeline.

* Use the (intentionally very short) default spans key `sc` in the
  `SpanCategorizer`
* Adjust the default score weights to include the default key
* Adjust the scorer to use `spans_{spans_key}` as the prefix for the
  returned score
* Revert addition of `attr_name` argument to `score_spans` and adjust
  the key in the `getter` instead.

Note that for `spancat` components with a custom `spans_key`, the score
weights currently need to be modified manually in
`[training.score_weights]` for them to be available during training. To
suppress the default score weights `spans_sc_p/r/f` during training, set
them to `null` in `[training.score_weights]`.

* Update website/docs/api/scorer.md

* Fix scorer for spans key containing underscore

* Increment version

* Add Spans to Evaluate CLI (#8439)

* Add Spans to Evaluate CLI

* Change to spans_key

* Add spans per_type output

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Fix spancat GPU issues (#8455)

* Fix GPU issues

* Require thinc >=8.0.6

* Switch to glorot_uniform_init

* Fix and test ngram suggester

* Include final ngram in doc for all sizes
* Fix ngrams for docs of the same length as ngram size
* Handle batches of docs that result in no ngrams
* Add tests

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Nirant <NirantK@users.noreply.github.com>
2021-06-24 12:35:27 +02:00
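The ngram suggester fixes listed in the commit above (include the final ngram, handle docs exactly as long as the ngram size, handle batches that yield no ngrams) all come down to the loop bounds. A minimal sketch follows, assuming a hypothetical function `ngram_spans` that works on token counts rather than on `Doc` objects; it is not the spaCy implementation.

```python
from typing import List, Sequence, Tuple


def ngram_spans(
    doc_lengths: Sequence[int], sizes: Sequence[int]
) -> List[List[Tuple[int, int]]]:
    """For each doc (given by its token count), list (start, end) n-gram offsets."""
    batch: List[List[Tuple[int, int]]] = []
    for length in doc_lengths:
        spans: List[Tuple[int, int]] = []
        for size in sizes:
            # range(length - size + 1) reaches the final n-gram, yields exactly
            # one span when length == size, and is empty for shorter docs.
            for start in range(length - size + 1):
                spans.append((start, start + size))
        batch.append(spans)
    return batch
```

A doc that is too short for every requested size simply contributes an empty list, so a batch with no ngrams at all is still a well-formed result.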
Ines Montani
bc93c34f54 Add "New in v3.1" guide 2021-06-22 15:23:18 +10:00
Adriane Boyd
5646fcbe46 Merge remote-tracking branch 'upstream/develop' into chore/develop-into-master-v3.1 2021-06-15 15:05:17 +02:00
Adriane Boyd
507422149f
Various docs updates for v3.0 (#8353)
* Update cats score names in Scorer API docs

* Refer to performance in meta

* Update package naming/versions, lemmatizer details

* Minor formatting fixes

* Provide more explanation for cats_score_desc

* Provide language-specific lemmatizer defaults in API docs

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-06-14 12:19:36 +02:00
Adriane Boyd
63d748f80e
Add Catalan and Danish trf to website models (#8378) 2021-06-14 09:50:13 +02:00
Ines Montani
7f0f674a1b Fix universe.json and auto-format [ci skip] 2021-06-14 10:18:06 +10:00
Francisco Aranda
0a1a4c665d
update spacy-wordnet code example (#8327)
* update spacy-wordnet code example

- include spaCy 2.x and 3.x init alternatives
- upgrade recognai logo

* fix escape chars
2021-06-10 21:53:11 +02:00
Paul O'Leary McCann
5aba213349 Fix skweak Github URL
Github entry should not contain url, just user/repo
2021-05-31 18:00:43 +09:00
Kristian Boda
dc8d8d15d2
Add hmrb to spaCy Universe (#8129)
* docs: add hmrb to spacy universe

* docs: add sentence on spacy versions

* docs: update description and images

* misc: add spaCy Contributor Agreement
2021-05-31 18:40:48 +10:00
Julien Salinas
c496f78245 Add NLP Cloud to Universe. 2021-05-14 11:13:44 +02:00
Frederic R. Hopp
c5962b9fba
Update universe.json
fixed typo
2021-05-13 07:40:05 -07:00
Frederic R. Hopp
a9ca221e03
Update universe.json
Added more detailed description to eMFDscore project
2021-05-12 09:20:17 -07:00
Frederic R. Hopp
7bba9cdc14
Update universe.json 2021-05-11 19:18:19 -07:00
Jeno Pizarro
5cf76ab608
Update negspacy example code for spaCy 3.0 (#8022) 2021-05-07 09:33:21 +02:00
meghanabhange
debaab7021
Update details in universe denomme | Multilingual Name Detection (#7982)
* Add denomme

* spaCy contributor agreement

* Update install and thumb

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-05-05 17:12:13 +02:00
meghanabhange
49ff1126bf
Project Idea : denomme | Multilingual Name Detection (#7845)
* Add denomme

* spaCy contributor agreement

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-04-22 08:48:17 +02:00
Sam Edwardes
b8c6c10c6f
Added a logo to spaCyTextBlob (#7818)
* Added a logo to spaCyTextBlob

* Updated to better thumb
2021-04-22 08:41:55 +02:00
Diego Palma
bbade153ed
Add TRUNAJOD to spaCy universe. (#7754)
* Add TRUNAJOD to spaCy universe.

* Add trunajod logo and thumb.

Co-authored-by: Diego <dpalma@evernote.com>
2021-04-22 08:40:28 +02:00
Ines Montani
a9e5ae9b5c Auto-format [ci skip] 2021-04-22 10:58:05 +10:00
Pierre Lison
debfb46088 adding skweak to the spaCy universe 2021-04-22 00:58:09 +02:00
hudsonr
2722424ec5 Added universe entry for Coreferee 2021-04-19 14:28:06 +02:00
Jaidev Deshpande
93ee74a0a6
Add Numerizer to spaCy universe (#7650)
Numerizer is a spaCy extension that converts numbers written in natural language
into numeric strings.
2021-04-05 19:02:27 +02:00
Sam Edwardes
f6ad4684bd
Updates to universe.json for spaCyTextBlob (#7647)
* Updates to universe.json for spaCyTextBlob

Updated the documentation for spaCy 3.0.

* SamEdwardes.md

* Update SamEdwardes.md
2021-04-04 20:17:57 +02:00
vincent d warmerdam
8b3eec6e62
Add Tokenwiser to Projects (#7541)
* Add tokenwiser

* Update universe.json
2021-04-01 14:39:36 +02:00
Sofie Van Landeghem
59c2069eb1
Legacy docs (#7601)
* document legacy Tok2Vec architectures

* add TextCatEnsemble.v1 legacy documentation

* Separate legacy section in side bar
2021-03-30 12:43:14 +02:00
Paolo Arduin
00e59be966 Add SpikeX to spaCy universe 2021-03-16 18:22:03 +01:00
vincent d warmerdam
1b0d413e45
Removed Languages that were listed twice on Docs (#7272)
* removed languages that were listed twice

* sorted

* d0h

* the d0h strikes back when you dont hit save
2021-03-05 14:31:15 +01:00
Ines Montani
d2c515354b Auto-format [ci skip] 2021-02-24 22:37:32 +11:00
Ines Montani
9e8a7e08c1
Merge pull request #7115 from SergeyShk/ruts [ci skip] 2021-02-24 22:37:00 +11:00
Shkarin Sergey
22706ec9fb Fixed universe.json 2021-02-20 08:02:38 +03:00
Ines Montani
fc4fb6eb3a Make v2.x docs more prominent [ci skip] 2021-02-17 23:42:27 +11:00
Rajat
4e80ef3abb
updated code eg & description of contextualSpellCheck (#7096) 2021-02-17 13:26:43 +01:00
Shkarin Sergey
abac5dc203
Update universe.json 2021-02-15 15:01:46 +03:00
Ines Montani
4b729660bd
Merge pull request #7051 from MartinoMensio/dbpedia-spotlight [ci skip]
added spacy-dbpedia-spotlight
2021-02-14 14:06:08 +11:00
Ines Montani
06e66d4ced Update languages.json [ci skip] 2021-02-13 12:33:17 +11:00
Martino Mensio
6c0c3d5ddc
added spacy-dbpedia-spotlight 2021-02-12 19:11:35 +01:00
Ines Montani
6a683970ea Update Binder meta [ci skip] 2021-01-31 15:43:08 +11:00
Ines Montani
ae07416fda Merge branch 'website/v3-launch' into develop 2021-01-30 20:31:06 +11:00
Ines Montani
d3350afe45 Update docs and add support for legacy style 2021-01-30 17:43:12 +11:00
Ines Montani
230e651ad6 Merge branch 'develop' into master-tmp 2021-01-27 13:26:29 +11:00
muratjumashev
7d0154a36e Added language meta data 2021-01-25 00:42:19 +06:00
Adriane Boyd
7cd5c9e098 Add xx_sent_ud_sm model to website 2021-01-19 09:02:35 +01:00
Adriane Boyd
e8f6400923 Update languages for website
* Add Macedonian
* Add Russian dependencies
* Switch Chinese dependency to spacy-pkuseg
2021-01-18 14:09:34 +01:00
Ines Montani
09cacbb7ee Fix website [ci skip] 2021-01-18 11:37:04 +11:00
Adriane Boyd
0c936004d1 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3 2021-01-14 11:49:58 +01:00
Matthew Honnibal
f277bfdf0f
Add SpanGroup and Graph container types to represent arbitrary annotations (#6696)
* Draft out initial Spans data structure

* Initial span group commit

* Basic span group support on Doc

* Basic test for span group

* Compile span_group.pyx

* Draft addition of SpanGroup to DocBin

* Add deserialization for SpanGroup

* Add tests for serializing SpanGroup

* Fix serialization of SpanGroup

* Add EdgeC and GraphC structs

* Add draft Graph data structure

* Compile graph

* More work on Graph

* Update GraphC

* Upd graph

* Fix walk functions

* Let Graph take nodes and edges on construction

* Fix walking and getting

* Add graph tests

* Fix import

* Add module with the SpanGroups dict thingy

* Update test

* Rename 'span_groups' attribute

* Try to fix c++11 compilation

* Fix test

* Update DocBin

* Try to fix compilation

* Try to fix graph

* Improve SpanGroup docstrings

* Add doc.spans to documentation

* Fix serialization

* Tidy up and add docs

* Update docs [ci skip]

* Add SpanGroup.has_overlap

* WIP updated Graph API

* Start testing new Graph API

* Update Graph tests

* Update Graph

* Add docstring

Co-authored-by: Ines Montani <ines@ines.io>
2021-01-14 17:30:41 +11:00
Antonio Miras
b4bd8f347a
spaCy Universe: New project; SpacyDotNet (#6702)
* Universe: SpacyDotNet a .NET Core spaCy wrapper

* Signed contributor agreement

Co-authored-by: Antonio Miras <antonio@amiras.net>
2021-01-13 12:47:30 +11:00
Jeno Pizarro
a6fe35a0f9
Update universe.json 2020-12-15 21:53:20 -05:00
Jeno Pizarro
343a44abe9 Merge branch 'master' of https://github.com/explosion/spaCy 2020-12-15 21:49:46 -05:00
Ines Montani
85ca8c2bdd Merge branch 'master' into develop 2020-12-11 13:44:41 +11:00
Ines Montani
76cfd89dea Update site.json 2020-12-11 10:19:42 +11:00
Ines Montani
43a69eecb7 Update site.json 2020-12-11 10:05:21 +11:00
svlandeg
d156b423ae remove gitter and reddit links 2020-12-10 20:41:02 +01:00
Adriane Boyd
724831b066 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master
* Update Macedonian for v3
* Update Turkish for v3
2020-11-25 11:49:34 +01:00
Yusuke Mori
e3ac90b035
Avoid a SyntaxError in self-attentive-parser (#6428)
* Avoid a SyntaxError in self-attentive-parser

Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser

* Create forest1988.md

Fill in the spaCy contributor agreement
2020-11-22 21:59:37 +01:00
M. Revuelta Espinosa
51232ffb9e
Update universe.json (include PatternOmatic) (#6399)
Request to include PatternOmatic in spaCy Universe

Adds @revuel to contributors
2020-11-19 13:15:50 +01:00
Adriane Boyd
3cf6479467 Fix JSON in #6395 2020-11-17 15:25:41 +01:00
Sam Edwardes
78913a4f95
Added spaCyTextBlob to universe.json (#6395) 2020-11-17 14:38:34 +01:00
Alec Chapman
204c7c8a00 fix thumbnail link to be github raw url 2020-11-01 07:53:48 -07:00
Alec Chapman
73d22d96ff add medspacy to universe and fix example w/ cov-bsv 2020-10-29 07:53:56 -06:00
Adriane Boyd
8cc5ed6771 Add Macedonian to website languages 2020-10-29 08:49:56 +01:00