Commit Graph

312 Commits

Author SHA1 Message Date
Adriane Boyd
1ee5bee29d
Add Macedonian models to website (#8637) 2021-07-08 09:32:14 +02:00
Paul O'Leary McCann
1d9209d43a
Merge pull request #8547 from mylibrar/update-universe
Add forte to universe.json
2021-07-08 14:59:49 +09:00
Ines Montani
04a9ade40f
Merge pull request #8466 from explosion/docs/new-in-v3-1 [ci skip] 2021-07-06 22:20:24 +10:00
Yoichiro Hasebe
596e04cbb4
Github repo info fixed for ruby-spacy 2021-07-04 18:55:17 +09:00
Yoichiro Hasebe
2bdfa42107
Update universe.json 2021-07-04 08:44:39 +09:00
Suqi Sun
3901507df8 Update pip 2021-06-30 16:44:43 -04:00
Suqi Sun
61c868ed75 Update pip and code example 2021-06-30 14:49:51 -04:00
Suqi Sun
4331c40b78 Add forte to universe.json 2021-06-29 16:17:22 -04:00
Nick Sorros
bb781ae7f7
Remove extra parenthesis from the example for spacy-streamlit (#8527) 2021-06-28 14:03:31 +02:00
Kevin
1a3e7cc5ef Updated PyATE syntax to fit spaCy V3 2021-06-26 17:52:41 -07:00
Matthew Honnibal
f9946154d9
Add SpanCategorizer component (#6747)
* Draft spancat model

* Add spancat model

* Add test for extract_spans

* Add extract_spans layer

* Upd extract_spans

* Add spancat model

* Add test for spancat model

* Upd spancat model

* Update spancat component

* Upd spancat

* Update spancat model

* Add quick spancat test

* Import SpanCategorizer

* Fix SpanCategorizer component

* Import SpanGroup

* Fix span extraction

* Fix import

* Fix import

* Upd model

* Update spancat models

* Add scoring, update defaults

* Update and add docs

* Fix type

* Update spacy/ml/extract_spans.py

* Auto-format and fix import

* Fix comment

* Fix type

* Fix type

* Update website/docs/api/spancategorizer.md

* Fix comment

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Better defense

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix labels list

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/ml/extract_spans.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/pipeline/spancat.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Set annotations during update

* Set annotations in spancat

* fix imports in test

* Update spacy/pipeline/spancat.py

* replace MaxoutLogistic with LinearLogistic

* fix config

* various small fixes

* remove set_annotations parameter in update

* use our beloved tupley format with recent support for doc.spans

* bugfix to allow renaming the default span_key (scores weren't showing up)

* use different key in docs example

* change defaults to better-working parameters from project (WIP)

* register spacy.extract_spans.v1 for legacy purposes

* Upd dev version so can build wheel

* layers instead of architectures for smaller building blocks

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Include additional scores from overrides in combined score weights

* Parameterize spans key in scoring

Parameterize the `SpanCategorizer` `spans_key` for scoring purposes so
that it's possible to evaluate multiple `spancat` components in the same
pipeline.

* Use the (intentionally very short) default spans key `sc` in the
  `SpanCategorizer`
* Adjust the default score weights to include the default key
* Adjust the scorer to use `spans_{spans_key}` as the prefix for the
  returned score
* Revert addition of `attr_name` argument to `score_spans` and adjust
  the key in the `getter` instead.

Note that for `spancat` components with a custom `spans_key`, the score
weights currently need to be modified manually in
`[training.score_weights]` for them to be available during training. To
suppress the default score weights `spans_sc_p/r/f` during training, set
them to `null` in `[training.score_weights]`.

* Update website/docs/api/scorer.md

* Fix scorer for spans key containing underscore

* Increment version

* Add Spans to Evaluate CLI (#8439)

* Add Spans to Evaluate CLI

* Change to spans_key

* Add spans per_type output

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Fix spancat GPU issues (#8455)

* Fix GPU issues

* Require thinc >=8.0.6

* Switch to glorot_uniform_init

* Fix and test ngram suggester

* Include final ngram in doc for all sizes
* Fix ngrams for docs of the same length as ngram size
* Handle batches of docs that result in no ngrams
* Add tests

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Nirant <NirantK@users.noreply.github.com>
2021-06-24 12:35:27 +02:00
Ines Montani
bc93c34f54 Add "New in v3.1" guide 2021-06-22 15:23:18 +10:00
Adriane Boyd
5646fcbe46 Merge remote-tracking branch 'upstream/develop' into chore/develop-into-master-v3.1 2021-06-15 15:05:17 +02:00
Adriane Boyd
507422149f
Various docs updates for v3.0 (#8353)
* Update cats score names in Scorer API docs

* Refer to performance in meta

* Update package naming/versions, lemmatizer details

* Minor formatting fixes

* Provide more explanation for cats_score_desc

* Provide language-specific lemmatizer defaults in API docs

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-06-14 12:19:36 +02:00
Adriane Boyd
63d748f80e
Add Catalan and Danish trf to website models (#8378) 2021-06-14 09:50:13 +02:00
Ines Montani
7f0f674a1b Fix universe.json and auto-format [ci skip] 2021-06-14 10:18:06 +10:00
Francisco Aranda
0a1a4c665d
update spacy-wordnet code example (#8327)
* update spacy-wordnet code example

- include spaCy 2.x and 3.x init alternatives
- upgrade recognai logo

* fix escape chars
2021-06-10 21:53:11 +02:00
Paul O'Leary McCann
5aba213349 Fix skweak Github URL
Github entry should not contain url, just user/repo
2021-05-31 18:00:43 +09:00
Kristian Boda
dc8d8d15d2
Add hmrb to spaCy Universe (#8129)
* docs: add hmrb to spacy universe

* docs: add sentence on spacy versions

* docs: update description and images

* misc: add spaCy Contributor Agreement
2021-05-31 18:40:48 +10:00
Julien Salinas
c496f78245 Add NLP Cloud to Universe. 2021-05-14 11:13:44 +02:00
Frederic R. Hopp
c5962b9fba
Update universe.json
fixed typo
2021-05-13 07:40:05 -07:00
Frederic R. Hopp
a9ca221e03
Update universe.json
Added more detailed description to eMFDscore project
2021-05-12 09:20:17 -07:00
Frederic R. Hopp
7bba9cdc14
Update universe.json 2021-05-11 19:18:19 -07:00
Jeno Pizarro
5cf76ab608
Update negspacy example code for spaCy 3.0 (#8022) 2021-05-07 09:33:21 +02:00
meghanabhange
debaab7021
Update details in universe denomme | Multilingual Name Detection (#7982)
* Add denomme

* spaCy contributor agreement

* Update install and thumb

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-05-05 17:12:13 +02:00
meghanabhange
49ff1126bf
Project Idea : denomme | Multilingual Name Detection (#7845)
* Add denomme

* spaCy contributor agreement

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-04-22 08:48:17 +02:00
Sam Edwardes
b8c6c10c6f
Added a logo to spaCyTextBlob (#7818)
* Added a logo to spaCyTextBlob

* Updated to better thumb
2021-04-22 08:41:55 +02:00
Diego Palma
bbade153ed
Add TRUNAJOD to spaCy universe. (#7754)
* Add TRUNAJOD to spaCy universe.

* Add trunajod logo and thumb.

Co-authored-by: Diego <dpalma@evernote.com>
2021-04-22 08:40:28 +02:00
Ines Montani
a9e5ae9b5c Auto-format [ci skip] 2021-04-22 10:58:05 +10:00
Pierre Lison
debfb46088 adding skweak to the SpaCy universe 2021-04-22 00:58:09 +02:00
hudsonr
2722424ec5 Added universe entry for Coreferee 2021-04-19 14:28:06 +02:00
Jaidev Deshpande
93ee74a0a6
Add Numerizer to SpaCy universe (#7650)
Numerizer is a spaCy extension that converts numbers written in natural language
into numeric strings.
2021-04-05 19:02:27 +02:00
Sam Edwardes
f6ad4684bd
Updates to universe.json for spaCyTextBlob (#7647)
* Updates to universe.json for spaCyTextBlob

Updated the documentation for spaCy 3.0.

* SamEdwardes.md

* Update SamEdwardes.md
2021-04-04 20:17:57 +02:00
vincent d warmerdam
8b3eec6e62
Add Tokenwiser to Projects (#7541)
* Add tokenwiser

* Update universe.json
2021-04-01 14:39:36 +02:00
Sofie Van Landeghem
59c2069eb1
Legacy docs (#7601)
* document legacy Tok2Vec architectures

* add TextCatEnsemble.v1 legacy documentation

* Separate legacy section in side bar
2021-03-30 12:43:14 +02:00
Paolo Arduin
00e59be966 Add SpikeX to spaCy universe 2021-03-16 18:22:03 +01:00
vincent d warmerdam
1b0d413e45
Removed Languages that were listed twice on Docs (#7272)
* removed languages that were listed twice

* sorted

* d0h

* the d0h strikes back when you dont hit save
2021-03-05 14:31:15 +01:00
Ines Montani
d2c515354b Auto-format [ci skip] 2021-02-24 22:37:32 +11:00
Ines Montani
9e8a7e08c1
Merge pull request #7115 from SergeyShk/ruts [ci skip] 2021-02-24 22:37:00 +11:00
Shkarin Sergey
22706ec9fb Fixed universe.json 2021-02-20 08:02:38 +03:00
Ines Montani
fc4fb6eb3a Make v2.x docs more prominent [ci skip] 2021-02-17 23:42:27 +11:00
Rajat
4e80ef3abb
updated code eg & description of contextualSpellCheck (#7096) 2021-02-17 13:26:43 +01:00
Shkarin Sergey
abac5dc203
Update universe.json 2021-02-15 15:01:46 +03:00
Ines Montani
4b729660bd
Merge pull request #7051 from MartinoMensio/dbpedia-spotlight [ci skip]
added spacy-dbpedia-spotlight
2021-02-14 14:06:08 +11:00
Ines Montani
06e66d4ced Update languages.json [ci skip] 2021-02-13 12:33:17 +11:00
Martino Mensio
6c0c3d5ddc
added spacy-dbpedia-spotlight 2021-02-12 19:11:35 +01:00
Ines Montani
6a683970ea Update Binder meta [ci skip] 2021-01-31 15:43:08 +11:00
Ines Montani
ae07416fda Merge branch 'website/v3-launch' into develop 2021-01-30 20:31:06 +11:00
Ines Montani
d3350afe45 Update docs and add support for legacy style 2021-01-30 17:43:12 +11:00
Ines Montani
230e651ad6 Merge branch 'develop' into master-tmp 2021-01-27 13:26:29 +11:00
muratjumashev
7d0154a36e Added language meta data 2021-01-25 00:42:19 +06:00
Adriane Boyd
7cd5c9e098 Add xx_sent_ud_sm model to website 2021-01-19 09:02:35 +01:00
Adriane Boyd
e8f6400923 Update languages for website
* Add Macedonian
* Add Russian dependencies
* Switch Chinese dependency to spacy-pkuseg
2021-01-18 14:09:34 +01:00
Ines Montani
09cacbb7ee Fix website [ci skip] 2021-01-18 11:37:04 +11:00
Adriane Boyd
0c936004d1 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3 2021-01-14 11:49:58 +01:00
Matthew Honnibal
f277bfdf0f
Add SpanGroup and Graph container types to represent arbitrary annotations (#6696)
* Draft out initial Spans data structure

* Initial span group commit

* Basic span group support on Doc

* Basic test for span group

* Compile span_group.pyx

* Draft addition of SpanGroup to DocBin

* Add deserialization for SpanGroup

* Add tests for serializing SpanGroup

* Fix serialization of SpanGroup

* Add EdgeC and GraphC structs

* Add draft Graph data structure

* Compile graph

* More work on Graph

* Update GraphC

* Upd graph

* Fix walk functions

* Let Graph take nodes and edges on construction

* Fix walking and getting

* Add graph tests

* Fix import

* Add module with the SpanGroups dict thingy

* Update test

* Rename 'span_groups' attribute

* Try to fix c++11 compilation

* Fix test

* Update DocBin

* Try to fix compilation

* Try to fix graph

* Improve SpanGroup docstrings

* Add doc.spans to documentation

* Fix serialization

* Tidy up and add docs

* Update docs [ci skip]

* Add SpanGroup.has_overlap

* WIP updated Graph API

* Start testing new Graph API

* Update Graph tests

* Update Graph

* Add docstring

Co-authored-by: Ines Montani <ines@ines.io>
2021-01-14 17:30:41 +11:00
Antonio Miras
b4bd8f347a
spaCy Universe: New project; SpacyDotNet (#6702)
* Universe: SpacyDotNet a .NET Core spaCy wrapper

* Signed contributor agreement

Co-authored-by: Antonio Miras <antonio@amiras.net>
2021-01-13 12:47:30 +11:00
Jeno Pizarro
a6fe35a0f9
Update universe.json 2020-12-15 21:53:20 -05:00
Jeno Pizarro
343a44abe9 Merge branch 'master' of https://github.com/explosion/spaCy 2020-12-15 21:49:46 -05:00
Ines Montani
85ca8c2bdd Merge branch 'master' into develop 2020-12-11 13:44:41 +11:00
Ines Montani
76cfd89dea Update site.json 2020-12-11 10:19:42 +11:00
Ines Montani
43a69eecb7 Update site.json 2020-12-11 10:05:21 +11:00
svlandeg
d156b423ae remove gitter and reddit links 2020-12-10 20:41:02 +01:00
Adriane Boyd
724831b066 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master
* Update Macedonian for v3
* Update Turkish for v3
2020-11-25 11:49:34 +01:00
Yusuke Mori
e3ac90b035
Avoid a SyntaxError in self-attentive-parser (#6428)
* Avoid a SyntaxError in self-attentive-parser

Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser

* Create forest1988.md

Fill in the spaCy contributor agreement
2020-11-22 21:59:37 +01:00
M. Revuelta Espinosa
51232ffb9e
Update universe.json (include PatternOmatic) (#6399)
Request to include PatternOmatic in spaCy Universe

Adds @revuel to contributors
2020-11-19 13:15:50 +01:00
Adriane Boyd
3cf6479467 Fix JSON in #6395 2020-11-17 15:25:41 +01:00
Sam Edwardes
78913a4f95
Added spaCyTextBlob to universe.json (#6395) 2020-11-17 14:38:34 +01:00
Alec Chapman
204c7c8a00 fix thumbnail link to be github raw url 2020-11-01 07:53:48 -07:00
Alec Chapman
73d22d96ff add medspacy to universe and fix example w/ cov-bsv 2020-10-29 07:53:56 -06:00
Adriane Boyd
8cc5ed6771 Add Macedonian to website languages 2020-10-29 08:49:56 +01:00
Adriane Boyd
4dd86306e9
Add Nepali to supported languages on website (#6315) 2020-10-28 16:32:07 +01:00
Kunal Sharma
01aec7a313
Adding MindMeld to Universe JSON (#6275)
* Adding Mindmeld to Universe JSON

Mindmeld is a conversational AI platform for deep-domain voice interfaces and chatbots. https://www.mindmeld.com/

* Signing contribution agreement.

Co-authored-by: kunshar2 <kunshar2@cisco.com>
2020-10-21 18:42:11 +02:00
Adriane Boyd
e896803792 Add and update website license links 2020-10-16 17:01:52 +02:00
Ines Montani
050aa1e0e2 Update languages.json [ci skip] 2020-10-14 20:51:50 +02:00
Ines Montani
a966c271f7 Update models docs [ci skip] 2020-10-14 20:50:23 +02:00
Ines Montani
7c52def5da
Merge pull request #6227 from adrianeboyd/chore/update-3.0.0a36-from-master 2020-10-09 10:49:20 +02:00
Ines Montani
329b61ee7b Update docs [ci skip] 2020-10-09 10:36:06 +02:00
Šarūnas Navickas
287ba94a2f Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-09 10:14:40 +02:00
Ines Montani
741796e500 Update docs [ci skip] 2020-10-08 14:31:34 +02:00
Šarūnas Navickas
047fb9f8b8
Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-06 11:19:36 +02:00
Ines Montani
01c1538c72 Integrate file readers 2020-10-02 01:36:06 +02:00
Ines Montani
0a8a124a6e Update docs [ci skip] 2020-10-01 12:15:53 +02:00
Ines Montani
e06ff8b71d Update docs [ci skip] 2020-09-26 13:18:08 +02:00
Ines Montani
d8f661c910 Update docs [ci skip] 2020-09-23 09:30:26 +02:00
Adriane Boyd
e05d6d358d Update API sidebar MorphAnalysis link 2020-09-22 09:36:37 +02:00
Adriane Boyd
fc9c78da25 Add MorphAnalysis to API sidebar 2020-09-22 09:23:47 +02:00
Adriane Boyd
9b8d0b7f90 Alphabetize API sidebars 2020-09-21 13:46:21 +02:00
Ines Montani
012b3a7096 Update docs [ci skip] 2020-09-20 17:44:58 +02:00
Ines Montani
47acb45850 Update docs [ci skip] 2020-09-13 22:30:33 +02:00
Ines Montani
8b0dabe987 Update docs [ci skip] 2020-09-12 17:05:10 +02:00
Ines Montani
2e567a47c2 Update docs and formatting 2020-09-09 21:26:10 +02:00
Ines Montani
f06eed800e
Merge pull request #6029 from explosion/master-tmp 2020-09-04 15:11:55 +02:00
Ines Montani
ba6cf9821f Replace docs analytics [ci skip] 2020-09-04 14:28:28 +02:00
Ines Montani
afdf14c717 Remove Google Analytics [ci skip] 2020-09-04 14:21:41 +02:00
Ines Montani
864a697e63 Merge branch 'develop' into master-tmp 2020-09-04 13:15:36 +02:00
Brad Jascob
2160aafec6
Updates spaCy Universe for amrlib (#6020)
* Updates spaCy Universe for amrlib

* Updates to doc based on feedback
2020-09-04 10:03:35 +02:00
Ines Montani
b5a0657fd6 "model" terminology consistency in docs 2020-09-03 13:13:03 +02:00
Ines Montani
2cc4640385 Update docs [ci skip] 2020-08-21 16:21:55 +02:00
Ines Montani
74cb6d39d0 Update docs [ci skip] 2020-08-21 16:11:38 +02:00