Commit Graph

407 Commits

Author SHA1 Message Date
Suqi Sun
3901507df8 Update pip 2021-06-30 16:44:43 -04:00
Suqi Sun
61c868ed75 Update pip and code example 2021-06-30 14:49:51 -04:00
Suqi Sun
4331c40b78 Add forte to universe.json 2021-06-29 16:17:22 -04:00
Nick Sorros
bb781ae7f7
Remove extra parenthesis from the example for spacy-streamlit (#8527) 2021-06-28 14:03:31 +02:00
Kevin
1a3e7cc5ef Updated PyATE syntax to fit spaCy V3 2021-06-26 17:52:41 -07:00
Matthew Honnibal
f9946154d9
Add SpanCategorizer component (#6747)
* Draft spancat model

* Add spancat model

* Add test for extract_spans

* Add extract_spans layer

* Upd extract_spans

* Add spancat model

* Add test for spancat model

* Upd spancat model

* Update spancat component

* Upd spancat

* Update spancat model

* Add quick spancat test

* Import SpanCategorizer

* Fix SpanCategorizer component

* Import SpanGroup

* Fix span extraction

* Fix import

* Fix import

* Upd model

* Update spancat models

* Add scoring, update defaults

* Update and add docs

* Fix type

* Update spacy/ml/extract_spans.py

* Auto-format and fix import

* Fix comment

* Fix type

* Fix type

* Update website/docs/api/spancategorizer.md

* Fix comment

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Better defense

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix labels list

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/ml/extract_spans.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/pipeline/spancat.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Set annotations during update

* Set annotations in spancat

* fix imports in test

* Update spacy/pipeline/spancat.py

* replace MaxoutLogistic with LinearLogistic

* fix config

* various small fixes

* remove set_annotations parameter in update

* use our beloved tupley format with recent support for doc.spans

* bugfix to allow renaming the default span_key (scores weren't showing up)

* use different key in docs example

* change defaults to better-working parameters from project (WIP)

* register spacy.extract_spans.v1 for legacy purposes

* Upd dev version so can build wheel

* layers instead of architectures for smaller building blocks

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Include additional scores from overrides in combined score weights

* Parameterize spans key in scoring

Parameterize the `SpanCategorizer` `spans_key` for scoring purposes so
that it's possible to evaluate multiple `spancat` components in the same
pipeline.

* Use the (intentionally very short) default spans key `sc` in the
  `SpanCategorizer`
* Adjust the default score weights to include the default key
* Adjust the scorer to use `spans_{spans_key}` as the prefix for the
  returned score
* Revert addition of `attr_name` argument to `score_spans` and adjust
  the key in the `getter` instead.

Note that for `spancat` components with a custom `span_key`, the score
weights currently need to be modified manually in
`[training.score_weights]` for them to be available during training. To
suppress the default score weights `spans_sc_p/r/f` during training, set
them to `null` in `[training.score_weights]`.

* Update website/docs/api/scorer.md

* Fix scorer for spans key containing underscore

* Increment version

* Add Spans to Evaluate CLI (#8439)

* Add Spans to Evaluate CLI

* Change to spans_key

* Add spans per_type output

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Fix spancat GPU issues (#8455)

* Fix GPU issues

* Require thinc >=8.0.6

* Switch to glorot_uniform_init

* Fix and test ngram suggester

* Include final ngram in doc for all sizes
* Fix ngrams for docs of the same length as ngram size
* Handle batches of docs that result in no ngrams
* Add tests

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Nirant <NirantK@users.noreply.github.com>
2021-06-24 12:35:27 +02:00
Ines Montani
bc93c34f54 Add "New in v3.1" guide 2021-06-22 15:23:18 +10:00
Adriane Boyd
5646fcbe46 Merge remote-tracking branch 'upstream/develop' into chore/develop-into-master-v3.1 2021-06-15 15:05:17 +02:00
Adriane Boyd
507422149f
Various docs updates for v3.0 (#8353)
* Update cats score names in Scorer API docs

* Refer to performance in meta

* Update package naming/versions, lemmatizer details

* Minor formatting fixes

* Provide more explanation for cats_score_desc

* Provide language-specific lemmatizer defaults in API docs

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-06-14 12:19:36 +02:00
Adriane Boyd
63d748f80e
Add Catalan and Danish trf to website models (#8378) 2021-06-14 09:50:13 +02:00
Ines Montani
7f0f674a1b Fix universe.json and auto-format [ci skip] 2021-06-14 10:18:06 +10:00
Francisco Aranda
0a1a4c665d
update spacy-wordnet code example (#8327)
* update spacy-wordnet code example

- include spaCy 2.x and 3.x init alternatives
- upgrade recognai logo

* fix escape chars
2021-06-10 21:53:11 +02:00
Paul O'Leary McCann
5aba213349 Fix skweak Github URL
Github entry should not contain url, just user/repo
2021-05-31 18:00:43 +09:00
Kristian Boda
dc8d8d15d2
Add hmrb to spaCy Universe (#8129)
* docs: add hmrb to spacy universe

* docs: add sentence on spacy versions

* docs: update description and images

* misc: add spaCy Contributor Agreement
2021-05-31 18:40:48 +10:00
Julien Salinas
c496f78245 Add NLP Cloud to Universe. 2021-05-14 11:13:44 +02:00
Frederic R. Hopp
c5962b9fba
Update universe.json
fixed typo
2021-05-13 07:40:05 -07:00
Frederic R. Hopp
a9ca221e03
Update universe.json
Added more detailed description to eMFDscore project
2021-05-12 09:20:17 -07:00
Frederic R. Hopp
7bba9cdc14
Update universe.json 2021-05-11 19:18:19 -07:00
Jeno Pizarro
5cf76ab608
Update negspacy example code for spaCy 3.0 (#8022) 2021-05-07 09:33:21 +02:00
meghanabhange
debaab7021
Update details in universe denomme | Multilingual Name Detection (#7982)
* Add denomme

* spaCy contributor agreement

* Update install and thumb

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-05-05 17:12:13 +02:00
meghanabhange
49ff1126bf
Project Idea : denomme | Multilingual Name Detection (#7845)
* Add denomme

* spaCy contributor agreement

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-04-22 08:48:17 +02:00
Sam Edwardes
b8c6c10c6f
Added a logo to spaCyTextBlob (#7818)
* Added a logo to spaCyTextBlob

* Updated to better thumb
2021-04-22 08:41:55 +02:00
Diego Palma
bbade153ed
Add TRUNAJOD to spaCy universe. (#7754)
* Add TRUNAJOD to spaCy universe.

* Add trunajod logo and thumb.

Co-authored-by: Diego <dpalma@evernote.com>
2021-04-22 08:40:28 +02:00
Ines Montani
a9e5ae9b5c Auto-format [ci skip] 2021-04-22 10:58:05 +10:00
Pierre Lison
debfb46088 adding skweak to the SpaCy universe 2021-04-22 00:58:09 +02:00
hudsonr
2722424ec5 Added universe entry for Coreferee 2021-04-19 14:28:06 +02:00
Jaidev Deshpande
93ee74a0a6
Add Numerizer to SpaCy universe (#7650)
Numerizer is a spaCy extension that converts numbers written in natural language
into numeric strings.
2021-04-05 19:02:27 +02:00
Sam Edwardes
f6ad4684bd
Updates to universe.json for spaCyTextBlob (#7647)
* Updates to universe.json for spaCyTextBlob

Updated the documentation for spaCy 3.0.

* SamEdwardes.md

* Update SamEdwardes.md
2021-04-04 20:17:57 +02:00
vincent d warmerdam
8b3eec6e62
Add Tokenwiser to Projects (#7541)
* Add tokenwiser

* Update universe.json
2021-04-01 14:39:36 +02:00
Sofie Van Landeghem
59c2069eb1
Legacy docs (#7601)
* document legacy Tok2Vec architectures

* add TextCatEnsemble.v1 legacy documentation

* Separate legacy section in side bar
2021-03-30 12:43:14 +02:00
Paolo Arduin
00e59be966 Add SpikeX to spaCy universe 2021-03-16 18:22:03 +01:00
vincent d warmerdam
1b0d413e45
Removed Languages that were listed twice on Docs (#7272)
* removed languages that were listed twice

* sorted

* d0h

* the d0h strikes back when you dont hit save
2021-03-05 14:31:15 +01:00
Ines Montani
d2c515354b Auto-format [ci skip] 2021-02-24 22:37:32 +11:00
Ines Montani
9e8a7e08c1
Merge pull request #7115 from SergeyShk/ruts [ci skip] 2021-02-24 22:37:00 +11:00
Shkarin Sergey
22706ec9fb Fixed universe.json 2021-02-20 08:02:38 +03:00
Ines Montani
fc4fb6eb3a Make v2.x docs more prominent [ci skip] 2021-02-17 23:42:27 +11:00
Rajat
4e80ef3abb
updated code eg & description of contextualSpellCheck (#7096) 2021-02-17 13:26:43 +01:00
Shkarin Sergey
abac5dc203
Update universe.json 2021-02-15 15:01:46 +03:00
Ines Montani
4b729660bd
Merge pull request #7051 from MartinoMensio/dbpedia-spotlight [ci skip]
added spacy-dbpedia-spotlight
2021-02-14 14:06:08 +11:00
Ines Montani
06e66d4ced Update languages.json [ci skip] 2021-02-13 12:33:17 +11:00
Martino Mensio
6c0c3d5ddc
added spacy-dbpedia-spotlight 2021-02-12 19:11:35 +01:00
Ines Montani
6a683970ea Update Binder meta [ci skip] 2021-01-31 15:43:08 +11:00
Ines Montani
ae07416fda Merge branch 'website/v3-launch' into develop 2021-01-30 20:31:06 +11:00
Ines Montani
d3350afe45 Update docs and add support for legacy style 2021-01-30 17:43:12 +11:00
Ines Montani
230e651ad6 Merge branch 'develop' into master-tmp 2021-01-27 13:26:29 +11:00
muratjumashev
7d0154a36e Added language meta data 2021-01-25 00:42:19 +06:00
Adriane Boyd
7cd5c9e098 Add xx_sent_ud_sm model to website 2021-01-19 09:02:35 +01:00
Adriane Boyd
e8f6400923 Update languages for website
* Add Macedonian
* Add Russian dependencies
* Switch Chinese dependency to spacy-pkuseg
2021-01-18 14:09:34 +01:00
Ines Montani
09cacbb7ee Fix website [ci skip] 2021-01-18 11:37:04 +11:00
Adriane Boyd
0c936004d1 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3 2021-01-14 11:49:58 +01:00
Matthew Honnibal
f277bfdf0f
Add SpanGroup and Graph container types to represent arbitrary annotations (#6696)
* Draft out initial Spans data structure

* Initial span group commit

* Basic span group support on Doc

* Basic test for span group

* Compile span_group.pyx

* Draft addition of SpanGroup to DocBin

* Add deserialization for SpanGroup

* Add tests for serializing SpanGroup

* Fix serialization of SpanGroup

* Add EdgeC and GraphC structs

* Add draft Graph data structure

* Compile graph

* More work on Graph

* Update GraphC

* Upd graph

* Fix walk functions

* Let Graph take nodes and edges on construction

* Fix walking and getting

* Add graph tests

* Fix import

* Add module with the SpanGroups dict thingy

* Update test

* Rename 'span_groups' attribute

* Try to fix c++11 compilation

* Fix test

* Update DocBin

* Try to fix compilation

* Try to fix graph

* Improve SpanGroup docstrings

* Add doc.spans to documentation

* Fix serialization

* Tidy up and add docs

* Update docs [ci skip]

* Add SpanGroup.has_overlap

* WIP updated Graph API

* Start testing new Graph API

* Update Graph tests

* Update Graph

* Add docstring

Co-authored-by: Ines Montani <ines@ines.io>
2021-01-14 17:30:41 +11:00
Antonio Miras
b4bd8f347a
spaCy Universe: New project; SpacyDotNet (#6702)
* Universe: SpacyDotNet a .NET Core spaCy wrapper

* Signed contributor agreement

Co-authored-by: Antonio Miras <antonio@amiras.net>
2021-01-13 12:47:30 +11:00
Jeno Pizarro
a6fe35a0f9
Update universe.json 2020-12-15 21:53:20 -05:00
Jeno Pizarro
343a44abe9 Merge branch 'master' of https://github.com/explosion/spaCy 2020-12-15 21:49:46 -05:00
Ines Montani
85ca8c2bdd Merge branch 'master' into develop 2020-12-11 13:44:41 +11:00
Ines Montani
76cfd89dea Update site.json 2020-12-11 10:19:42 +11:00
Ines Montani
43a69eecb7 Update site.json 2020-12-11 10:05:21 +11:00
svlandeg
d156b423ae remove gitter and reddit links 2020-12-10 20:41:02 +01:00
Adriane Boyd
724831b066 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master
* Update Macedonian for v3
* Update Turkish for v3
2020-11-25 11:49:34 +01:00
Yusuke Mori
e3ac90b035
Avoid a SyntaxError in self-attentive-parser (#6428)
* Avoid a SyntaxError in self-attentive-parser

Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser

* Create forest1988.md

Fill in the spaCy contributor agreement
2020-11-22 21:59:37 +01:00
M. Revuelta Espinosa
51232ffb9e
Update universe.json (include PatternOmatic) (#6399)
Request to include PatternOmatic in spaCy Universe

Adds @revuel to contributors
2020-11-19 13:15:50 +01:00
Adriane Boyd
3cf6479467 Fix JSON in #6395 2020-11-17 15:25:41 +01:00
Sam Edwardes
78913a4f95
Added spaCyTextBlob to universe.json (#6395) 2020-11-17 14:38:34 +01:00
Alec Chapman
204c7c8a00 fix thumbnail link to be github raw url 2020-11-01 07:53:48 -07:00
Alec Chapman
73d22d96ff add medspacy to universe and fix example w/ cov-bsv 2020-10-29 07:53:56 -06:00
Adriane Boyd
8cc5ed6771 Add Macedonian to website languages 2020-10-29 08:49:56 +01:00
Adriane Boyd
4dd86306e9
Add Nepali to supported languages on website (#6315) 2020-10-28 16:32:07 +01:00
Kunal Sharma
01aec7a313
Adding MindMeld to Universe JSON (#6275)
* Adding Mindmeld to Universe JSON

Mindmeld is a conversational AI platform for deep-domain voice interfaces and chatbots. https://www.mindmeld.com/

* Signing contribution agreement.

Co-authored-by: kunshar2 <kunshar2@cisco.com>
2020-10-21 18:42:11 +02:00
Adriane Boyd
e896803792 Add and update website license links 2020-10-16 17:01:52 +02:00
Ines Montani
050aa1e0e2 Update languages.json [ci skip] 2020-10-14 20:51:50 +02:00
Ines Montani
a966c271f7 Update models docs [ci skip] 2020-10-14 20:50:23 +02:00
Ines Montani
7c52def5da
Merge pull request #6227 from adrianeboyd/chore/update-3.0.0a36-from-master 2020-10-09 10:49:20 +02:00
Ines Montani
329b61ee7b Update docs [ci skip] 2020-10-09 10:36:06 +02:00
Šarūnas Navickas
287ba94a2f Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-09 10:14:40 +02:00
Ines Montani
741796e500 Update docs [ci skip] 2020-10-08 14:31:34 +02:00
Šarūnas Navickas
047fb9f8b8
Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-06 11:19:36 +02:00
Ines Montani
01c1538c72 Integrate file readers 2020-10-02 01:36:06 +02:00
Ines Montani
0a8a124a6e Update docs [ci skip] 2020-10-01 12:15:53 +02:00
Ines Montani
e06ff8b71d Update docs [ci skip] 2020-09-26 13:18:08 +02:00
Ines Montani
d8f661c910 Update docs [ci skip] 2020-09-23 09:30:26 +02:00
Adriane Boyd
e05d6d358d Update API sidebar MorphAnalysis link 2020-09-22 09:36:37 +02:00
Adriane Boyd
fc9c78da25 Add MorphAnalysis to API sidebar 2020-09-22 09:23:47 +02:00
Adriane Boyd
9b8d0b7f90 Alphabetize API sidebars 2020-09-21 13:46:21 +02:00
Ines Montani
012b3a7096 Update docs [ci skip] 2020-09-20 17:44:58 +02:00
Ines Montani
47acb45850 Update docs [ci skip] 2020-09-13 22:30:33 +02:00
Ines Montani
8b0dabe987 Update docs [ci skip] 2020-09-12 17:05:10 +02:00
Ines Montani
2e567a47c2 Update docs and formatting 2020-09-09 21:26:10 +02:00
Ines Montani
f06eed800e
Merge pull request #6029 from explosion/master-tmp 2020-09-04 15:11:55 +02:00
Ines Montani
ba6cf9821f Replace docs analytics [ci skip] 2020-09-04 14:28:28 +02:00
Ines Montani
afdf14c717 Remove Google Analytics [ci skip] 2020-09-04 14:21:41 +02:00
Ines Montani
864a697e63 Merge branch 'develop' into master-tmp 2020-09-04 13:15:36 +02:00
Brad Jascob
2160aafec6
Updates spaCy Universe for amrlib (#6020)
* Updates spaCy Universe for amrlib

* Updates to doc based on feedback
2020-09-04 10:03:35 +02:00
Ines Montani
b5a0657fd6 "model" terminology consistency in docs 2020-09-03 13:13:03 +02:00
Ines Montani
2cc4640385 Update docs [ci skip] 2020-08-21 16:21:55 +02:00
Ines Montani
74cb6d39d0 Update docs [ci skip] 2020-08-21 16:11:38 +02:00
Ines Montani
82f0e20318 Update docs and consistency [ci skip] 2020-08-18 14:39:40 +02:00
Ines Montani
728fec0194 Update docs [ci skip] 2020-08-18 00:49:19 +02:00
Ines Montani
3ae5e02f4f Update docs, types and API consistency 2020-08-17 16:45:24 +02:00
Adriane Boyd
b8d0c23857 Add AttributeRuler API docs
With additional minor updates to AttributeRuler docstrings.
2020-08-07 12:43:23 +02:00
Bram Vanroy
9e45d064bb
Update universe details spacy_conll (#5871) 2020-08-05 14:34:12 +02:00
Ines Montani
e0ffe36e79 Update docstrings, docs and types 2020-07-29 11:36:42 +02:00
Ines Montani
d8b519c23c API docs, docstrings and argument consistency 2020-07-27 18:11:45 +02:00
Adriane Boyd
2880d8a555
Normalize spelling for spaCy (#5822) 2020-07-27 10:09:33 +02:00
Martino Mensio
2f6b8132ef
Sentence transformers added to spaCy universe (#5814)
* fix details for spacy-universal-sentence-encoder

* added sentence-transformers
2020-07-27 09:44:33 +02:00
Nipun Sadvilkar
a66ad89fcb
✏️ typo in pysbd code example (#5821) 2020-07-27 09:43:39 +02:00
Ines Montani
7adbaf9a5b Update docs [ci skip] 2020-07-27 00:29:45 +02:00
Adriane Boyd
41525901ef Move MorphAnalysis to Other section 2020-07-23 08:58:22 +02:00
Adriane Boyd
14df00ae98 Add Morphology and MorphAnalsysis API docs
Add initial draft of `Morphology` and `MorphAnalysis` API docs.
2020-07-21 10:33:46 +02:00
Ines Montani
644074b954 Merge branch 'develop' into master-tmp 2020-07-20 14:58:04 +02:00
Alec Chapman
a8978ca285
Add VA COVID-19 NLP project to spaCy Universe (#5777)
* Update universe.json

Add cov-bsv to "resources"

* Update universe.json

* add contributor agreement
2020-07-19 13:35:31 +02:00
Ines Montani
68fade8f76 Add Plausible [ci skip] 2020-07-19 00:02:29 +02:00
Ines Montani
6f4e4aceb3 Add Plausible [ci skip] 2020-07-18 23:50:29 +02:00
Ines Montani
9ae4040183 Update API docs 2020-07-08 13:34:35 +02:00
gandersen101
893133873d Fix quote issue in spaczz universe.json 2020-07-07 19:16:28 -05:00
Ines Montani
109849bd31 Fix and update universe.json [ci skip] 2020-07-07 21:12:28 +02:00
gandersen101
9097549227
Adding spaczz package to universe.json (#5717)
* Adding spaczz package to universe.json

* Adding contributor agreement.
2020-07-07 20:55:24 +02:00
Jonathan Besomi
546f3d10d4
Add texthero to universe.json (#5716)
* Add texthero to universe.json

* Add spaCy contributor Agreement
2020-07-07 20:54:22 +02:00
Ines Montani
a35236e5f0 Update v3 docs WIP [ci skip] 2020-07-06 15:57:44 +02:00
Ines Montani
63247cbe87 Update v3 docs [ci skip] 2020-07-05 16:11:16 +02:00
Ines Montani
1e0d54edd1 Update docs 2020-07-04 14:23:10 +02:00
Ines Montani
fe224dc2dd Merge branch 'develop' into nightly.spacy.io 2020-07-03 16:48:27 +02:00
Ines Montani
06f1ecb308 Update v3 docs 2020-07-03 16:48:21 +02:00
Ines Montani
cdf9ee1716 Add stub for Example API docs [ci skip] 2020-07-03 15:46:10 +02:00
Ines Montani
3dff412f58 Merge branch 'nightly.spacy.io' into develop [ci skip] 2020-07-01 21:33:47 +02:00
Ines Montani
fe4cfd0632 Start updating website for v3 [ci skip] 2020-07-01 21:26:39 +02:00
Ines Montani
997f6eeca7 Adjust nightly site url [ci skip] 2020-07-01 14:42:59 +02:00
Ines Montani
dc6d9c2fac Auto-infer nightly state from branch 2020-07-01 14:05:11 +02:00
Ines Montani
5e24b8d481 Set to nightly 2020-07-01 13:02:30 +02:00
Ines Montani
26df4efa94 Add new in v3.0 2020-07-01 13:02:17 +02:00
Ines Montani
414dc7ace1 Merge branch 'spacy.io' into spacy.io-develop 2020-07-01 11:47:47 +02:00
Ines Montani
52728d8fa3 Merge branch 'develop' into master-tmp 2020-06-20 15:52:00 +02:00
Ines Montani
19b9ea0436 Fix languages.json 2020-06-16 18:34:11 +02:00
Ines Montani
41003a5117 Update Binder version [ci skip] 2020-06-16 17:41:23 +02:00
Ines Montani
fd89f44c0c Update Binder URL [ci skip] 2020-06-16 17:34:26 +02:00
Adriane Boyd
d5110ffbf2
Documentation updates for v2.3.0 (#5593)
* Update website models for v2.3.0

* Add docs for Chinese word segmentation

* Tighten up Chinese docs section

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Merge branch 'master' into docs/v2.3.0 [ci skip]

* Auto-format and update version

* Update matcher.md

* Update languages and sorting

* Typo in landing page

* Infobox about token_match behavior

* Add meta and basic docs for Japanese

* POS -> TAG in models table

* Add info about lookups for normalization

* Updates to API docs for v2.3

* Update adding norm exceptions for adding languages

* Add --omit-extra-lookups to CLI API docs

* Add initial draft of "What's New in v2.3"

* Add new in v2.3 tags to Chinese and Japanese sections

* Add tokenizer to migration section

* Add new in v2.3 flags to init-model

* Typo

* More what's new in v2.3

Co-authored-by: Ines Montani <ines@ines.io>
2020-06-16 15:37:35 +02:00
Martino Mensio
de00f967ce
adding spacy-universal-sentence-encoder (#5534)
* adding spacy-universal-sentence-encoder

* update affiliation

* updated code example
2020-06-08 20:26:30 +02:00
Rajat
8b8efa1b42
update spacy universe with my project (#5497)
* added contextualSpellCheck in spacy universe meta

* removed extra formatting by code

* updated with permanent links

* run json linter used by spacy

* filled SCA

* updated the description
2020-05-25 11:30:23 +02:00
Sofie Van Landeghem
ae1c179f3a
Remove the nested quote 2020-05-23 17:58:19 +02:00
Ines Montani
ee027de032 Update universe and display of videos [ci skip] 2020-05-21 21:54:23 +02:00
Ines Montani
24f72c669c Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
Kevin Lu
c7c4cd5fe1
Changed pyate code example in universe.json 2020-05-20 09:11:32 -07:00
Kevin Lu
0a5b140235
Update universe.json 2020-05-19 20:12:21 -07:00
Travis Hoppe
d4cc18b746
Added author information for NLPre (#5414)
* Add author links for NLPre and update category

* Add contributor statement
2020-05-08 11:28:54 +02:00
Ines Montani
63885c1836 Remove u string and auto-format [ci skip] 2020-04-29 12:54:57 +02:00
Ines Montani
a77754120d
Merge pull request #5177 from nlptechbook/patch-5 2020-04-29 12:52:21 +02:00
Ines Montani
1cbb272a6b
Update website/meta/universe.json 2020-04-29 12:51:44 +02:00
Ines Montani
732629b0dd
Update website/meta/universe.json 2020-04-29 12:51:37 +02:00
Louis Guitton
a27c4014f5
Add mlflow to spaCy universe (#5352)
* Add mlflow to universe

* Use mlflow black logo
2020-04-29 10:18:03 +02:00
Sébastien Harinck
688a328668 docs(website): fix issue on example in spacy-lookup 2020-04-15 16:47:29 +02:00
Thomas Thiebaud
1eef60c658
Add spacy_fastlang to universe (#5271)
* Add spacy_fastlang to universe

* Sign SCA
2020-04-15 13:50:46 +02:00
Sofie Van Landeghem
7ad0fcf01d
fix json (#5267) 2020-04-08 12:58:09 +02:00
vincent d warmerdam
f329d5663a
add "whatlies" to spaCy universe (#5252)
* Add "whatlies"

We're releasing it on our side officially on the 16th of April. If possible, let's announce around the same time :)

* sign contributor thing

* Added fancy gif

as the image

* Update universe.json

Spellin error and spaCy clarification.
2020-04-06 11:29:30 +02:00
nlptechbook
ddf3c2430d
Update universe.json 2020-04-03 12:10:03 -04:00
Sofie Van Landeghem
1137420840
Small doc fixes (#5250)
* fix link

* torchtext instead tochtext
2020-04-03 13:01:43 +02:00
nlptechbook
b52e1ab677
Update universe.json
A bot powered by Clarifai Predict API and spaCy. Can be found in Telegram messenger at @pic2phrase_bot
2020-03-21 11:39:15 -04:00
Baciccin
3b53617a69 Add Ligurian language 2020-03-19 21:37:01 -07:00
Ines Montani
80e7e1347e Update universe.json [ci skip] 2020-03-17 22:21:34 +01:00
Ines Montani
eda6eff8b1 Update universe.json [ci skip] 2020-03-17 22:19:29 +01:00
Ines Montani
16e7301d34
Merge pull request #5161 from pmbaumgartner/master
add gobbli to spacy-universe 🥳
2020-03-17 22:18:30 +01:00
Peter B
b04057c204 add mentions of spaCy use 2020-03-17 15:03:43 -04:00
Ines Montani
b2b01a5c8b Update universe.json [ci skip] 2020-03-17 19:53:31 +01:00
Peter B
d2ffb406ad add gobbli to spacy-universe 🥳 2020-03-17 08:30:29 -04:00
nihil
9cde7eb08c add spacy_syllables to universe + sign contributor agreement 2020-03-13 18:09:42 +01:00
Ines Montani
1d6aec805d Fix formatting and update docs for v2.2.4 2020-03-09 11:17:20 +01:00
Ines Montani
4890db6339 Auto-format and fix image [ci skip] 2020-02-23 13:56:50 +01:00
nlptechbook
979a3fd1f5
Update universe.json (#5022)
e-book is available from https://nostarch.com/NLPPython
2020-02-15 15:44:55 +01:00
Omri Mendels
6ff947e1f9
Added presidio-research to universe.json (#4950)
* Added presidio-research to universe.json

Added a reference to Presidio Research, the data-science toolbox for Microsoft Presidio.

* Updated url
2020-02-03 12:57:55 +01:00
Paco Nathan
49fefb6139 Submitting PyTextRank for inclusion in the spaCy uniVerse (#4942)
* submitting PyTextRank for consideration of including in the spaCy uniVerse

* including SCA
2020-01-28 11:37:54 +01:00
Bram Vanroy
718704022a Changes to spacy_conll in universe (#4914)
* Update information on spacy_conll

* Typo fix
2020-01-16 01:56:39 +01:00
Ines Montani
1b838d1313 Divide models into core and starters [ci skip] 2019-12-21 14:10:22 +01:00
Ines Montani
c466e02466 Update universe [ci skip] 2019-12-13 15:57:39 +01:00
Paul O'Leary McCann
f0e3e606a6 Replace python-mecab3 with fugashi for Japanese (#4621)
* Switch from mecab-python3 to fugashi

mecab-python3 has been the best MeCab binding for a long time but it's
not very actively maintained, and since it's based on old SWIG code
distributed with MeCab there's a limit to how effectively it can be
maintained.

Fugashi is a new Cython-based MeCab wrapper I wrote. Since it's not
based on the old SWIG code it's easier to keep it current and make small
deviations from the MeCab C/C++ API where that makes sense.

* Change mecab-python3 to fugashi in setup.cfg

* Change "mecab tags" to "unidic tags"

The tags come from MeCab, but the tag schema is specified by Unidic, so
it's more proper to refer to it that way.

* Update conftest

* Add fugashi link to external deps list for Japanese
2019-11-23 14:31:04 +01:00
richardpaulhudson
8d06386e1e Update to Holmes Universe entry (#4679)
* Updated Universe entry for Holmes

* Correction

* Updated model name

* Updated wording
2019-11-21 16:23:24 +01:00
Ines Montani
4b95587ad4 Update universe.json [ci skip] 2019-11-04 13:55:55 +01:00
Yash Patadia
0c396aeed4 add dframcy to universe.json (#4580) 2019-11-04 13:53:23 +01:00
Ines Montani
726c5dd306 Update universe.json [ci skip] 2019-10-30 13:29:00 +01:00
Neel Kamath
6c036ab57d Add "spaCy Server" to spaCy Universe (#4553)
* Add "spaCy Server" to spaCy Universe

* Accept the spaCy Contributor Agreement
2019-10-30 13:20:46 +01:00
Nipun Sadvilkar
2a5e71232b project: pySBD - Python Sentence Boundary Disambiguation (#4455)
*   project: pySBD - Python Sentence Boundary Disambiguation

* 📝  Update links and description

* 🐛  Fix missing comma

* Update universe.json

pysbd as a spacy component through entrypoints

* 🚨  Fix universe.json

* 📝  Update code_example
2019-10-30 12:13:29 +01:00
Ines Montani
1180304449 Update languages.json [ci skip] 2019-10-26 13:51:42 +02:00
Ines Montani
388ea03065 Update universe.json [ci skip] 2019-10-22 14:54:47 +02:00
Kabir Khan
8a7a30ea1d Add cookiecutter-spacy-fastapi to spacy universe (#4498) 2019-10-22 14:50:40 +02:00
Julin S
3ee15fce0d Update information about Rasa (#4492)
Rasa has been updated and rasa core and rasa nlu have been merged.
2019-10-22 14:32:31 +02:00
Ines Montani
8f76d6c9ef Update transformer model details [ci skip] 2019-10-08 15:39:38 +02:00
Ines Montani
12a941d841 Update binder version [ci skip] 2019-10-02 16:47:01 +02:00
Ines Montani
b6670bf0c2 Use consistent spelling 2019-10-02 10:37:39 +02:00
Ines Montani
61263e2fbc Update universe.json [ci skip] 2019-09-30 13:49:44 +02:00
Ines Montani
3624153591 Update languages.json [ci skip] 2019-09-27 15:15:41 +02:00
Ajinkya Kale
975aebd7e4 typo fix for wordnet_annotator (#4326) 2019-09-27 11:52:53 +02:00
Eric Semeniuc
09816f8323 update sense2vec version (#4320) 2019-09-25 12:17:54 +02:00
Sofie Van Landeghem
42340740e3 update neuralcoref example (#4317) 2019-09-24 10:47:17 +02:00
Ines Montani
d84763727c Remove unused setting [ci skip] 2019-09-18 21:24:14 +02:00
Ines Montani
dd1810f05a Update DocBin and add docs 2019-09-18 20:23:21 +02:00
Ines Montani
23e28e2844 Merge branch 'master' into develop 2019-09-15 17:57:09 +02:00
Ines Montani
c7e4ea7154 Update examples and languages.json [ci skip] 2019-09-15 17:56:40 +02:00
Ines Montani
16c2522791 Merge branch 'master' into develop 2019-09-14 16:42:01 +02:00
Ines Montani
86befc80bf WIP: Add v2.2 page [ci skip] 2019-09-14 16:41:48 +02:00
Ines Montani
76d26a3d5e Update site.json [ci skip] 2019-09-14 16:32:24 +02:00
Ines Montani
fe87ccc8d1 Update languages.json [ci skip] 2019-09-14 16:23:50 +02:00
Ines Montani
82c16b7943 Remove u-strings and fix formatting [ci skip] 2019-09-12 16:11:15 +02:00
Ines Montani
10257f3131 Document Lookups [ci skip] 2019-09-12 14:00:14 +02:00