Commit Graph

15001 Commits

Author SHA1 Message Date
Duygu Altinok
1ee4d6ef49 Corrected broken (#9505) 2021-10-20 18:07:28 +02:00
Philip Vollet
a31a4bb7bd Add projects to spaCy Universe (#9269)
* Added spaCy Universe projects

* Added user license agreement Philip Vollet

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/meta/universe.json

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-10-20 18:07:07 +02:00
Paul O'Leary McCann
ef4d4f793b Clarify how to change base Transformer model (#9498)
* Add note about how the model name is used

* Add link to TransformersModel docs, separate paragraph

* Local link

* Revise docs

* Update website/docs/usage/embeddings-transformers.md

* Update website/docs/usage/embeddings-transformers.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-10-19 23:32:14 +02:00
Adriane Boyd
e66fddf934 Minor updates to spacy-transformers docs for v1.1.0 (#9496) 2021-10-18 14:55:22 +02:00
Adriane Boyd
404aff08e3 Update docs for spacy-transformers v1.1 data classes (#9361) 2021-10-18 14:17:48 +02:00
Sofie Van Landeghem
eaa6798c66 Docs for new spacy-trf architectures (#8954)
* use TransformerModel.v2 in quickstart

* update docs for new transformer architectures

* bump spacy_transformers to 1.1.0

* Add new arguments spacy-transformers.TransformerModel.v3

* Mention that mixed-precision support is experimental

* Describe delta transformers.Tok2VecTransformer versions

* add dot

* add dot, again

* Update some more TransformerModel references v2 -> v3

* Add mixed-precision options to the training quickstart

Disable mixed-precision training/prediction by default.

* Update setup.cfg

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Apply suggestions from code review

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/usage/embeddings-transformers.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-10-18 14:17:43 +02:00
Edward
9dfb12e29f Update universe example codes (#9422)
* Update universe plugins

* Adjust azure trigger

* Add init to tests/universe

* deliberatly trying to break the universe to see if the CI catches it

* revert

Co-authored-by: svlandeg <svlandeg@github.com>
2021-10-14 09:37:05 +02:00
Paul O'Leary McCann
3b429619a8 Fix UD POS docs links (fix #9013) (#9407)
* Fix UD POS docs links (fix #9013)

The previous link seems to have been for UD v1.

* Fix link
2021-10-11 11:51:59 +02:00
Adriane Boyd
fd91e6a33c Fix types descriptions of sm and sent models (#9401) 2021-10-11 11:18:10 +02:00
Adriane Boyd
bbe4d3300a Remove traces of lexemes from vocab serialization (#9400) 2021-10-11 11:15:51 +02:00
Sofie Van Landeghem
a6ac36bcb3 Doc fixes in convert API (#9350)
* add more info on the spacy debug command

* formatting
2021-10-11 11:15:20 +02:00
Paul O'Leary McCann
23badbd55c Updating Troubleshooting Docs (#9329)
* Add link to Discussions FAQ

* Remove old FAQ entries

I think these are no longer relevant.

- no-cache-dir: affected pip versions are *very* old now
- narrow unicode: not an issue from py3.3+
- utf-8 osx: upstream bug closed in 2019

Some of the other issues are also maybe not frequent.
2021-10-01 12:31:41 +02:00
Paul O'Leary McCann
0508795d67 Fix invalid json 2021-09-30 15:24:47 +09:00
Martin Vallone
f15bb40941 Adding PhruzzMatcher to spaCy universe (#9321)
* Adding PhruzzMatcher to spaCy universe

* Fixes to make the package work properly
2021-09-30 14:26:40 +09:00
svlandeg
ec621e6853 Merge remote-tracking branch 'upstream/master' into spacy.io 2021-09-20 15:54:00 +02:00
svlandeg
e0e3e9653b Revert "raise E983 early on in docbin init"
This reverts commit f3f7afa21f.
2021-09-20 15:52:02 +02:00
svlandeg
f3f7afa21f raise E983 early on in docbin init 2021-09-20 15:49:31 +02:00
github-actions[bot]
015d439eb6
Auto-format code with black (#9234)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2021-09-20 08:49:19 +02:00
Edward
79c7c62970 Update Hammurabi example code to v3 (#9218)
* Update Hammurabi example code

* Fix typo
2021-09-16 13:35:00 +02:00
Edward
8bda39f088
Update Hammurabi example code to v3 (#9218)
* Update Hammurabi example code

* Fix typo
2021-09-16 13:32:44 +02:00
Paul O'Leary McCann
c4f0800fb8
Validate pos values when creating Doc (#9148)
* Validate pos values when creating Doc

* Add clear error when setting invalid pos

This also changes the error language slightly.

* Fix variable name

* Update spacy/tokens/doc.pyx

* Test that setting invalid pos raises an error

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-16 13:28:05 +02:00
Jozef Harag
865cfbc903
feat: add spacy.WandbLogger.v3 with optional run_name and entity parameters (#9202)
* feat: add `spacy.WandbLogger.v3` with optional `run_name` and `entity` parameters

* update versioning in docs

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2021-09-16 12:26:41 +02:00
Paul O'Leary McCann
fd99438fb2 Make docs consistent (fix #9126) 2021-09-16 15:56:19 +09:00
Paul O'Leary McCann
1d57d78758 Make docs consistent (fix #9126) 2021-09-16 15:54:12 +09:00
Paul O'Leary McCann
9ceb8f413c
StringStore/Vocab dev docs (#9142)
* First take at StringStore/Vocab docs

Things to check:

1. The mysterious vocab members
2. How to make table of contents? Is it autogenerated?
3. Anything I missed / needs more detail?

* Update docs

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Updates based on review feedback

* Minor fix

* Move example code down

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-16 12:50:22 +09:00
Ines Montani
20f63e7154
Only include runtime-relevant config in package CLI dependency detection (#9211) 2021-09-15 23:16:01 +02:00
Adriane Boyd
d74870d38c
Prepare for v3.1.3 (#9200)
* Update thinc and spacy-legacy requirements

* Set version to v3.1.3
2021-09-14 11:03:51 +02:00
j-frei
5d0cc0d2ab Correct parser.py use_upper param info (#9180) 2021-09-13 09:29:11 +02:00
Paul O'Leary McCann
9c4e84d4a1 Minor typo fix in docs 2021-09-11 14:23:11 +09:00
Paul O'Leary McCann
f89e1c34c9
Minor typo fix in docs 2021-09-11 14:22:05 +09:00
Renat Shigapov
2e2d0e8701 added spaCyOpenTapioca (#9181)
* add spaCyOpenTapioca to universe

* add agreement

* fix misprint in tags
2021-09-11 13:25:25 +09:00
mylibrar
d621df6422 Update example code of forte (#9175)
Co-authored-by: Suqi Sun <suqi.sun@petuum.com>
2021-09-11 13:25:17 +09:00
Renat Shigapov
646f3a54db
added spaCyOpenTapioca (#9181)
* add spaCyOpenTapioca to universe

* add agreement

* fix misprint in tags
2021-09-11 13:16:51 +09:00
mylibrar
ee28aac68e
Update example code of forte (#9175)
Co-authored-by: Suqi Sun <suqi.sun@petuum.com>
2021-09-11 13:13:13 +09:00
j-frei
462b009648
Correct parser.py use_upper param info (#9180) 2021-09-10 16:19:58 +02:00
Adriane Boyd
aba6ce3a43
Handle spacy-legacy in package CLI for dependencies (#9163)
* Handle spacy-legacy in package CLI for dependencies

* Implement legacy backoff in spacy registry.find

* Remove unused import

* Update and format test
2021-09-08 11:46:40 +02:00
Sofie Van Landeghem
632d8d4c35
bump thinc to 8.0.9 (#9133) 2021-09-03 13:34:42 +02:00
github-actions[bot]
584fae5807
Auto-format code with black (#9130)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2021-09-03 10:47:03 +02:00
Kevin Humphreys
ca93504660
Pass alignments to Matcher callbacks (#9001)
* pass alignments to callbacks

* refactor for single callback loop

* Update spacy/matcher/matcher.pyx

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-02 12:58:05 +02:00
Sofie Van Landeghem
721f4554c8 matcher doc corrections (#9115)
* update error message to current UX

* clarify uppercase effect

* fix docstring
2021-09-02 09:29:44 +02:00
Sofie Van Landeghem
8895e3c9ad
matcher doc corrections (#9115)
* update error message to current UX

* clarify uppercase effect

* fix docstring
2021-09-02 09:26:33 +02:00
Robyn Speer
d60b748e3c
Fix surprises when asking for the root of a git repo (#9074)
* Fix surprises when asking for the root of a git repo

In the case of the first asset I wanted to get from git, the data I
wanted was the entire repository. I tried leaving "path" blank, which
gave a less-than-helpful error, and then I tried `path: "/"`, which
started copying my entire filesystem into the project. The path I should
have used was "".

I've made two changes to make this smoother for others:

- The 'path' within a git clone defaults to ""
- If the path points outside of the tmpdir that the git clone goes
into, we fail with an error

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

* use a descriptive error instead of a default

plus some minor fixes from PR review

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

* check for None values in assets

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

Co-authored-by: Elia Robyn Speer <elia@explosion.ai>
2021-09-01 22:52:08 +02:00
Paul O'Leary McCann
752696f134 Document Assigned Attributes of Pipeline Components (#9041)
* Add textcat docs

* Add NER docs

* Add Entity Linker docs

* Add assigned fields docs for the tagger

This also adds a preamble, since there wasn't one.

* Add morphologizer docs

* Add dependency parser docs

* Update entityrecognizer docs

This is a little weird because `Doc.ents` is the only thing assigned to,
but it's actually a bidirectional property.

* Add token fields for entityrecognizer

* Fix section name

* Add entity ruler docs

* Add lemmatizer docs

* Add sentencizer/recognizer docs

* Update website/docs/api/entityrecognizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/tagger.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update type for Doc.ents

This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be
correct.

* Run prettier

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

* Add transformers section

This basically just moves and renames the "custom attributes" section
from the bottom of the page to be consistent with "assigned attributes"
on other pages.

I looked at moving the paragraph just above the section into the
section, but it includes the unrelated registry additions, so it seemed
better to leave it unchanged.

* Make table header consistent

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-01 12:10:52 +02:00
Paul O'Leary McCann
ba6a37d358
Document Assigned Attributes of Pipeline Components (#9041)
* Add textcat docs

* Add NER docs

* Add Entity Linker docs

* Add assigned fields docs for the tagger

This also adds a preamble, since there wasn't one.

* Add morphologizer docs

* Add dependency parser docs

* Update entityrecognizer docs

This is a little weird because `Doc.ents` is the only thing assigned to,
but it's actually a bidirectional property.

* Add token fields for entityrecognizer

* Fix section name

* Add entity ruler docs

* Add lemmatizer docs

* Add sentencizer/recognizer docs

* Update website/docs/api/entityrecognizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/tagger.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update type for Doc.ents

This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be
correct.

* Run prettier

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

* Add transformers section

This basically just moves and renames the "custom attributes" section
from the bottom of the page to be consistent with "assigned attributes"
on other pages.

I looked at moving the paragraph just above the section into the
section, but it includes the unrelated registry additions, so it seemed
better to leave it unchanged.

* Make table header consistent

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-01 12:09:39 +02:00
Paul O'Leary McCann
f803a84571
Fix inference of epoch_resume (#9084)
* Fix inference of epoch_resume

When an epoch_resume value is not specified individually, it can often
be inferred from the filename. The value inference code was there but
the value wasn't passed back to the training loop.

This also adds a specific error in the case where no epoch_resume value
is provided and it can't be inferred from the filename.

* Add new error

* Always use the epoch resume value if specified

Before this the value in the filename was used if found
2021-09-01 14:17:42 +09:00
Sofie Van Landeghem
a17b06d18b
allow typer 0.4 (#9089) 2021-08-31 20:53:51 +10:00
svlandeg
3f16c45281 Merge branch 'spacy.io' of https://github.com/explosion/spaCy into spacy.io 2021-08-31 10:58:40 +02:00
Davide Fiocco
5c88998b9d Fix point typo on docbin docs (#9097) 2021-08-31 10:58:31 +02:00
Ines Montani
753149bc88 Update references to contributor agreement [ci skip] 2021-08-31 10:58:22 +02:00
Davide Fiocco
1dd69be1f1
Fix point typo on docbin docs (#9097) 2021-08-31 10:55:44 +02:00