Commit Graph

15266 Commits

Author SHA1 Message Date
Sofie Van Landeghem
a17b06d18b
allow typer 0.4 (#9089) 2021-08-31 20:53:51 +10:00
svlandeg
3f16c45281 Merge branch 'spacy.io' of https://github.com/explosion/spaCy into spacy.io 2021-08-31 10:58:40 +02:00
Davide Fiocco
5c88998b9d Fix point typo on docbin docs (#9097) 2021-08-31 10:58:31 +02:00
Ines Montani
753149bc88 Update references to contributor agreement [ci skip] 2021-08-31 10:58:22 +02:00
Davide Fiocco
1dd69be1f1
Fix point typo on docbin docs (#9097) 2021-08-31 10:55:44 +02:00
Ines Montani
1a86d545af Update references to contributor agreement [ci skip] 2021-08-31 10:03:38 +10:00
Sofie Van Landeghem
5af88427a2
Dev docs: listeners (#9061)
* Start Listeners documentation

* intro tabel of different architectures

* initialization, linking, dim inference

* internal comm (WIP)

* expand internal comm section

* frozen components and replacing listeners

* various small fixes

* fix content table

* fix link
2021-08-30 14:56:35 +02:00
Adriane Boyd
1e9b4b55ee
Pass overrides to subcommands in workflows (#9059)
* Pass overrides to subcommands in workflows

* Add missing docstring
2021-08-30 09:23:54 +02:00
Meenal Jhajharia
db42ba5240 benepar usage example has deprecated imports 2021-08-29 14:44:18 +09:00
Paul O'Leary McCann
6ff8d90070
Merge pull request #9081 from mjhajharia/patch-1
benepar usage example has deprecated imports
2021-08-29 14:41:52 +09:00
Meenal Jhajharia
2613f0e98f
benepar usage example has deprecated imports 2021-08-28 16:35:58 +05:30
Sofie Van Landeghem
689535c264 config is not Optional (#9024) 2021-08-27 11:53:54 +02:00
Sofie Van Landeghem
1e974de837
config is not Optional (#9024) 2021-08-27 11:44:31 +02:00
github-actions[bot]
fb9c31fbda
Auto-format code with black (#9065)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2021-08-27 11:42:27 +02:00
Sofie Van Landeghem
8c1d86ea92 Document use-case of freezing tok2vec (#8992)
* update error msg

* add sentence to docs

* expand note on frozen components
2021-08-26 09:53:29 +02:00
Sofie Van Landeghem
31c0a75e6d fix docs for Span constructor arguments (#9023) 2021-08-26 09:52:59 +02:00
Sofie Van Landeghem
4d39430b82
Document use-case of freezing tok2vec (#8992)
* update error msg

* add sentence to docs

* expand note on frozen components
2021-08-26 09:50:35 +02:00
Sofie Van Landeghem
94fb840443
fix docs for Span constructor arguments (#9023) 2021-08-25 16:06:22 +02:00
David Strouk
31e9b126a0
Fix verbs list in lang/fr/tokenizer_exceptions.py (#9033) 2021-08-25 15:55:09 +02:00
Ines Montani
4cd052e81d
Include component factories in third-party dependencies resolver (#9009)
* Include component factories in third-party dependencies resolver

* Increment catalogue and update test
2021-08-25 14:58:01 +02:00
svlandeg
fb8c2f794a Merge remote-tracking branch 'upstream/master' into spacy.io 2021-08-20 14:49:51 +02:00
Sofie Van Landeghem
e1f88de729
bump to 3.1.2 (#9008) 2021-08-20 12:41:09 +02:00
Sofie Van Landeghem
4d52d7051c
Fix spancat training on nested entities (#9007)
* overfitting test on non-overlapping entities

* add failing overfitting test for overlapping entities

* failing test for list comprehension

* remove test that was put in separate PR

* bugfix

* cleanup
2021-08-20 12:37:50 +02:00
Paul O'Leary McCann
9cc3dc2b67
Add glossary entry for _SP (#8983) 2021-08-20 12:04:02 +02:00
Sofie Van Landeghem
de025beb5f
Warn and document spangroup.doc weakref (#8980)
* test for error after Doc has been garbage collected

* warn about using a SpanGroup when the Doc has been garbage collected

* add warning to the docs

* rephrase slightly

* raise error instead of warning

* update

* move warning to doc property
2021-08-20 11:06:19 +02:00
Paul O'Leary McCann
0e4da8ed70 Fix type annotation in docs 2021-08-20 15:35:41 +09:00
Paul O'Leary McCann
37fe847af4 Fix type annotation in docs 2021-08-20 15:34:22 +09:00
Ines Montani
8444aa75e2 Fix universe.json [ci skip] 2021-08-20 11:26:46 +10:00
Ines Montani
f2b61b77a5 Fix universe.json [ci skip] 2021-08-20 11:26:29 +10:00
Ines Montani
f2d19e6dc2 Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip] 2021-08-20 11:23:50 +10:00
Ines Montani
894e16f5ca
Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip] 2021-08-20 11:23:30 +10:00
Baltazar
4d85cb88a5 added contribution license 2021-08-19 21:45:18 +02:00
Baltazar
71e65fe943 added spacy api v3 docker 2021-08-19 21:29:25 +02:00
Adriane Boyd
c5de9b463a
Update custom tokenizer APIs and pickling (#8972)
* Fix incorrect pickling of Japanese and Korean pipelines, which led to
the entire pipeline being reset if pickled

* Enable pickling of Vietnamese tokenizer

* Update tokenizer APIs for Chinese, Japanese, Korean, Thai, and
Vietnamese so that only the `Vocab` is required for initialization
2021-08-19 14:37:47 +02:00
Adriane Boyd
6722dc3dc5
Fix allow_overlap default for spancat scoring (#8970)
* Remove irrelevant default options
2021-08-18 09:56:56 +02:00
Steele Farnsworth
b18cb1cd2a
Refactor dependencymatcher.pyx to use list comps and enumerate. (#8956)
* Refactor to use list comps and enumerate.

Replace loops that append to a list with a list comprehensions where this does not change the behavior; replace range(len(...)) loops with enumerate. Correct one typo in a comment. Replace a call to set() with a set literal.

* Undo double assignment.

Expand `tokens_to_key[j] = k = self._get_matcher_key(key, i, j)` to two statements.

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Sign contributors agreement

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-08-18 09:55:45 +02:00
Ines Montani
d94ddd5686
Auto-detect package dependencies in spacy package (#8948)
* Auto-detect package dependencies in spacy package

* Add simple get_third_party_dependencies test

* Import packages_distributions explicitly

* Inline packages_distributions

* Fix docstring [ci skip]

* Relax catalogue requirement

* Move importlib_metadata to spacy.compat with note

* Include license information [ci skip]
2021-08-17 14:05:13 +02:00
Sofie Van Landeghem
0a6b68848f
Fix making span_group (#8975)
* fix _make_span_group

* fix imports
2021-08-17 10:36:34 +02:00
Ines Montani
593a22cf2d
Add development docs for Language and code conventions (#8745)
* WIP: add dev docs for Language / config [ci skip]

* Add section on initialization [ci skip]

* Fix wording [ci skip]

* Add code conventions WIP [ci skip]

* Update code convention docs [ci skip]

* Update contributing guide and conventions [ci skip]

* Update Code Conventions.md [ci skip]

* Clarify sourced components + vectors

* Apply suggestions from code review [ci skip]

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update wording and add link [ci skip]

* restructure slightly + extended index

* remove paragraph that breaks flow and is repeated in more detail later

* fix anchors

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2021-08-17 09:38:15 +02:00
Paul O'Leary McCann
4ed5d9ad5a Add notes on preparing training data to docs (#8964)
* Add training data section

Not entirely sure this is in the right location on the page - maybe it
should be after quickstart?

* Add pointer from binary format to training data section

* Minor cleanup

* Add to ToC, fix filename

* Update website/docs/usage/training.md

Co-authored-by: Ines Montani <ines@ines.io>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Move the training data section further down the page

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-08-16 17:39:19 +02:00
Paul O'Leary McCann
9391998c77
Add notes on preparing training data to docs (#8964)
* Add training data section

Not entirely sure this is in the right location on the page - maybe it
should be after quickstart?

* Add pointer from binary format to training data section

* Minor cleanup

* Add to ToC, fix filename

* Update website/docs/usage/training.md

Co-authored-by: Ines Montani <ines@ines.io>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Move the training data section further down the page

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-08-16 17:37:21 +02:00
Ines Montani
d65e03adae Merge pull request #8951 from HLasse/master 2021-08-16 11:41:53 +10:00
Ines Montani
a894fe0440
Merge pull request #8951 from HLasse/master 2021-08-16 11:41:32 +10:00
Lasse
839ea0f987 change tags formatting to match 2021-08-13 14:40:08 +02:00
Lasse
70ab596f61 Merge branch 'master' of https://github.com/HLasse/spaCy 2021-08-13 14:35:21 +02:00
Lasse
195e4e48c3 add textdescriptives to universe 2021-08-13 14:35:18 +02:00
github-actions[bot]
92071326d8
Auto-format code with black (#8950)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2021-08-13 11:48:38 +02:00
Adriane Boyd
8448c7dbc5
Update da trf recommendation (#8921)
Update the da trf recommendation to the same model used in the
pretrained pipelines.
2021-08-12 13:54:02 +02:00
Ines Montani
647abe186c Merge pull request #8938 from explosion/docs/prodigy-v1-11-project [ci skip]
Update Prodigy project template for v1.11
2021-08-12 21:17:14 +10:00
Ines Montani
6260f044cc
Merge pull request #8938 from explosion/docs/prodigy-v1-11-project [ci skip]
Update Prodigy project template for v1.11
2021-08-12 21:16:49 +10:00