Adriane Boyd
e8357923ec
Various install docs updates (#10487)
* Simplify quickstart source install to use only editable pip install
* Update pytorch install instructions to more recent versions
2022-03-15 11:12:50 +01:00
Paul O'Leary McCann
f3981bd0c8
Clarify how to fill in init_tok2vec after pretraining (#9639)
* Clarify how to fill in init_tok2vec after pretraining
* Ignore init_tok2vec arg in pretraining
* Update docs, config setting
* Remove obsolete note about not filling init_tok2vec early
This seems to have also caught some lines that needed cleanup.
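The setting being clarified is `init_tok2vec`: after `spacy pretrain`, the path to the pretrained weights file goes into `[initialize.init_tok2vec]`, usually via the `paths.init_tok2vec` variable. A minimal sketch of wiring that up from Python, assuming a config generated with `spacy init config`; the file and directory names are hypothetical:

```python
# Sketch: hand the weights written by `spacy pretrain` to training by
# overriding paths.init_tok2vec, which [initialize.init_tok2vec] reads.
# Paths and file names here are hypothetical.
from spacy.cli.train import train

train(
    "config.cfg",
    output_path="./training",
    overrides={"paths.init_tok2vec": "./pretraining/model999.bin"},
)
```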
2021-11-18 15:38:30 +01:00
Elia Robyn Lake (Robyn Speer)
fa70837f28
clarify how to connect pretraining to training (#9450)
* clarify how to connect pretraining to training
Signed-off-by: Elia Robyn Speer <elia@explosion.ai>
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Elia Robyn Speer <elia@explosion.ai>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-10-22 13:15:47 +02:00
Paul O'Leary McCann
222cf9b6d2
Clarify how to change base Transformer model (#9498)
* Add note about how the model name is used
* Add link to TransformersModel docs, separate paragraph
* Local link
* Revise docs
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
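The added note is about the `name` setting of the transformer's model: the value is handed to the Hugging Face transformers library, so it can be any model identifier from the model hub or a local path. A sketch of switching the base model, assuming spacy-transformers is installed and that the partial config below is merged onto the factory defaults; `bert-base-cased` is only an example:

```python
# Sketch: swap the base model of the transformer component. The "name" value
# is resolved by Hugging Face transformers, so any hub model name or local
# path should work; the remaining model settings come from the defaults.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("transformer", config={"model": {"name": "bert-base-cased"}})
```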
2021-10-19 23:28:20 +02:00
Sofie Van Landeghem
3fd3531e12
Docs for new spacy-trf architectures (#8954)
* use TransformerModel.v2 in quickstart
* update docs for new transformer architectures
* bump spacy_transformers to 1.1.0
* Add new arguments for spacy-transformers.TransformerModel.v3
* Mention that mixed-precision support is experimental
* Describe the delta between transformers.Tok2VecTransformer versions
* add dot
* add dot, again
* Update some more TransformerModel references v2 -> v3
* Add mixed-precision options to the training quickstart
Disable mixed-precision training/prediction by default.
* Update setup.cfg
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Apply suggestions from code review
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
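One of the arguments new in TransformerModel.v3 is `mixed_precision`, which stays disabled by default. A sketch of switching it on for a training run through a config override, assuming the config's transformer component already uses the v3 architecture and a GPU is available:

```python
# Sketch: enable the experimental mixed-precision path added in
# spacy-transformers.TransformerModel.v3 by overriding the training config.
# Assumes the transformer component in config.cfg already uses the v3
# architecture; mixed precision is only useful on GPU.
from spacy.cli.train import train

train(
    "config.cfg",
    output_path="./training",
    use_gpu=0,
    overrides={"components.transformer.model.mixed_precision": True},
)
```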
2021-10-18 14:15:06 +02:00
Paul O'Leary McCann
1d57d78758
Make docs consistent (fix #9126)
2021-09-16 15:54:12 +09:00
Paul O'Leary McCann
66bfabd839
Fix pretraining objectives fragment (#8005)
* Fix pretraining objectives fragment
The fragment here is reused from a heading higher up, so you couldn't
link to this section.
* Fix section link to new fragment
2021-05-06 08:27:36 +02:00
Adriane Boyd
d2bdaa7823
Replace negative rows with 0 in StaticVectors (#7674)
* Replace negative rows with 0 in StaticVectors
Replace negative row indices with 0-vectors in `StaticVectors`.
* Increase versions related to StaticVectors
* Increase versions of all architectures and layers related to
`StaticVectors`
* Improve efficiency of 0-vector operations
Parallel `spacy-legacy` PR: https://github.com/explosion/spacy-legacy/pull/5
* Update config defaults to new versions
* Update docs
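The user-visible effect of the change: tokens that have no row in the vectors table (a negative row index) are embedded as all-zero vectors instead of picking up an arbitrary row. A rough way to see which tokens that concerns, using any pipeline with static vectors; `en_core_web_md` is only an example:

```python
# Sketch: out-of-vocabulary tokens have no row in the vectors table; the
# updated StaticVectors layer embeds them as zeros rather than reusing an
# arbitrary row. has_vector and vector_norm show which tokens are affected.
import spacy

nlp = spacy.load("en_core_web_md")  # any pipeline with static vectors
doc = nlp("florble is not in the vectors table")
for token in doc:
    print(token.text, token.has_vector, token.vector_norm)
```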
2021-04-22 18:04:15 +10:00
Adriane Boyd
8b76cb8095
Rephrase transformers PyTorch instructions
2021-01-29 13:36:56 +01:00
Adriane Boyd
e3e87e7275
Update transformers install docs
* Recommend installing PyTorch separately
* Add instructions for `sentencepiece`
2021-01-29 13:27:43 +01:00
Adriane Boyd
61c9f8bf24
Remove transformers model max length section (#6807)
2021-01-25 19:59:34 +08:00
Adriane Boyd
bf0cdae8d4
Add token_splitter component (#6726)
* Add long_token_splitter component
Add a `long_token_splitter` component for use with transformer
pipelines. This component splits up long tokens like URLs into smaller
tokens. This is particularly relevant for pretrained pipelines with
`strided_spans`, since the user can't change the length of the span
`window` and may not wish to preprocess the input texts.
The `long_token_splitter` splits tokens that are at least
`long_token_length` characters long into smaller tokens of `split_length`
characters each.
Notes:
* Since this is intended for use as the first component in a pipeline,
the token splitter does not try to preserve any token annotation.
* API docs to come when the API is stable.
* Adjust API, add test
* Fix name in factory
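For reference, the component as it ended up is added as the first pipe roughly like this; the parameter names follow the final `token_splitter` API and the values are only illustrative:

```python
# Sketch: split very long tokens (e.g. URLs) before they reach the
# transformer. Tokens of at least min_length characters are re-tokenized
# into pieces of split_length characters. Values are illustrative.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "token_splitter",
    config={"min_length": 25, "split_length": 10},
    first=True,
)
doc = nlp("See https://example.com/some/very/long/path/that/keeps/on/going")
print([token.text for token in doc])
```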
2021-01-17 19:54:41 +08:00
Sofie Van Landeghem
75d9019343
Fix types of Tok2Vec encoding architectures (#6442)
* fix TorchBiLSTMEncoder documentation
* ensure the types of the encoding Tok2vec layers are correct
* update references from v1 to v2 for the new architectures
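The v2 embedding and encoding architectures referenced here fit together roughly as follows in a standalone tok2vec component; the widths, attributes and row counts are illustrative, not required values:

```python
# Sketch: a tok2vec model combining the v2 embedding and encoding
# architectures. The embed and encode widths have to match.
import spacy

tok2vec_model = {
    "@architectures": "spacy.Tok2Vec.v2",
    "embed": {
        "@architectures": "spacy.MultiHashEmbed.v2",
        "width": 96,
        "attrs": ["NORM", "PREFIX", "SUFFIX", "SHAPE"],
        "rows": [5000, 1000, 2500, 2500],
        "include_static_vectors": False,
    },
    "encode": {
        "@architectures": "spacy.MaxoutWindowEncoder.v2",
        "width": 96,
        "depth": 4,
        "window_size": 1,
        "maxout_pieces": 3,
    },
}
nlp = spacy.blank("en")
nlp.add_pipe("tok2vec", config={"model": tok2vec_model})
```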
2021-01-07 16:39:27 +11:00
Sofie Van Landeghem
82ae95267a
Docs for pretrain architectures (#6605)
* document pretraining architectures
* formatting
* bit more info
* small fixes
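The architectures documented here are the objectives that go under `[pretraining.objective]` in the config. A small sketch of resolving one of them straight from the registry; the hyperparameter values are illustrative:

```python
# Sketch: resolve the character-prediction pretraining objective that is
# normally configured under [pretraining.objective]. Values are illustrative.
from thinc.api import Config
from spacy.util import registry

CONFIG_STR = """
[objective]
@architectures = "spacy.PretrainCharacters.v1"
maxout_pieces = 3
hidden_size = 300
n_characters = 4
"""

objective = registry.resolve(Config().from_str(CONFIG_STR))["objective"]
```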
2021-01-06 16:12:30 +11:00
Ines Montani
c968d1560f
Fix docs example [ci skip]
2020-10-16 11:33:20 +02:00
Ines Montani
ba1e004049
Fix typo [ci skip]
2020-10-15 23:39:04 +02:00
svlandeg
08cb085f6c
Merge remote-tracking branch 'upstream/develop' into fix/various
2020-10-09 17:01:27 +02:00
Ines Montani
9fb3244672
Merge pull request #6231 from adrianeboyd/feature/include-static-vectors
2020-10-09 15:54:52 +02:00
Adriane Boyd
2dd79454af
Update docs
2020-10-09 14:42:07 +02:00
svlandeg
853edace37
fix MultiHashEmbed example in documentation
2020-10-09 14:11:06 +02:00
Ines Montani
e50dc2c1c9
Update docs [ci skip]
2020-10-09 12:04:52 +02:00
Ines Montani
d1602e1ece
Update docs [ci skip]
2020-10-08 11:56:50 +02:00
Ines Montani
43e59bb22a
Update docs and install extras [ci skip]
2020-10-08 10:58:50 +02:00
Ines Montani
01c1538c72
Integrate file readers
2020-10-02 01:36:06 +02:00
Sofie Van Landeghem
a22215f427
Add FeatureExtractor from Thinc (#6170)
* move featureextractor from Thinc
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io>
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Ines Montani <ines@ines.io>
2020-10-01 16:22:48 +02:00
Ines Montani
0a8a124a6e
Update docs [ci skip]
2020-10-01 12:15:53 +02:00
Ines Montani
361f91e286
Merge pull request #6135 from walterhenry/develop-proof
2020-09-29 20:49:06 +02:00
walterhenry
1d80b3dc1b
Proofreading
Finished with the API docs and started on the Usage docs, beginning with Embeddings & Transformers
2020-09-29 12:39:10 +02:00
Sofie Van Landeghem
009ba14aaf
Fix pretraining in train script (#6143)
* update pretraining API in train CLI
* bump thinc to 8.0.0a35
* bump to 3.0.0a26
* doc fixes
* small doc fix
2020-09-25 15:47:10 +02:00
Ines Montani
6836b66433
Update docs and resolve todos [ci skip]
2020-09-24 13:41:25 +02:00
svlandeg
6c85fab316
state_type and extra_state_tokens instead of nr_feature_tokens
2020-09-23 13:35:09 +02:00
Ines Montani
012b3a7096
Update docs [ci skip]
2020-09-20 17:44:58 +02:00
Ines Montani
554c9a2497
Update docs [ci skip]
2020-09-20 12:30:53 +02:00
Ines Montani
a0b4389a38
Update docs [ci skip]
2020-09-17 19:24:48 +02:00
Matthew Honnibal
6efb7688a6
Draft pretrain usage
2020-09-17 18:17:03 +02:00
Ines Montani
a2c8cda26f
Update docs [ci skip]
2020-09-17 17:12:51 +02:00
Matthew Honnibal
ec751068f3
Draft text for static vectors intro
2020-09-17 16:42:53 +02:00
Ines Montani
8b0dabe987
Update docs [ci skip]
2020-09-12 17:05:10 +02:00
Sofie Van Landeghem
8e7557656f
Renaming gold & annotation_setter (#6042)
* version bump to 3.0.0a16
* rename "gold" folder to "training"
* rename 'annotation_setter' to 'set_extra_annotations'
* formatting
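After the rename, a custom callback goes onto the transformer's `set_extra_annotations` hook instead of `annotation_setter`. A minimal sketch; the extension name and the loaded pipeline are illustrative:

```python
# Sketch: a custom set_extra_annotations callback (formerly annotation_setter)
# that stores each Doc's share of the transformer output on a custom
# extension attribute. Extension name and pipeline are illustrative.
import spacy
from spacy.tokens import Doc

Doc.set_extension("trf_data_custom", default=None)

def custom_annotation_setter(docs, trf_data):
    # trf_data is the transformer output for this batch of Docs
    for doc, data in zip(docs, trf_data.doc_data):
        doc._.trf_data_custom = data

nlp = spacy.load("en_core_web_trf")
nlp.get_pipe("transformer").set_extra_annotations = custom_annotation_setter
doc = nlp("This text runs through the shared transformer.")
print(doc._.trf_data_custom)
```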
2020-09-09 10:31:03 +02:00
Ines Montani
23b7d9cfa3
Prefix span getters
2020-09-03 17:37:06 +02:00
Ines Montani
690bd77669
Add todos [ci skip]
2020-09-01 14:04:36 +02:00
svlandeg
e47ea88aeb
revert annotations refactor
2020-08-31 14:40:55 +02:00
svlandeg
c18eb63483
Merge remote-tracking branch 'upstream/develop' into feature/vectors-docs
# Conflicts:
# website/docs/usage/embeddings-transformers.md
2020-08-31 13:21:36 +02:00
Sofie Van Landeghem
ec14744ee4
Rename Transformer listener (#6001)
* rename to spacy-transformers.TransformerListener
* add some more tok2vec tests
* use select_pipes
* fix docs - annotation setter was not changed in the end
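Under the new name, a downstream component listens to the shared transformer roughly like this; a sketch assuming spacy-transformers is installed and that the partial config is merged onto the ner factory defaults:

```python
# Sketch: an ner component whose tok2vec sub-model is the renamed
# spacy-transformers.TransformerListener architecture, reusing the shared
# transformer component added just before it.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("transformer")  # shared transformer with its default model
nlp.add_pipe(
    "ner",
    config={
        "model": {
            "tok2vec": {
                "@architectures": "spacy-transformers.TransformerListener.v1",
                "grad_factor": 1.0,
                "pooling": {"@layers": "reduce_mean.v1"},
            }
        }
    },
)
```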
2020-08-31 12:41:39 +02:00
Ines Montani
bc0730be3f
Update docs [ci skip]
2020-08-29 12:53:14 +02:00
svlandeg
9f00a20ce4
proofreading and custom examples
2020-08-28 21:50:42 +02:00
svlandeg
556e975a30
various fixes
2020-08-27 19:24:44 +02:00
svlandeg
329e490560
small import fixes
2020-08-27 14:50:43 +02:00
svlandeg
28e4ba7270
fix references to TransformerListener
2020-08-27 14:33:28 +02:00
svlandeg
4d37ac3f33
configure_custom_sent_spans example
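The example being added registers a span getter that chunks each sentence into windows; presumably along these lines, where `max_length` is the getter's own argument and the decorator is spacy-transformers' span_getters registry:

```python
# Sketch: a custom span getter that splits each sentence into windows of at
# most max_length tokens, registered so the config can refer to it as
# "custom_sent_spans".
import spacy_transformers

@spacy_transformers.registry.span_getters("custom_sent_spans")
def configure_custom_sent_spans(max_length: int):
    def get_custom_sent_spans(docs):
        spans = []
        for doc in docs:
            spans.append([])
            for sent in doc.sents:
                # step through the sentence in windows of max_length tokens
                for start in range(0, len(sent), max_length):
                    spans[-1].append(sent[start : start + max_length])
        return spans

    return get_custom_sent_spans
```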
2020-08-27 14:14:16 +02:00