Adriane Boyd
e8357923ec
Various install docs updates (#10487)
* Simplify quickstart source install to use only editable pip install
* Update pytorch install instructions to more recent versions
2022-03-15 11:12:50 +01:00
Paul O'Leary McCann
f3981bd0c8
Clarify how to fill in init_tok2vec after pretraining (#9639)
* Clarify how to fill in init_tok2vec after pretraining
* Ignore init_tok2vec arg in pretraining
* Update docs, config setting
* Remove obsolete note about not filling init_tok2vec early
This seems to have also caught some lines that needed cleanup.
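The setting being clarified is `init_tok2vec`: after `spacy pretrain`, the path to the pretrained weights file goes into `[initialize.init_tok2vec]`, usually via the `paths.init_tok2vec` variable. A minimal sketch of wiring that up from Python, assuming a config generated with `spacy init config`; the file and directory names are hypothetical:

```python
# Sketch: hand the weights written by `spacy pretrain` to training by
# overriding paths.init_tok2vec, which [initialize.init_tok2vec] reads.
# Paths and file names here are hypothetical.
from spacy.cli.train import train

train(
    "config.cfg",
    output_path="./training",
    overrides={"paths.init_tok2vec": "./pretraining/model999.bin"},
)
```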
2021-11-18 15:38:30 +01:00
Elia Robyn Lake (Robyn Speer)
fa70837f28
clarify how to connect pretraining to training (#9450)
* clarify how to connect pretraining to training
Signed-off-by: Elia Robyn Speer <elia@explosion.ai>
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Elia Robyn Speer <elia@explosion.ai>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-10-22 13:15:47 +02:00
Paul O'Leary McCann
222cf9b6d2
Clarify how to change base Transformer model (#9498)
* Add note about how the model name is used
* Add link to TransformersModel docs, separate paragraph
* Local link
* Revise docs
* Update website/docs/usage/embeddings-transformers.md
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
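The added note is about the `name` setting of the transformer's model: the value is handed to the Hugging Face transformers library, so it can be any model identifier from the model hub or a local path. A sketch of switching the base model, assuming spacy-transformers is installed and that the partial config below is merged onto the factory defaults; `bert-base-cased` is only an example:

```python
# Sketch: swap the base model of the transformer component. The "name" value
# is resolved by Hugging Face transformers, so any hub model name or local
# path should work; the remaining model settings come from the defaults.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("transformer", config={"model": {"name": "bert-base-cased"}})
```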
2021-10-19 23:28:20 +02:00
Sofie Van Landeghem
3fd3531e12
Docs for new spacy-trf architectures (#8954)
* use TransformerModel.v2 in quickstart
* update docs for new transformer architectures
* bump spacy_transformers to 1.1.0
* Add new arguments for spacy-transformers.TransformerModel.v3
* Mention that mixed-precision support is experimental
* Describe the delta between transformers.Tok2VecTransformer versions
* add dot
* add dot, again
* Update some more TransformerModel references v2 -> v3
* Add mixed-precision options to the training quickstart
Disable mixed-precision training/prediction by default.
* Update setup.cfg
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Apply suggestions from code review
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update website/docs/usage/embeddings-transformers.md
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
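One of the arguments new in TransformerModel.v3 is `mixed_precision`, which stays disabled by default. A sketch of switching it on for a training run through a config override, assuming the config's transformer component already uses the v3 architecture and a GPU is available:

```python
# Sketch: enable the experimental mixed-precision path added in
# spacy-transformers.TransformerModel.v3 by overriding the training config.
# Assumes the transformer component in config.cfg already uses the v3
# architecture; mixed precision is only useful on GPU.
from spacy.cli.train import train

train(
    "config.cfg",
    output_path="./training",
    use_gpu=0,
    overrides={"components.transformer.model.mixed_precision": True},
)
```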
2021-10-18 14:15:06 +02:00
Paul O'Leary McCann
1d57d78758
Make docs consistent (fix #9126)
2021-09-16 15:54:12 +09:00
Paul O'Leary McCann
66bfabd839
Fix pretraining objectives fragment (#8005)
* Fix pretraining objectives fragment
The fragment here is reused from a heading higher up, so you couldn't
link to this section.
* Fix section link to new fragment
2021-05-06 08:27:36 +02:00
Adriane Boyd
d2bdaa7823
Replace negative rows with 0 in StaticVectors (#7674)
* Replace negative rows with 0 in StaticVectors
Replace negative row indices with 0-vectors in `StaticVectors`.
* Increase versions related to StaticVectors
* Increase versions of all architectures and layers related to
`StaticVectors`
* Improve efficiency of 0-vector operations
Parallel `spacy-legacy` PR: https://github.com/explosion/spacy-legacy/pull/5
* Update config defaults to new versions
* Update docs
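The user-visible effect of the change: tokens that have no row in the vectors table (a negative row index) are embedded as all-zero vectors instead of picking up an arbitrary row. A rough way to see which tokens that concerns, using any pipeline with static vectors; `en_core_web_md` is only an example:

```python
# Sketch: out-of-vocabulary tokens have no row in the vectors table; the
# updated StaticVectors layer embeds them as zeros rather than reusing an
# arbitrary row. has_vector and vector_norm show which tokens are affected.
import spacy

nlp = spacy.load("en_core_web_md")  # any pipeline with static vectors
doc = nlp("florble is not in the vectors table")
for token in doc:
    print(token.text, token.has_vector, token.vector_norm)
```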
2021-04-22 18:04:15 +10:00
Adriane Boyd
8b76cb8095
Rephrase transformers PyTorch instructions
2021-01-29 13:36:56 +01:00
Adriane Boyd
e3e87e7275
Update transformers install docs
* Recommend installing PyTorch separately
* Add instructions for `sentencepiece`
2021-01-29 13:27:43 +01:00
Adriane Boyd
61c9f8bf24
Remove transformers model max length section (#6807)
2021-01-25 19:59:34 +08:00
Adriane Boyd
bf0cdae8d4
Add token_splitter component (#6726)
* Add long_token_splitter component
Add a `long_token_splitter` component for use with transformer
pipelines. This component splits up long tokens like URLs into smaller
tokens. This is particularly relevant for pretrained pipelines with
`strided_spans`, since the user can't change the length of the span
`window` and may not wish to preprocess the input texts.
The `long_token_splitter` splits tokens that are at least
`long_token_length` characters long into smaller tokens of `split_length`
characters each.
Notes:
* Since this is intended for use as the first component in a pipeline,
the token splitter does not try to preserve any token annotation.
* API docs to come when the API is stable.
* Adjust API, add test
* Fix name in factory
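For reference, the component as it ended up is added as the first pipe roughly like this; the parameter names follow the final `token_splitter` API and the values are only illustrative:

```python
# Sketch: split very long tokens (e.g. URLs) before they reach the
# transformer. Tokens of at least min_length characters are re-tokenized
# into pieces of split_length characters. Values are illustrative.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "token_splitter",
    config={"min_length": 25, "split_length": 10},
    first=True,
)
doc = nlp("See https://example.com/some/very/long/path/that/keeps/on/going")
print([token.text for token in doc])
```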
2021-01-17 19:54:41 +08:00
Sofie Van Landeghem
75d9019343
Fix types of Tok2Vec encoding architectures (#6442)
* fix TorchBiLSTMEncoder documentation
* ensure the types of the encoding Tok2vec layers are correct
* update references from v1 to v2 for the new architectures
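The v2 embedding and encoding architectures referenced here fit together roughly as follows in a standalone tok2vec component; the widths, attributes and row counts are illustrative, not required values:

```python
# Sketch: a tok2vec model combining the v2 embedding and encoding
# architectures. The embed and encode widths have to match.
import spacy

tok2vec_model = {
    "@architectures": "spacy.Tok2Vec.v2",
    "embed": {
        "@architectures": "spacy.MultiHashEmbed.v2",
        "width": 96,
        "attrs": ["NORM", "PREFIX", "SUFFIX", "SHAPE"],
        "rows": [5000, 1000, 2500, 2500],
        "include_static_vectors": False,
    },
    "encode": {
        "@architectures": "spacy.MaxoutWindowEncoder.v2",
        "width": 96,
        "depth": 4,
        "window_size": 1,
        "maxout_pieces": 3,
    },
}
nlp = spacy.blank("en")
nlp.add_pipe("tok2vec", config={"model": tok2vec_model})
```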
2021-01-07 16:39:27 +11:00
Sofie Van Landeghem
82ae95267a
Docs for pretrain architectures (#6605)
* document pretraining architectures
* formatting
* bit more info
* small fixes
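The architectures documented here are the objectives that go under `[pretraining.objective]` in the config. A small sketch of resolving one of them straight from the registry; the hyperparameter values are illustrative:

```python
# Sketch: resolve the character-prediction pretraining objective that is
# normally configured under [pretraining.objective]. Values are illustrative.
from thinc.api import Config
from spacy.util import registry

CONFIG_STR = """
[objective]
@architectures = "spacy.PretrainCharacters.v1"
maxout_pieces = 3
hidden_size = 300
n_characters = 4
"""

objective = registry.resolve(Config().from_str(CONFIG_STR))["objective"]
```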
2021-01-06 16:12:30 +11:00
Ines Montani
c968d1560f
Fix docs example [ci skip]
2020-10-16 11:33:20 +02:00
Ines Montani
ba1e004049
Fix typo [ci skip]
2020-10-15 23:39:04 +02:00
svlandeg
08cb085f6c
Merge remote-tracking branch 'upstream/develop' into fix/various
2020-10-09 17:01:27 +02:00
Ines Montani
9fb3244672
Merge pull request #6231 from adrianeboyd/feature/include-static-vectors
2020-10-09 15:54:52 +02:00
Adriane Boyd
2dd79454af
Update docs
2020-10-09 14:42:07 +02:00
svlandeg
853edace37
fix MultiHashEmbed example in documentation
2020-10-09 14:11:06 +02:00
Ines Montani
e50dc2c1c9
Update docs [ci skip]
2020-10-09 12:04:52 +02:00
Ines Montani
d1602e1ece
Update docs [ci skip]
2020-10-08 11:56:50 +02:00
Ines Montani
43e59bb22a
Update docs and install extras [ci skip]
2020-10-08 10:58:50 +02:00
Ines Montani
01c1538c72
Integrate file readers
2020-10-02 01:36:06 +02:00
Sofie Van Landeghem
a22215f427
Add FeatureExtractor from Thinc (#6170)
* move featureextractor from Thinc
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io>
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Ines Montani <ines@ines.io>
2020-10-01 16:22:48 +02:00
Ines Montani
0a8a124a6e
Update docs [ci skip]
2020-10-01 12:15:53 +02:00
Ines Montani
361f91e286
Merge pull request #6135 from walterhenry/develop-proof
2020-09-29 20:49:06 +02:00
walterhenry
1d80b3dc1b
Proofreading
Finished with the API docs and started on the Usage docs, beginning with Embeddings & Transformers
2020-09-29 12:39:10 +02:00
Sofie Van Landeghem
009ba14aaf
Fix pretraining in train script (#6143)
* update pretraining API in train CLI
* bump thinc to 8.0.0a35
* bump to 3.0.0a26
* doc fixes
* small doc fix
2020-09-25 15:47:10 +02:00
Ines Montani
6836b66433
Update docs and resolve todos [ci skip]
2020-09-24 13:41:25 +02:00
svlandeg
6c85fab316
state_type and extra_state_tokens instead of nr_feature_tokens
2020-09-23 13:35:09 +02:00
Ines Montani
012b3a7096
Update docs [ci skip]
2020-09-20 17:44:58 +02:00
Ines Montani
554c9a2497
Update docs [ci skip]
2020-09-20 12:30:53 +02:00
Ines Montani
a0b4389a38
Update docs [ci skip]
2020-09-17 19:24:48 +02:00
Matthew Honnibal
6efb7688a6
Draft pretrain usage
2020-09-17 18:17:03 +02:00
Ines Montani
a2c8cda26f
Update docs [ci skip]
2020-09-17 17:12:51 +02:00
Matthew Honnibal
ec751068f3
Draft text for static vectors intro
2020-09-17 16:42:53 +02:00
Ines Montani
8b0dabe987
Update docs [ci skip]
2020-09-12 17:05:10 +02:00
Sofie Van Landeghem
8e7557656f
Renaming gold & annotation_setter (#6042)
* version bump to 3.0.0a16
* rename "gold" folder to "training"
* rename 'annotation_setter' to 'set_extra_annotations'
* formatting
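After the rename, a custom callback goes onto the transformer's `set_extra_annotations` hook instead of `annotation_setter`. A minimal sketch; the extension name and the loaded pipeline are illustrative:

```python
# Sketch: a custom set_extra_annotations callback (formerly annotation_setter)
# that stores each Doc's share of the transformer output on a custom
# extension attribute. Extension name and pipeline are illustrative.
import spacy
from spacy.tokens import Doc

Doc.set_extension("trf_data_custom", default=None)

def custom_annotation_setter(docs, trf_data):
    # trf_data is the transformer output for this batch of Docs
    for doc, data in zip(docs, trf_data.doc_data):
        doc._.trf_data_custom = data

nlp = spacy.load("en_core_web_trf")
nlp.get_pipe("transformer").set_extra_annotations = custom_annotation_setter
doc = nlp("This text runs through the shared transformer.")
print(doc._.trf_data_custom)
```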
2020-09-09 10:31:03 +02:00
Ines Montani
23b7d9cfa3
Prefix span getters
2020-09-03 17:37:06 +02:00
Ines Montani
690bd77669
Add todos [ci skip]
2020-09-01 14:04:36 +02:00
svlandeg
e47ea88aeb
revert annotations refactor
2020-08-31 14:40:55 +02:00
svlandeg
c18eb63483
Merge remote-tracking branch 'upstream/develop' into feature/vectors-docs
# Conflicts:
# website/docs/usage/embeddings-transformers.md
2020-08-31 13:21:36 +02:00
Sofie Van Landeghem
ec14744ee4
Rename Transformer listener (#6001)
* rename to spacy-transformers.TransformerListener
* add some more tok2vec tests
* use select_pipes
* fix docs - annotation setter was not changed in the end
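Under the new name, a downstream component listens to the shared transformer roughly like this; a sketch assuming spacy-transformers is installed and that the partial config is merged onto the ner factory defaults:

```python
# Sketch: an ner component whose tok2vec sub-model is the renamed
# spacy-transformers.TransformerListener architecture, reusing the shared
# transformer component added just before it.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("transformer")  # shared transformer with its default model
nlp.add_pipe(
    "ner",
    config={
        "model": {
            "tok2vec": {
                "@architectures": "spacy-transformers.TransformerListener.v1",
                "grad_factor": 1.0,
                "pooling": {"@layers": "reduce_mean.v1"},
            }
        }
    },
)
```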
2020-08-31 12:41:39 +02:00
Ines Montani
bc0730be3f
Update docs [ci skip]
2020-08-29 12:53:14 +02:00
svlandeg
9f00a20ce4
proofreading and custom examples
2020-08-28 21:50:42 +02:00
svlandeg
556e975a30
various fixes
2020-08-27 19:24:44 +02:00
svlandeg
329e490560
small import fixes
2020-08-27 14:50:43 +02:00
svlandeg
28e4ba7270
fix references to TransformerListener
2020-08-27 14:33:28 +02:00
svlandeg
4d37ac3f33
configure_custom_sent_spans example
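The example being added registers a span getter that chunks each sentence into windows; presumably along these lines, where `max_length` is the getter's own argument and the decorator is spacy-transformers' span_getters registry:

```python
# Sketch: a custom span getter that splits each sentence into windows of at
# most max_length tokens, registered so the config can refer to it as
# "custom_sent_spans".
import spacy_transformers

@spacy_transformers.registry.span_getters("custom_sent_spans")
def configure_custom_sent_spans(max_length: int):
    def get_custom_sent_spans(docs):
        spans = []
        for doc in docs:
            spans.append([])
            for sent in doc.sents:
                # step through the sentence in windows of max_length tokens
                for start in range(0, len(sent), max_length):
                    spans[-1].append(sent[start : start + max_length])
        return spans

    return get_custom_sent_spans
```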
2020-08-27 14:14:16 +02:00