Commit Graph

2472 Commits

Author SHA1 Message Date
Ines Montani
a8a1231ccd Update README and docs [ci skip] 2021-01-31 12:36:04 +11:00
Ines Montani
45c551037d Update CLI docs [ci skip] 2021-01-30 21:50:23 +11:00
Ines Montani
ae07416fda Merge branch 'website/v3-launch' into develop 2021-01-30 20:31:06 +11:00
Ines Montani
d07683873f Merge branch 'master' into develop 2021-01-30 20:28:14 +11:00
Ines Montani
8626b82e49 Update images [ci skip] 2021-01-30 18:50:25 +11:00
Ines Montani
44dc987d85 Fix icon [ci skip] 2021-01-30 18:27:55 +11:00
Ines Montani
8d293a4c4b Update website to support legacy state [ci skip] 2021-01-30 18:27:31 +11:00
Ines Montani
d3350afe45 Update docs and add support for legacy style 2021-01-30 17:43:12 +11:00
Ines Montani
2332c4280b Update and use unified --build option 2021-01-30 13:11:36 +11:00
Ines Montani
2609ba4e89 Support building wheel in spacy package 2021-01-30 11:54:02 +11:00
Ines Montani
95e958a229
Merge pull request #6852 from explosion/feature/replace-listeners 2021-01-30 00:58:08 +11:00
Ines Montani
7694f76dd1 Update warning and mention replace_listeners 2021-01-29 23:46:01 +11:00
Adriane Boyd
8b76cb8095 Rephrase transformers PyTorch instructions 2021-01-29 13:36:56 +01:00
Ines Montani
095055ac48
Merge pull request #6855 from adrianeboyd/docs/trf-sentencepiece [ci skip]
Update transformers install docs
2021-01-29 23:34:01 +11:00
Adriane Boyd
e3e87e7275 Update transformers install docs
* Recommend installing PyTorch separately
* Add instructions for `sentencepiece`
2021-01-29 13:27:43 +01:00
Ines Montani
e766e8c56d
Apply suggestions from code review
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-01-29 21:41:17 +11:00
svlandeg
d7d838281c adding new="3" mentions in the doc 2021-01-29 11:26:37 +01:00
Ines Montani
99af9e7125 Update documentation 2021-01-29 18:45:48 +11:00
Sofie Van Landeghem
24a697abb8
avoid empty aliases and improve UX and docs (#6840) 2021-01-29 08:51:40 +08:00
Sofie Van Landeghem
837a4f53c2
Error handling in nlp.pipe (#6817)
* add error handler for pipe methods

* add unit tests

* remove pipe methods that are the same as those of their base class

* have Language keep track of a default error handler

* cleanup

* formatting

* small refactor

* add documentation
2021-01-29 08:51:21 +08:00
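The idea behind the error handling in #6817 can be pictured with a small, library-free sketch. Everything in it (the `pipe` function, the `ErrorHandler` signature, `make_collecting_handler`) is a hypothetical illustration and not spaCy's implementation; it only shows a pipe method handing failures to a configurable handler instead of aborting the whole stream.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical handler signature: (component name, failing item, exception).
ErrorHandler = Callable[[str, str, Exception], None]


def raise_error(name: str, item: str, exc: Exception) -> None:
    """Default handler: re-raise, so behavior matches having no handler."""
    raise exc


def make_collecting_handler(log: List[Tuple[str, str, str]]) -> ErrorHandler:
    """Build a handler that records the failure and lets processing continue."""
    def handler(name: str, item: str, exc: Exception) -> None:
        log.append((name, item, str(exc)))
    return handler


def pipe(
    name: str,
    func: Callable[[str], str],
    items: Iterable[str],
    error_handler: ErrorHandler = raise_error,
) -> Iterator[str]:
    """Apply func to each item, routing exceptions through error_handler."""
    for item in items:
        try:
            yield func(item)
        except Exception as exc:
            # The handler decides: re-raise to stop, or return to skip the item.
            error_handler(name, item, exc)


def shout(text: str) -> str:
    """Toy component that fails on empty input."""
    if not text:
        raise ValueError("empty text")
    return text.upper()


errors: List[Tuple[str, str, str]] = []
results = list(pipe("shout", shout, ["a", "", "b"], make_collecting_handler(errors)))
```

With the collecting handler, `results` holds the two successful outputs and `errors` holds one entry for the empty string; with the default `raise_error`, the same input stops at the first failure.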
Ines Montani
ec5f55aa5b
Update config generation defaults and transformers (#6832) 2021-01-27 23:56:33 +11:00
Adriane Boyd
4096a79de7
Add alignment mode error and fix Doc.char_span docs (#6820)
* Raise an error on an unrecognized alignment mode rather than
defaulting to `strict`
* Fix the `Doc.char_span` API doc alignment mode details
2021-01-27 23:40:42 +11:00
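The behavior change in #6820 (raise on an unrecognized alignment mode rather than falling back to `strict`) can be sketched without spaCy. The function below is a simplified stand-in that works on plain `(start_char, end_char)` token offsets and returns token indices; its name, signature, and return type are assumptions for illustration and differ from the real `Doc.char_span`. The `contract` and `expand` mode names follow spaCy's documented options.

```python
from typing import List, Optional, Tuple

ALIGNMENT_MODES = ("strict", "contract", "expand")


def char_span(
    token_offsets: List[Tuple[int, int]],
    start: int,
    end: int,
    alignment_mode: str = "strict",
) -> Optional[Tuple[int, int]]:
    """Map a character range to a (first_token, end_token_exclusive) range.

    token_offsets holds (start_char, end_char) per token, in document order.
    Returns None when no token range satisfies the chosen mode.
    """
    if alignment_mode not in ALIGNMENT_MODES:
        # Fail loudly instead of silently treating the value as "strict".
        raise ValueError(
            f"Unknown alignment mode {alignment_mode!r}; "
            f"expected one of {ALIGNMENT_MODES}"
        )
    if alignment_mode == "strict":
        # Both boundaries must coincide exactly with token boundaries.
        starts = [i for i, (s, _) in enumerate(token_offsets) if s == start]
        ends = [i for i, (_, e) in enumerate(token_offsets) if e == end]
        if not starts or not ends or ends[0] < starts[0]:
            return None
        return starts[0], ends[0] + 1
    if alignment_mode == "contract":
        # Keep only tokens that lie completely inside the character range.
        hits = [i for i, (s, e) in enumerate(token_offsets) if s >= start and e <= end]
    else:  # "expand"
        # Keep every token that overlaps the character range at all.
        hits = [i for i, (s, e) in enumerate(token_offsets) if s < end and e > start]
    if not hits:
        return None
    return hits[0], hits[-1] + 1


# Token offsets for the text "Hello big world".
offsets = [(0, 5), (6, 9), (10, 15)]
exact = char_span(offsets, 6, 9)                # aligned with the token "big"
misaligned = char_span(offsets, 7, 12)          # strict: no match
expanded = char_span(offsets, 7, 12, "expand")  # widened to "big world"
```

Passing a typo such as `"strcit"` raises `ValueError` here, which surfaces the mistake immediately instead of quietly applying strict alignment.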
Ines Montani
230e651ad6 Merge branch 'develop' into master-tmp 2021-01-27 13:26:29 +11:00
Ines Montani
634ae609b4 Adjust formatting [ci skip] 2021-01-27 13:08:00 +11:00
Ines Montani
d5ef245bb1
Merge pull request #6822 from jganseman/master [ci skip] 2021-01-27 13:04:30 +11:00
Ines Montani
5d79d1af50
Merge pull request #6796 from svlandeg/docs/benchmarks [ci skip] 2021-01-27 13:01:23 +11:00
Ines Montani
1ed7029d47 Update website for v3 launch 2021-01-27 12:39:47 +11:00
Adriane Boyd
c447aa2b98 Update --code arg in evaluate CLI docs 2021-01-26 15:30:46 +01:00
jganseman
907bce7a78
Merge pull request #1 from jganseman/patch-1
Patch 1
2021-01-26 11:12:30 +01:00
jganseman
8bc57ec372
also update is_oov in lexeme docs 2021-01-26 11:09:16 +01:00
jganseman
1f2b0ec168
proposing a more concise explanation for is_oov
2021-01-26 10:53:39 +01:00
Matthew Honnibal
f049df1715
Revert "Set annotations in update" (#6810)
* Revert "Set annotations in update (#6767)"

This reverts commit e680efc7cc.

* Fix version

* Update spacy/pipeline/entity_linker.py

* Update spacy/pipeline/entity_linker.py

* Update spacy/pipeline/tagger.pyx

* Update spacy/pipeline/tok2vec.py

* Update spacy/pipeline/tok2vec.py

* Update spacy/pipeline/transition_parser.pyx

* Update spacy/pipeline/transition_parser.pyx

* Update website/docs/api/multilabel_textcategorizer.md

* Update website/docs/api/tok2vec.md

* Update website/docs/usage/layers-architectures.md

* Update website/docs/usage/layers-architectures.md

* Update website/docs/api/transformer.md

* Update website/docs/api/textcategorizer.md

* Update website/docs/api/tagger.md

* Update spacy/pipeline/entity_linker.py

* Update website/docs/api/sentencerecognizer.md

* Update website/docs/api/pipe.md

* Update website/docs/api/morphologizer.md

* Update website/docs/api/entityrecognizer.md

* Update spacy/pipeline/entity_linker.py

* Update spacy/pipeline/multitask.pyx

* Update spacy/pipeline/tagger.pyx

* Update spacy/pipeline/tagger.pyx

* Update spacy/pipeline/textcat.py

* Update spacy/pipeline/textcat.py

* Update spacy/pipeline/textcat.py

* Update spacy/pipeline/tok2vec.py

* Update spacy/pipeline/trainable_pipe.pyx

* Update spacy/pipeline/trainable_pipe.pyx

* Update spacy/pipeline/transition_parser.pyx

* Update spacy/pipeline/transition_parser.pyx

* Update website/docs/api/entitylinker.md

* Update website/docs/api/dependencyparser.md

* Update spacy/pipeline/trainable_pipe.pyx
2021-01-25 22:18:45 +08:00
Adriane Boyd
61c9f8bf24
Remove transformers model max length section (#6807) 2021-01-25 19:59:34 +08:00
muratjumashev
7d0154a36e Added language meta data 2021-01-25 00:42:19 +06:00
svlandeg
56064faed9 update caption 2021-01-23 00:57:00 +01:00
svlandeg
d7c0f40a96 update comment 2021-01-22 18:55:18 +01:00
svlandeg
a071279bc7 add speed comparison to docs 2021-01-22 18:46:35 +01:00
svlandeg
b132cb3036 update accuracies for new a1 models 2021-01-21 20:24:05 +01:00
Adriane Boyd
d0236136a2
Fix default config init in Transformer API docs (#6781) 2021-01-21 23:18:03 +08:00
Sofie Van Landeghem
e680efc7cc
Set annotations in update (#6767)
* bump to 3.0.0rc4

* do set_annotations in component update calls

* update docs and remove set_annotations flag

* fix EL test
2021-01-20 11:49:25 +11:00
Sofie Van Landeghem
57640aa838
warn when frozen components break listener pattern (#6766)
* warn when frozen components break listener pattern

* few notes in the documentation

* update arg name

* formatting

* cleanup

* specify listeners return type
2021-01-20 11:12:35 +11:00
Ines Montani
4a1029a9b6 Add infobox [ci skip] 2021-01-19 19:18:39 +11:00
Adriane Boyd
7cd5c9e098 Add xx_sent_ud_sm model to website 2021-01-19 09:02:35 +01:00
Ines Montani
76e25afcd7
Merge pull request #6757 from adrianeboyd/docs/mk-ru-langs [ci skip]
Update languages for website
2021-01-19 11:10:48 +11:00
Ines Montani
f50502dad7 Update docs [ci skip] 2021-01-19 00:22:47 +11:00
Adriane Boyd
e8f6400923 Update languages for website
* Add Macedonian
* Add Russian dependencies
* Switch Chinese dependency to spacy-pkuseg
2021-01-18 14:09:34 +01:00
Ines Montani
2ae8dfbb93 Fix website [ci skip] 2021-01-18 22:31:32 +11:00
Ines Montani
09cacbb7ee Fix website [ci skip] 2021-01-18 11:37:04 +11:00
Sofie Van Landeghem
fed8f48965
raise NotImplementedError when noun_chunks iterator is not implemented (#6711)
* raise NotImplementedError when noun_chunks iterator is not implemented

* bring back, fix and document span.noun_chunks

* formatting

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2021-01-17 19:56:05 +08:00
Adriane Boyd
bf0cdae8d4
Add token_splitter component (#6726)
* Add long_token_splitter component

Add a `long_token_splitter` component for use with transformer
pipelines. This component splits up long tokens like URLs into smaller
tokens. This is particularly relevant for pretrained pipelines with
`strided_spans`, since the user can't change the length of the span
`window` and may not wish to preprocess the input texts.

The `long_token_splitter` splits tokens that are at least
`long_token_length` characters long into smaller tokens of `split_length`
characters.

Notes:

* Since this is intended for use as the first component in a pipeline,
the token splitter does not try to preserve any token annotation.
* API docs to come when the API is stable.

* Adjust API, add test

* Fix name in factory
2021-01-17 19:54:41 +08:00
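The splitting rule described in #6726 can be sketched on plain strings. This is a minimal illustration, not the component's code: the function name `split_long_tokens` is hypothetical, the parameter names are borrowed from the commit message, and lengths are assumed to be counted in characters.

```python
from typing import List


def split_long_tokens(
    tokens: List[str], *, long_token_length: int, split_length: int
) -> List[str]:
    """Split every token of at least long_token_length characters
    into consecutive pieces of at most split_length characters."""
    if split_length < 1:
        raise ValueError("split_length must be a positive integer")
    result: List[str] = []
    for token in tokens:
        if len(token) >= long_token_length:
            # Slice the token into fixed-size chunks; the last one may be shorter.
            result.extend(
                token[i : i + split_length]
                for i in range(0, len(token), split_length)
            )
        else:
            # Short tokens pass through unchanged.
            result.append(token)
    return result


pieces = split_long_tokens(
    ["see", "https://example.com/a/b"], long_token_length=10, split_length=8
)
```

Here the URL is broken into `https://`, `example.` and `com/a/b`, while `see` is left alone. Concatenating the pieces of a split token reproduces the original text, so no characters are lost.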