Adriane Boyd
27a48f2802
Fix/update extension copying in Span.as_doc and Doc.from_docs (#7574)
...
* Adjust custom extension data when copying user data in `Span.as_doc()`
* Restrict `Doc.from_docs()` to adjusting offsets for custom extension
data
* Update test to use extension
* (Duplicate bug fix for character offset from #7497)
2021-03-30 09:49:12 +02:00
Santiago Castro
af07fc3bc1
Add support for CUDA 11.2 (#7583)
...
* Add support for CUDA 11.2
* Update the docs
* Format
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-03-30 09:47:33 +02:00
Álvaro Abella Bascarán
5b4dde38a3
fix fn name: tokenizer.infixes_finditer -> tokenizer.infix_finditer (#7606)
2021-03-30 09:45:49 +02:00
Adriane Boyd
3ae8661085
Fix tensor retokenization for non-numpy ops (#7527)
...
Implement manual `append` and `delete` for non-numpy ops.
2021-03-29 22:34:48 +11:00
Adriane Boyd
139f655f34
Merge doc.spans in Doc.from_docs() (#7497)
...
Merge data from `doc.spans` in `Doc.from_docs()`.
* Fix internal character offset set when merging empty docs (only
affects tokens and spans in `user_data` if an empty doc is in the list
of docs)
2021-03-29 22:34:01 +11:00
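The character-offset bookkeeping mentioned in this commit can be illustrated with a minimal, library-free sketch. This is not spaCy's implementation: `CharSpan`, `merge_texts`, the single-space separator, and the rule that an empty text contributes neither text nor a separator are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical stand-in for a labelled character span: (start_char, end_char, label).
CharSpan = Tuple[int, int, str]


def merge_texts(
    docs: List[Tuple[str, List[CharSpan]]], sep: str = " "
) -> Tuple[str, List[CharSpan]]:
    """Concatenate texts and shift each span by the offset of its source text."""
    merged = ""
    spans: List[CharSpan] = []
    for text, doc_spans in docs:
        if not text:
            # Assumption: an empty text adds nothing, so the running offset
            # must not advance (not even by a separator).
            continue
        if merged:
            merged += sep
        offset = len(merged)  # where this text starts in the merged string
        for start, end, label in doc_spans:
            spans.append((start + offset, end + offset, label))
        merged += text
    return merged, spans


docs = [
    ("Hello world", [(6, 11, "A")]),
    ("", []),
    ("Good morning", [(0, 4, "B")]),
]
merged, spans = merge_texts(docs)
```

If the empty entry in the middle advanced the offset, every span from later texts would point at the wrong characters, which is the kind of error the commit body describes.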
Adriane Boyd
d59f968d08
Keep sent starts without parse in retokenization (#7424)
...
In the retokenizer, only reset sent starts (with
`set_children_from_head`) if the doc is parsed. If there is no parse,
merged tokens have the unset `token.is_sent_start == None` by default after
retokenization.
2021-03-29 22:32:00 +11:00
Paul O'Leary McCann
faed54d659
Merge pull request #7537 from polm/docs/patience-negative
...
Remove mention of -1 for early stopping (fix #7535)
2021-03-26 21:11:53 +09:00
Paul O'Leary McCann
cdab341a75
Remove mention of -1 for early stopping (fix #7535)
...
Maybe this used to work differently, but currently a negative patience
just causes immediate termination.
2021-03-23 11:50:35 +09:00
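Why a negative patience can stop training immediately is easy to see in a small sketch of a patience check. The loop below is an illustrative assumption, not spaCy's training code: `steps_run` is a hypothetical helper, and treating `0` as "early stopping disabled" is an assumption of the sketch.

```python
from typing import Iterable


def steps_run(scores: Iterable[float], patience: int) -> int:
    """Return how many evaluation steps run before the patience check stops the loop."""
    best = float("-inf")
    best_step = 0
    steps = 0
    for step, score in enumerate(scores):
        steps = step + 1
        if score > best:
            best, best_step = score, step
        # Assumed rule: stop once `patience` steps have passed without improvement.
        # A patience of 0 is falsy and disables the check; any negative value
        # satisfies the comparison on the very first step.
        if patience and (step - best_step) >= patience:
            break
    return steps


scores = [0.1, 0.2, 0.3, 0.25, 0.2, 0.1]
no_early_stop = steps_run(scores, 0)
with_patience = steps_run(scores, 2)
negative = steps_run(scores, -1)
```

With these scores, the disabled check runs all six steps, a patience of 2 stops two steps after the best score, and a patience of -1 stops after the first step.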
Ines Montani
4bd3d01aaf
Merge pull request #7471 from polm/fix/listener-warnings
2021-03-22 12:45:02 +01:00
Ines Montani
d545ab4ca4
Merge pull request #7495 from adrianeboyd/bugfix/norm-ux
...
Update lexeme_norm checks
2021-03-22 12:44:52 +01:00
Ines Montani
be55f43163
Merge pull request #7473 from adrianeboyd/docs/v3-pipeline-deps-order
2021-03-22 12:43:07 +01:00
Ines Montani
3ee2fcfba0
Merge pull request #7483 from adrianeboyd/docs/various-v3-4 [ci skip]
2021-03-22 12:37:06 +01:00
Ines Montani
88e5a0dc16
Merge pull request #7504 from polm/fix/lexeme-docs [ci skip]
...
Fix mismatched backtick in Lexeme docs
2021-03-22 12:36:44 +01:00
Ines Montani
66ebd5c69e
Merge pull request #7491 from adrianeboyd/bugfix/corpus-depr-props
...
Update deprecated doc.is_sentenced in Corpus
2021-03-21 02:17:24 +01:00
Ines Montani
e3c3dbdb15
Merge pull request #7492 from adrianeboyd/bugfix/ux-matcher-attributes
...
Update matcher errors and docs
2021-03-21 02:17:13 +01:00
Adriane Boyd
0d2b723e8d
Update entity setting section
2021-03-20 11:38:55 +01:00
Paul O'Leary McCann
e39c0dcf33
Fix mismatched backtick in Lexeme docs
2021-03-20 18:40:00 +09:00
Adriane Boyd
39153ef90f
Update lexeme_norm checks
...
* Add util method for check
* Add new languages to list with lexeme norm tables
* Add check to all relevant components
* Add config details to warning message
Note that we're not actually inspecting the model config to see if
`NORM` is used as an attribute, so it may warn in cases where it's not
relevant.
2021-03-19 10:59:27 +01:00
Adriane Boyd
c771ec22f0
Update matcher errors and docs
...
* Mention `tagger+attribute_ruler` in `POS`/`MORPH` error messages for
`Matcher` and `PhraseMatcher`
* Document `Matcher.__call__(allow_missing=)`
2021-03-19 10:11:18 +01:00
Adriane Boyd
48b90c8e1c
Update deprecated doc.is_sentenced in Corpus
2021-03-19 09:43:52 +01:00
Adriane Boyd
6a9a467766
Update website/docs/usage/processing-pipelines.md
...
Co-authored-by: Ines Montani <ines@ines.io>
2021-03-19 08:12:49 +01:00
Ines Montani
34e13c1161
Merge pull request #7472 from erre-quadro/universe/spikex
...
Add SpikeX to spaCy universe
2021-03-19 02:08:36 +01:00
Ines Montani
4f9aaa2366
Merge pull request #7451 from adrianeboyd/chore/add-py.typed
...
Add py.typed
2021-03-19 02:08:16 +01:00
Ines Montani
66b900a76d
Merge pull request #7440 from adrianeboyd/bugfix/ru-pymorph2-lookup-lemmatize
...
Rename and update Russian pymorphy2 lookup lemmatize
2021-03-19 01:54:08 +01:00
Ines Montani
2c6fa8c890
Merge pull request #7489 from adrianeboyd/bugfix/callbacks-entry-points
...
Check for callbacks entry points
2021-03-19 01:53:53 +01:00
Ines Montani
b878bc74b9
Merge pull request #7488 from Findus23/no-is-not
...
replace "is not" with !=
2021-03-19 01:53:38 +01:00
Adriane Boyd
0ad9e16ec3
Check for callbacks entry points
2021-03-18 21:18:25 +01:00
Lukas Winkler
3c362ac520
replace "is not" with !=
2021-03-18 21:09:11 +01:00
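As general Python background for this change (the commit itself does not show the affected code): `is not` compares object identity, while `!=` compares values, so the two can disagree for equal but distinct objects. The variable names below are illustrative only.

```python
# Two separately constructed lists with the same contents.
a = [1, 2, 3]
b = [1, 2, 3]

# Identity check: True, because `a` and `b` are different objects.
differs_by_identity = a is not b

# Value check: False, because the contents are equal.
differs_by_value = a != b
```

Using `is not` where a value comparison is intended can therefore report a difference that does not exist; recent CPython versions also emit a `SyntaxWarning` when `is` or `is not` is used with a literal.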
Adriane Boyd
6354b642c5
Fix typo
2021-03-18 19:01:10 +01:00
Adriane Boyd
40e5d3a980
Update saving/loading example
2021-03-18 16:56:10 +01:00
Adriane Boyd
0fb1881f36
Reformat processing pipelines
2021-03-18 13:31:42 +01:00
Adriane Boyd
acc58719da
Update custom similarity hooks example
2021-03-18 13:31:42 +01:00
Adriane Boyd
c9e1a9ac17
Add multiprocessing section
2021-03-18 13:31:42 +01:00
Adriane Boyd
9a254d3995
Include all en_core_web_sm components in examples
2021-03-18 13:31:42 +01:00
Adriane Boyd
83c1b919a7
Fix positional/option in CLI types
2021-03-18 13:31:42 +01:00
Adriane Boyd
9fd41d6742
Remove Language.pipe cleanup arg
2021-03-18 13:31:42 +01:00
Paul O'Leary McCann
40bc01e668
Proactively remove unused listeners
...
With this, the changes in initialize.py might be unnecessary.
Requires testing.
2021-03-17 22:41:41 +09:00
Adriane Boyd
5da323fd86
Minor edits
2021-03-17 12:59:05 +01:00
Adriane Boyd
a5ffe8dfed
Add details about pretrained pipeline design
2021-03-17 11:31:26 +01:00
Paul O'Leary McCann
ef77c88638
Don't warn about components not in the pipeline
...
See here:
https://github.com/explosion/spaCy/discussions/7463
Still need to check if there are any side effects of listeners being
present but not in the pipeline, but this commit will silence the
warnings.
2021-03-17 14:56:04 +09:00
Paolo Arduin
00e59be966
Add SpikeX to spaCy universe
2021-03-16 18:22:03 +01:00
Adriane Boyd
02b5c8a1a2
Add py.typed
2021-03-16 09:48:31 +01:00
Adriane Boyd
3bcf74aca7
Rename and update ru pymorphy2 lookup lemmatize
...
* To allow default lookup lemmatization with a blank Russian model,
rename pymorphy2 lookup mode to `pymorphy2_lookup`
* Bug fix: update pymorphy2 lookup lemmatize to return list rather than
string
2021-03-15 11:11:06 +01:00
bsweileh
61472e7cb3
Update _training.md - Fix broken link on backpropagation (#7431)
...
* Update _training.md
Fix broken link on backpropagation
* Add agreement
add spacy contributor agreement
2021-03-15 09:21:35 +01:00
Ines Montani
be44257cab
Merge pull request #7418 from adrianeboyd/docs/examples-readme
...
Add examples README
2021-03-13 04:28:07 +01:00
Ines Montani
c67d5a6eb0
Merge pull request #7394 from adrianeboyd/docs/ner-example-data-readme
2021-03-13 04:26:18 +01:00
Ines Montani
068b97a617
Merge pull request #7408 from adrianeboyd/bugfix/load-keyword-only
2021-03-13 04:25:50 +01:00
Ines Montani
3466a11e72
Merge pull request #7421 from adrianeboyd/bugfix/cli-code-arg
2021-03-13 04:25:17 +01:00
Adriane Boyd
3168103605
Fix type of spacy train --output in docs
2021-03-12 10:04:57 +01:00
Adriane Boyd
03e9e7b567
Add --code option to init fill-config
2021-03-12 10:03:57 +01:00