Ines Montani
f78b91c03b
Update links [ci skip]
2023-12-11 15:51:01 +01:00
Daniël de Kok
42fe4edfd7
Add distillation tests with max cut size
...
And fix endless loop when the max cut size is 0 or 1.
2023-12-08 20:38:01 +01:00
Daniël de Kok
e2591cda36
isort
2023-12-08 20:24:09 +01:00
Daniël de Kok
e5ec45cb7e
Revert "Merge the parser refactor into v4 ( #10940 )"
...
This reverts commit a183db3cef.
2023-12-08 20:23:08 +01:00
Daniël de Kok
05803cfe76
Revert "Reimplement distillation with oracle cut size ( #12214 )"
...
This reverts commit e27c60a702.
2023-12-08 14:38:05 +01:00
Raphael Mitsch
9fcd2bfa08
Add info on endpoint arg. ( #13169 )
2023-12-05 12:46:29 +01:00
Raphael Mitsch
a25a3b996b
Merge pull request #13173 from explosion/docs/llm_main
...
Sync `llm_develop` with `llm_main`
2023-12-04 16:46:21 +01:00
Raphael Mitsch
55ed2b4e82
Add documentation for EL task ( #12988 )
...
* Add documentation for EL task.
* Fix EL factory name.
* Add llm_entity_linker_mentio.
* Apply suggestions from code review
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Update EL task docs.
* Apply suggestions from code review
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Incorporate feedback.
* Format.
* Fix link to KB data.
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2023-12-04 15:23:28 +01:00
Adriane Boyd
e467573550
Docs: update trf_data examples and pipeline design info ( #13164 )
2023-12-04 15:15:54 +01:00
Raphael Mitsch
0e43fca036
Add Claude-2.1 mention. ( #13167 )
2023-12-01 16:48:35 +01:00
Daniël de Kok
da7ad97519
Update TextCatBOW to use the fixed SparseLinear layer ( #13149 )
...
* Update `TextCatBOW` to use the fixed `SparseLinear` layer
A while ago, we fixed the `SparseLinear` layer to use all available
parameters: https://github.com/explosion/thinc/pull/754
This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.
While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.
* Replace TextCatBOW `length_exponent` parameter by `length`
We now round up the length to the next power of two if it isn't
a power of two.
* Remove some tests for TextCatBOW.v2
* Fix missing import
2023-11-29 09:11:54 +01:00
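The length rounding described in the `TextCatBOW` commit above can be sketched in a few lines. This is only an illustration of "round up to the next power of two", not the code from the commit; the function name `next_power_of_two` is hypothetical.

```python
def next_power_of_two(length: int) -> int:
    """Round a positive length up to the nearest power of two.

    Hypothetical helper for illustration; not taken from spaCy or Thinc.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    # (length - 1).bit_length() is the number of bits needed to represent
    # length - 1, so shifting 1 by that amount yields the smallest power
    # of two that is greater than or equal to length.
    return 1 << (length - 1).bit_length()
```

A length that is already a power of two is returned unchanged, so the rounding only affects values that would otherwise give the non-uniform parameter distribution mentioned in the commit message.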
Ines Montani
bf7c2ea99a
Add merch link [ci skip]
2023-11-22 12:55:00 +01:00
Ines Montani
8f69e56a5a
Add swag [ci skip]
2023-11-20 14:42:01 +01:00
Lise
b6e022381d
Feature/nn and fo language extensions ( #13116 )
...
* add language extensions for norwegian nynorsk and faroese
* update docstring for nn/examples.py
* use relative imports
* add fo and nn tokenizers to pytest fixtures
* add unittests for fo and nn and fix bug in nn
* remove module docstring from fo/__init__.py
* add comments about example sentences' origin
* add license information to faroese data credit
* format unittests using black
* add __init__ files to tests/lang/nn and tests/lang/fo
* fix import order and use relative imports in fo/__init__.py and nn/__init__.py
* Make the tests a bit more compact
* Add fo and nn to website languages
* Add note about jul.
* Add "jul." as exception
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-11-20 07:49:59 +01:00
ajbond
9f2ce6bb00
Add Redfield NLP Nodes to the spaCy Universe ( #13133 )
2023-11-17 09:48:02 +01:00
Madeesh Kannan
bd2c17e206
Warn about reloading dependencies after downloading models ( #13081 )
...
* Update the "Missing factory" error message
This accounts for model installations that took place during the current Python session.
* Add a note about Jupyter notebooks
* Move error to `spacy.cli.download`
Add extra message for Jupyter sessions
* Add additional note for interactive sessions
* Remove note about `spacy-transformers` from error message
* `isort`
* Improve checks for colab (also helps displacy)
* Update warning messages
* Improve flow for multiple checks
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-11-10 08:05:07 +01:00
Raphael Mitsch
b2e831d966
LLM docs: OpenAI model update ( #13119 )
...
* Update supported OpenAI models.
* Update with new GPT-3.5 and GPT-4 versions.
* Add links to OpenAI model docs.
2023-11-08 17:55:16 +01:00
Adriane Boyd
513bbd5fa3
Add preferred use of build for package CLI ( #13109 )
...
Build with `build` if available. Warn and fall back to the previous
`setup.py`-based builds if the `build` build fails.
2023-11-08 17:35:24 +01:00
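The "prefer `build`, warn and fall back" behaviour described in the commit above could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the commit: the function `build_package` and its `run` hook are hypothetical, and only the two command lines (`python -m build --sdist` and `python setup.py sdist`) are standard invocations.

```python
import sys
import warnings
from typing import Callable, List, Sequence


def build_package(run: Callable[[Sequence[str]], int]) -> List[str]:
    """Try `python -m build` first; warn and fall back to setup.py on failure.

    `run` is a hypothetical hook that executes a command and returns its
    exit code. The command that was ultimately used is returned.
    """
    preferred = [sys.executable, "-m", "build", "--sdist"]
    fallback = [sys.executable, "setup.py", "sdist"]
    # A zero exit code means the preferred builder succeeded.
    if run(preferred) == 0:
        return preferred
    # Any non-zero exit code (including "module not installed") triggers
    # the warning and the setup.py-based fallback.
    warnings.warn("`build` failed or is unavailable; falling back to setup.py")
    run(fallback)
    return fallback
```

In real use, `run` could be something like `lambda cmd: subprocess.run(cmd).returncode`; passing it in as a parameter keeps the decision logic easy to exercise without actually building anything.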
Ridge Kimani
2b8da84717
feat: add extra lexical attributes ( #13106 )
...
Co-authored-by: Ridge Kimani <ridgekimani@gmail.com>
2023-11-08 17:29:11 +01:00
Adriane Boyd
0c25725359
Update Tokenizer.explain for special cases with whitespace ( #13086 )
...
* Update Tokenizer.explain for special cases with whitespace
Update `Tokenizer.explain` to skip special case matches if the exact
text has not been matched due to intervening whitespace.
Enable fuzzy `Tokenizer.explain` tests with additional whitespace
normalization.
* Add unit test for special cases with whitespace, xfail fuzzy tests again
2023-11-06 17:29:59 +01:00
Adriane Boyd
ff9ddb6a07
Unskip python 3.12 remote tests ( #13110 )
2023-11-06 11:59:45 +01:00
Adriane Boyd
c096c5c0c9
Update for numpy 2.0 deprecations ( #13103 )
...
- Replace `np.trapz` with vendored `trapezoid` from scipy
- Replace `np.float_` with `np.float64`
2023-11-06 08:47:53 +01:00
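For readers unfamiliar with what `np.trapz` computed, the trapezoidal rule it implemented can be sketched in plain Python as below. This is a simplified one-dimensional illustration, not the vendored scipy `trapezoid` function named in the commit, which works on arrays and supports further options.

```python
from typing import Sequence


def trapezoid(y: Sequence[float], x: Sequence[float]) -> float:
    """Integrate y over x with the composite trapezoidal rule (1-D only).

    Illustrative stand-in; the function vendored in the commit comes
    from scipy and is more general.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0.0
    for i in range(1, len(x)):
        # Area of one trapezoid: width times the mean of the two heights.
        total += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return total
```

The second item in the commit is a pure rename: `np.float_` was an alias for `np.float64`, so switching to the explicit name does not change behaviour.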
Adriane Boyd
92f1d0a195
CI: Switch to stable python 3.12 and limit 3.11 runs ( #13104 )
2023-11-03 15:46:03 +01:00
Raphael Mitsch
c4e2daf6ef
Fix displacy span stacking ( #13068 )
...
* Fix displacy span stacking.
* Format. Remove counter.
* Remove test files.
* Add unit test. Refactor to allow for unit test.
* Fix off-by-one error in tests.
2023-11-02 12:02:18 +01:00
Sofie Van Landeghem
a804b83a4b
Update llm docs to clarify task-specific factories ( #13082 )
...
* fix typo
* add examples to specify custom model for task-specific factory
2023-10-31 22:07:07 +01:00
Sofie Van Landeghem
48248c62b6
Clarify EL example in docs ( #13071 )
...
* add comment that pipeline is a custom one
* add link to NEL tutorial
* prettier
* revert prettier reformat
* revert prettier reformat (2)
* fix typo
Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>
---------
Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>
2023-10-31 21:58:29 +01:00
Raphael Mitsch
0c15876502
Fix spancat typo. ( #13095 )
2023-10-31 13:45:10 +01:00
Raphael Mitsch
9deaac9786
Add note in docs on score_weight config if using a non-default spans_key for SpanCat ( #13093 )
...
* Add note on score_weight if using a non-default spans_key for SpanCat.
* Fix formatting.
* Fix formatting.
* Fix typo.
* Use warning infobox.
* Fix infobox formatting.
2023-10-30 17:02:08 +01:00
Sofie Van Landeghem
d717123819
Update LICENSE ( #13078 )
2023-10-23 11:59:18 +02:00
Adriane Boyd
a89eae9283
Set version to v3.7.2 ( #13066 )
2023-10-16 15:10:55 +02:00
Sofie Van Landeghem
699dd8b3b7
Update __all__ fields ( #13063 )
...
* update all for pipeline.init
* add all in training.init
* add all in kb.init
* alphabetically
2023-10-16 10:17:47 +02:00
Adriane Boyd
ea1befa8ff
Support Any comparisons for Token and Span ( #13058 )
...
* Support Any comparisons for Token and Span
* Preserve previous behavior for None
2023-10-12 11:53:33 +02:00
Raphael Mitsch
d72029d9c8
Add binary examples for Textcat task in spacy-llm ( #13051 )
...
* Add examples for binary classification.
* Fix example.
* Remove binary textcat example. Format.
* Rephrase.
2023-10-11 12:23:38 +02:00
Adriane Boyd
77c568e524
Restore spacy.cli.project API ( #13053 )
...
* Restore spacy.cli.project API
* Fix typing errors, add simple import test
2023-10-10 15:35:25 +02:00
Ines Montani
65e7bd54f5
Update usage sidebar and nav alert [ci skip]
2023-10-06 14:36:37 +02:00
Ines Montani
b83f1e3724
Inline displaCy visualizations in docs ( #13050 ) [ci skip]
2023-10-06 14:22:43 +02:00
Raphael Mitsch
df07c4734b
Merge pull request #13046 from explosion/docs/llm_main
...
Sync `docs/llm_develop` with `docs/llm_main`
2023-10-05 16:31:20 +02:00
Raphael Mitsch
030d63ad73
Merge pull request #13045 from explosion/master
...
Sync `docs/llm_main` with `master`
2023-10-05 16:28:19 +02:00
Raphael Mitsch
be29216fe2
Merge pull request #13044 from explosion/docs/llm_main
...
Sync `master` with `docs/llm_main`
2023-10-05 16:10:19 +02:00
Raphael Mitsch
1162fcf099
Add Mistral mentions. ( #13037 )
2023-10-05 14:44:38 +02:00
Raphael Mitsch
862f8254e8
Add docs on Azure OpenAI support in spacy-llm ( #13043 )
...
* Add gpt-3.5-turbo-instruct to list of supported OpenAI models.
* Update `spacy-llm` task argument docs w.r.t. task refactoring (#12995 )
* Update task arguments w.r.t. task refactoring in 0.5.0.
* Add disclaimer w.r.t. gated models/Llama 2.
* Update website/docs/api/large-language-models.mdx
* Update website/docs/api/large-language-models.mdx
* Update docs w.r.t. PaLM support. (#13018 )
* Add info on spacy.Azure.v1.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Attempt to fix netlify check fails.
* Format.
2023-10-05 13:18:27 +02:00
Raphael Mitsch
1dec138e61
Update docs w.r.t. PaLM support. ( #13018 )
2023-10-05 08:50:41 +02:00
Adriane Boyd
6e54360a3d
Remove pathy dependency, update docs for cloudpathlib in Weasel ( #13035 )
2023-10-05 08:50:22 +02:00
Raphael Mitsch
734826db79
Update spacy-llm task argument docs w.r.t. task refactoring ( #12995 )
...
* Update task arguments w.r.t. task refactoring in 0.5.0.
* Add disclaimer w.r.t. gated models/Llama 2.
* Update website/docs/api/large-language-models.mdx
* Update website/docs/api/large-language-models.mdx
2023-10-05 08:45:25 +02:00
Raphael Mitsch
829613b959
Merge pull request #12999 from rmitsch/docs/gpt-3.5-turbo-instruct
...
Add `gpt-3.5-turbo-instruct` to list of supported OpenAI models
2023-10-05 08:41:07 +02:00
Adriane Boyd
9d036607f1
Set version to v3.7.1 ( #13042 )
2023-10-04 18:13:12 +02:00
Adriane Boyd
aec59c0088
Merge pull request #13040 from adrianeboyd/revert/12962-spacy-info
...
Revert "Load the cli module lazily for spacy.info (#12962 )"
2023-10-04 17:20:32 +02:00
Adriane Boyd
6d0185f7fb
Revert "Load the cli module lazily for spacy.info ( #12962 )"
...
This reverts commit beda27a91e.
2023-10-04 12:33:33 +02:00
Adriane Boyd
92ce32aa3f
Update binder version to v3.7 ( #13034 )
2023-10-02 12:53:46 +02:00
Adriane Boyd
160e61772e
Docs for v3.7.0 ( #13029 )
...
* Docs for v3.7.0
* Minor fixes
* Extend Weasel notes
* Minor edits
* Update version in README
2023-10-01 21:40:07 +02:00