Adriane Boyd
740c33fe58
Merge remote-tracking branch 'upstream/develop' into chore/update-v4-from-develop
2022-08-24 20:43:07 +02:00
Adriane Boyd
81874265e9
Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.5-1
2022-08-24 12:47:42 +02:00
Adriane Boyd
c44d243f25
Merge remote-tracking branch 'upstream/master' into chore/update-v4-from-master
2022-08-24 07:15:41 +02:00
Tobius Saul
c09d2fa25b
luganda language extension ( #10847 )
...
* luganda language extension
* __init__.py changes
* New enhancements
* Lexical attribute changed
* punctuation and sentence additions
* Remove comment header
* Fix typos, reformat
* reformatted version
* Add tokenizer test
* Remove contractions from stop words
* Format
* Add Luganda to website
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-08-23 13:09:36 +02:00
Tal Zussman
7e75327893
Fix menu order in linguistic-features.md ( #11364 )
...
Swap 'Vectors & Similarity' and 'Mappings & Exceptions' in menu to match order in body
2022-08-23 14:40:38 +09:00
Adriane Boyd
bb0e178878
Make Span/Doc.ents more consistent for ent_kb_id and ent_id ( #11328 )
...
* Map `Span.id` to `Token.ent_id` in all cases when setting `Doc.ents`
* Reset `Token.ent_id` and `Token.ent_kb_id` when setting `Doc.ents`
* Make `Span.ent_id` an alias of `Span.id` rather than a read-only view
of the root token's `ent_id` annotation
2022-08-22 20:28:57 +02:00
Adriane Boyd
04c6e5cb95
Improve floret vectors display in pipeline docs ( #11343 )
2022-08-22 11:28:13 +02:00
Adriane Boyd
5fa8f4faca
Switch ru and uk lemmatizers to pymorphy3 ( #11345 )
...
* Switch ru and uk lemmatizers to pymorphy3
* Switch to pymorphy3 in tests
2022-08-22 11:27:14 +02:00
Adriane Boyd
09b3118b26
Add uk pipelines to website ( #11332 )
2022-08-18 14:04:57 +02:00
Peter Baumgartner
db7b9938a4
Docs: displaCy documentation - data types, parse_{deps,ents,spans}, spans example ( #10950 )
...
* add in spans example and parse references
* rm autoformatter
* rm extra ents copy
* TypedDict draft
* type fixes
* restore non-documentation files
* docs update
* fix spans example
* fix hyperlinks
* add parse example
* example fix + argument fix
* fix api arg in docs
* fix bad variable replacement
* fix spacing in style
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* fix spacing on table
* fix spacing on table
* rm temp files
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-08-16 11:23:34 -04:00
Sofie Van Landeghem
5d54c0e32a
Rename modules for consistency ( #11286 )
...
* rename Python module to entity_ruler
* rename Python module to attribute_ruler
2022-08-10 11:44:05 +02:00
stefawolf
23749cfc91
adding spans to doc_annotation in Example.to_dict ( #11261 )
...
* adding spans to doc_annotation in Example.to_dict
* to_dict compatible with from_dict: tuples instead of spans
* use strings for label and kb_id
* Simplify test
* Update data formats docs
Co-authored-by: Stefanie Wolf <stefanie.wolf@vitecsoftware.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-08-05 12:26:38 +02:00
Jules Belveze
cd09614ab2
chore: add 'concepCy' to spacy universe ( #11255 )
...
* chore: add 'concepCy' to spacy universe
* docs: add 'slogan' to concepCy
2022-08-04 15:42:38 +09:00
Lj Miranda
d993df41e5
Update docs for pipeline initialize() methods ( #11221 )
...
* Update documentation for dependency parser
* Update documentation for trainable_lemmatizer
* Update documentation for entity_linker
* Update documentation for ner
* Update documentation for morphologizer
* Update documentation for senter
* Update documentation for spancat
* Update documentation for tagger
* Update documentation for textcat
* Update documentation for tok2vec
* Run prettier on edited files
* Apply similar changes in transformer docs
* Remove need to say annotated example explicitly
I removed the need to say "Must contain at least one annotated Example"
because it's often a given that Examples will contain some gold-standard
annotation.
* Run prettier on transformer docs
2022-08-03 16:53:02 +02:00
Adriane Boyd
d0578c2ede
Add scorer to textcat API docs config settings ( #11263 )
2022-08-03 16:41:20 +02:00
Daniël de Kok
1ff683a50b
Merge remote-tracking branch 'upstream/master' into merge-master-v4-20220728
2022-07-28 13:53:59 +02:00
ninjalu
95a1b8aca6
add additional REL_OP ( #10371 )
...
* add additional REL_OP
* change to condition and new rel_op symbols
* add operators to docs
* add the anchor while we're in here
* add tests
Co-authored-by: Peter Baumgartner <5107405+pmbaumgartner@users.noreply.github.com>
2022-07-27 13:16:44 +02:00
Paul O'Leary McCann
1c12812d1a
Replace link to old label ( #11188 )
2022-07-25 16:39:34 +09:00
Adriane Boyd
7a99fe3c65
Move sent-patterns to correct section of universe.json ( #11192 )
2022-07-25 09:14:50 +02:00
0xpeIpeI
93960dc4b5
[universe project] create English interpretation project ( #11184 )
...
* [add] my universe project setting
* [modify] A few adjustments
* [Modify] change package description
2022-07-24 19:01:04 +09:00
Dan Radenkovic
a5aa3a818f
fix docs ( #11123 )
2022-07-24 17:16:36 +09:00
Lucas Terriel
7ff52c02a1
Update meta for spacyfishing in spaCy Universe ( #11185 )
...
* add new logo for spacyfishing to update spacy universe
* change logo location
2022-07-24 17:10:29 +09:00
Maarten Grootendorst
1caa2d1d16
Added BERTopic to Spacy Universe ( #11159 )
...
* Added BERTopic to Spacy Universe
* Fix no render of visualization
2022-07-19 19:37:18 +09:00
Madeesh Kannan
ba18d2913d
Morphology/Morphologizer optimizations and refactoring ( #11024 )
...
* `Morphology`: Refactor to use C types, reduce allocations, remove unused code
* `Morphologizer`: Avoid unnecessary sorting of morpho features
* `Morphologizer`: Remove excessive reallocations of labels, improve hash lookups of labels, coerce `numpy` numeric types to native ints
Update docs
* Remove unused method
* Replace `unique_ptr` usage with `shared_ptr`
* Add type annotations to internal Python methods, rename `hash` variable, fix typos
* Add comment to clarify implementation detail
* Fix return type
* `Morphology`: Stop early when splitting fields and values
2022-07-15 11:14:08 +02:00
Adriane Boyd
2235e3520c
Update binder version in docs ( #11124 )
2022-07-12 15:20:33 +02:00
Adriane Boyd
11f859c132
Docs for v3.4 ( #11057 )
...
* Add draft of v3.4 usage
* Add Croatian models
* Add Matcher min/max
* Update release notes
* Minor edits
* Add updates, tables
* Update pydantic/mypy versions
* Update version in README
* Fix sidebar
2022-07-11 15:36:31 +02:00
Adriane Boyd
3701039c1f
Tweak build jobs setting, update install docs ( #11077 )
...
* Restrict SPACY_NUM_BUILD_JOBS to only override if set
* Update install docs
2022-07-08 19:21:17 +02:00
Richard Hudson
dc38a0f079
Change demo URL ( #11102 )
2022-07-08 19:19:48 +02:00
Adriane Boyd
be9e17c0e4
Add docs for compiling with build constraints ( #11081 )
2022-07-08 11:45:56 +02:00
Nipun Sadvilkar
bb3e11b9a1
Github Action for spaCy universe project alert ( #11090 )
2022-07-07 17:50:30 +05:30
Kenneth Enevoldsen
7b220afc29
Added asent to spacy universe ( #11078 )
...
* Added asent to spacy universe
* Update addition of asent following correction
2022-07-07 13:25:25 +09:00
Schero1994
c7c3fb1d0c
Merge pull request #11074 from Schero1994/feature/remove
...
Batch #2 | spaCy universe cleanup
2022-07-06 10:39:04 +02:00
Raphael Mitsch
e9eb59699f
NEL confidence threshold ( #11016 )
...
* Add base for NEL abstention threshold mechanism.
* Add abstention threshold to entity linker. Add test.
* Fix entity linking tests.
* Changed abstention default threshold from 0 to None.
* Fix default values for abstention thresholds.
* Fix mypy errors.
* Replace assertion with raise of proper error code.
* Simplify threshold check. Remove thresholding from EntityLinker_v1.
* Rename test.
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Make E1043 configurable.
* Update docs.
* Rephrase description in docs. Adjusting error code message.
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-07-04 17:05:21 +02:00
schaeran
b3165db41b
remove universe object: spacy-langdetect
2022-07-04 16:07:18 +02:00
schaeran
4e8a5994df
remove universe object: NLPre
2022-07-04 16:06:58 +02:00
schaeran
0e4a835468
remove universe object: num_fh
2022-07-04 16:06:38 +02:00
schaeran
5000a08a20
remove universe object: adam_qas
2022-07-04 16:06:20 +02:00
schaeran
60a35a2bb2
remove universe object: spacy_kenlm
2022-07-04 16:06:02 +02:00
schaeran
224f30c563
remove universe object: spacy-raspberry
2022-07-04 16:05:34 +02:00
schaeran
a9062ebf17
remove universe object: spacy-lookup
2022-07-04 16:05:11 +02:00
schaeran
9b823fc9e9
remove universe object: NeuroNER
2022-07-04 16:04:50 +02:00
schaeran
b94bcaa62f
remove universe object: spacy-vis
2022-07-04 16:04:29 +02:00
schaeran
880e7db44e
remove universe object: spacy_grammar
2022-07-04 16:04:06 +02:00
schaeran
6c036d1e25
remove universe object: spacy_hunspell
2022-07-04 16:03:30 +02:00
Paul O'Leary McCann
e8fdbfc65e
Minor fix in Lemmatizer docs
2022-07-01 14:28:03 +09:00
Adriane Boyd
3bc1fe0a78
Update cupy extras ( #11055 )
...
* Add cuda116 and cuda117 extras
* Revert "remove `cuda116` extra from install widget (#11012)"
This reverts commit e7b498fb1f.
* Add cuda117 to quickstart
2022-06-30 11:24:37 +02:00
Shen Qin
be00db6645
Addition of min_max quantifier in matcher {n,m} ( #10981 )
...
* Min_max_operators
1. Modified API and Usage for spaCy website to include min_max operator
2. Modified matcher.pyx to include min_max function {n,m} and its variants
3. Modified schemas.py to include min_max validation error
4. Added test cases to test_matcher_api.py, test_matcher_logic.py and test_pattern_validation.py
* attempt to fix mypy/pydantic compat issue
* formatting
* Update spacy/tests/matcher/test_pattern_validation.py
Co-authored-by: Source-Shen <82353723+Source-Shen@users.noreply.github.com>
Co-authored-by: svlandeg <svlandeg@github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-30 11:01:58 +02:00
Eric Holscher
308a612ec9
Remove simply ( #11017 )
...
I was reading this page, and as a relative beginner, nothing about it was simple :)
2022-06-27 09:45:22 +02:00
Dmytro Sadovnychyi
4cd8b4cc22
Fix some of the broken links on universe pages ( #11011 )
...
Currently some of the "AUTHOR INFO" links (e.g. here[0]) are broken:
```
https://github.com/https://github.com/explosion
```
[0] https://spacy.io/universe/project/spacy-experimental
Also one remains broken with `https://szegedai.github.io/`.
2022-06-23 17:53:00 +02:00
Adriane Boyd
f1197d9175
Add API docs for token attribute symbols ( #10836 )
...
* Add API docs for token attribute symbols
* Remove NBSP's
* Fix typo
* Rephrase
Co-authored-by: svlandeg <svlandeg@github.com>
2022-06-23 08:16:38 +02:00