Sofie Van Landeghem
5d54c0e32a
Rename modules for consistency ( #11286 )
...
* rename Python module to entity_ruler
* rename Python module to attribute_ruler
2022-08-10 11:44:05 +02:00
Daniël de Kok
1ff683a50b
Merge remote-tracking branch 'upstream/master' into merge-master-v4-20220728
2022-07-28 13:53:59 +02:00
ninjalu
95a1b8aca6
add additional REL_OP ( #10371 )
...
* add additional REL_OP
* change to condition and new rel_op symbols
* add operators to docs
* add the anchor while we're in here
* add tests
Co-authored-by: Peter Baumgartner <5107405+pmbaumgartner@users.noreply.github.com>
2022-07-27 13:16:44 +02:00
Paul O'Leary McCann
1c12812d1a
Replace link to old label ( #11188 )
2022-07-25 16:39:34 +09:00
Adriane Boyd
7a99fe3c65
Move sent-patterns to correct section of universe.json ( #11192 )
2022-07-25 09:14:50 +02:00
0xpeIpeI
93960dc4b5
[universe project] create English interpretation project ( #11184 )
...
* [add] my universe project setting
* [modify] A few adjustments
* [Modify] change package description
2022-07-24 19:01:04 +09:00
Dan Radenkovic
a5aa3a818f
fix docs ( #11123 )
2022-07-24 17:16:36 +09:00
Lucas Terriel
7ff52c02a1
Update meta for spacyfishing in spaCy Universe ( #11185 )
...
* add new logo for spacyfishing to update spacy universe
* change logo location
2022-07-24 17:10:29 +09:00
Maarten Grootendorst
1caa2d1d16
Added BERTopic to Spacy Universe ( #11159 )
...
* Added BERTopic to Spacy Universe
* Fix no render of visualization
2022-07-19 19:37:18 +09:00
Madeesh Kannan
ba18d2913d
`Morphology`/`Morphologizer` optimizations and refactoring (#11024)
...
* `Morphology`: Refactor to use C types, reduce allocations, remove unused code
* `Morphologizer`: Avoid unnecessary sorting of morpho features
* `Morphologizer`: Remove excessive reallocations of labels, improve hash lookups of labels, coerce `numpy` numeric types to native ints
Update docs
* Remove unused method
* Replace `unique_ptr` usage with `shared_ptr`
* Add type annotations to internal Python methods, rename `hash` variable, fix typos
* Add comment to clarify implementation detail
* Fix return type
* `Morphology`: Stop early when splitting fields and values
2022-07-15 11:14:08 +02:00
Adriane Boyd
2235e3520c
Update binder version in docs ( #11124 )
2022-07-12 15:20:33 +02:00
Adriane Boyd
11f859c132
Docs for v3.4 ( #11057 )
...
* Add draft of v3.4 usage
* Add Croatian models
* Add Matcher min/max
* Update release notes
* Minor edits
* Add updates, tables
* Update pydantic/mypy versions
* Update version in README
* Fix sidebar
2022-07-11 15:36:31 +02:00
Adriane Boyd
3701039c1f
Tweak build jobs setting, update install docs ( #11077 )
...
* Restrict SPACY_NUM_BUILD_JOBS to only override if set
* Update install docs
2022-07-08 19:21:17 +02:00
Richard Hudson
dc38a0f079
Change demo URL ( #11102 )
2022-07-08 19:19:48 +02:00
Adriane Boyd
be9e17c0e4
Add docs for compiling with build constraints ( #11081 )
2022-07-08 11:45:56 +02:00
Nipun Sadvilkar
bb3e11b9a1
Github Action for spaCy universe project alert ( #11090 )
2022-07-07 17:50:30 +05:30
Kenneth Enevoldsen
7b220afc29
Added asent to spacy universe ( #11078 )
...
* Added asent to spacy universe
* Update addition of asent following correction
2022-07-07 13:25:25 +09:00
Schero1994
c7c3fb1d0c
Merge pull request #11074 from Schero1994/feature/remove
...
Batch #2 | spaCy universe cleanup
2022-07-06 10:39:04 +02:00
Raphael Mitsch
e9eb59699f
NEL confidence threshold ( #11016 )
...
* Add base for NEL abstention threshold mechanism.
* Add abstention threshold to entity linker. Add test.
* Fix entity linking tests.
* Changed abstention default threshold from 0 to None.
* Fix default values for abstention thresholds.
* Fix mypy errors.
* Replace assertion with raise of proper error code.
* Simplify threshold check. Remove thresholding from EntityLinker_v1.
* Rename test.
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Make E1043 configurable.
* Update docs.
* Rephrase description in docs. Adjusting error code message.
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-07-04 17:05:21 +02:00
schaeran
b3165db41b
remove universe object: spacy-langdetect
2022-07-04 16:07:18 +02:00
schaeran
4e8a5994df
remove universe object: NLPre
2022-07-04 16:06:58 +02:00
schaeran
0e4a835468
remove universe object: num_fh
2022-07-04 16:06:38 +02:00
schaeran
5000a08a20
remove universe object: adam_qas
2022-07-04 16:06:20 +02:00
schaeran
60a35a2bb2
remove universe object: spacy_kenlm
2022-07-04 16:06:02 +02:00
schaeran
224f30c563
remove universe object: spacy-raspberry
2022-07-04 16:05:34 +02:00
schaeran
a9062ebf17
remove universe object: spacy-lookup
2022-07-04 16:05:11 +02:00
schaeran
9b823fc9e9
remove universe object: NeuroNER
2022-07-04 16:04:50 +02:00
schaeran
b94bcaa62f
remove universe object: spacy-vis
2022-07-04 16:04:29 +02:00
schaeran
880e7db44e
remove universe object: spacy_grammar
2022-07-04 16:04:06 +02:00
schaeran
6c036d1e25
remove universe object: spacy_hunspell
2022-07-04 16:03:30 +02:00
Paul O'Leary McCann
e8fdbfc65e
Minor fix in Lemmatizer docs
2022-07-01 14:28:03 +09:00
Adriane Boyd
3bc1fe0a78
Update cupy extras ( #11055 )
...
* Add cuda116 and cuda117 extras
* Revert "remove `cuda116` extra from install widget (#11012)"
This reverts commit e7b498fb1f.
* Add cuda117 to quickstart
2022-06-30 11:24:37 +02:00
Shen Qin
be00db6645
Addition of min_max quantifier in matcher {n,m} ( #10981 )
...
* Min_max_operators
1. Modified API and Usage for spaCy website to include min_max operator
2. Modified matcher.pyx to include min_max function {n,m} and its variants
3. Modified schemas.py to include min_max validation error
4. Added test cases to test_matcher_api.py, test_matcher_logic.py and test_pattern_validation.py
* attempt to fix mypy/pydantic compat issue
* formatting
* Update spacy/tests/matcher/test_pattern_validation.py
Co-authored-by: Source-Shen <82353723+Source-Shen@users.noreply.github.com>
Co-authored-by: svlandeg <svlandeg@github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-30 11:01:58 +02:00
Eric Holscher
308a612ec9
Remove `simply` (#11017)
...
I was reading this page, and as a relative beginner, nothing about it was simple :)
2022-06-27 09:45:22 +02:00
Dmytro Sadovnychyi
4cd8b4cc22
Fix some of the broken links on universe pages ( #11011 )
...
Currently some of the "AUTHOR INFO" links (e.g. here[0]) are broken:
```
https://github.com/https://github.com/explosion
```
[0] https://spacy.io/universe/project/spacy-experimental
Also one remains broken with `https://szegedai.github.io/`.
2022-06-23 17:53:00 +02:00
Adriane Boyd
f1197d9175
Add API docs for token attribute symbols ( #10836 )
...
* Add API docs for token attribute symbols
* Remove NBSP's
* Fix typo
* Rephrase
Co-authored-by: svlandeg <svlandeg@github.com>
2022-06-23 08:16:38 +02:00
Peter Baumgartner
3335bb9d0c
remove `cuda116` extra from install widget (#11012)
2022-06-23 08:15:28 +02:00
jademlc
bed23ff291
Update serialization methods code block ( #11004 )
...
* Update serialization methods code block
* Update website/docs/usage/saving-loading.md
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-22 20:45:26 +02:00
Sofie Van Landeghem
0fa004c4cd
the 'new' indicator wants a 'number' ( #10997 )
2022-06-21 22:01:16 +02:00
Philip Vollet
1ae13b2a70
Merge pull request #10991 from Lucaterre/master
...
updated spacy universe for spacyfishing
2022-06-21 10:33:26 +02:00
Victoria
a08ca064e5
Update linguistic-features.md ( #10993 )
...
Change link for downloading fasttext word vectors
2022-06-21 15:03:41 +09:00
Lucaterre
2820d7dd8d
correct typo in universe.json for 'code_example' key: pipe name 'entityfishing'
2022-06-20 15:26:23 +02:00
Lucaterre
cdad815c68
updated spacy universe for spacyfishing
2022-06-20 14:28:49 +02:00
Raphael Mitsch
4c058eb40a
`enable` argument for spacy.load() (#10784)
...
* Enable flag on spacy.load: foundation for include, enable arguments.
* Enable flag on spacy.load: fixed tests.
* Enable flag on spacy.load: switched from pretrained model to empty model with added pipes for tests.
* Enable flag on spacy.load: switched to more consistent error on misspecification of component activity. Test refactoring. Added to default config.
* Enable flag on spacy.load: added support for fields not in pipeline.
* Enable flag on spacy.load: removed serialization fields from supported fields.
* Enable flag on spacy.load: removed 'enable' from config again.
* Enable flag on spacy.load: relaxed checks in _resolve_component_activation_status() to allow non-standard pipes.
* Enable flag on spacy.load: fixed relaxed checks for _resolve_component_activation_status() to allow non-standard pipes. Extended tests.
* Enable flag on spacy.load: comments w.r.t. resolution workarounds.
* Enable flag on spacy.load: remove include fields. Update website docs.
* Enable flag on spacy.load: updates w.r.t. changes in master.
* Implement Doc.from_json(): update docstrings.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): remove newline.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): change error message for E1038.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Enable flag on spacy.load: wrapped docstring for _resolve_component_status() at 80 chars.
* Enable flag on spacy.load: changed examples for enable flag.
* Remove newline.
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix docstring for Language._resolve_component_status().
* Rename E1038 to E1042.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-06-17 20:24:13 +01:00
Gor Arakelyan
605f84938b
Add "Aim-spaCy" to spaCy Universe ( #10943 )
...
* Add Aim-spaCy to spaCy universe
* Update Aim thumbnail
* Fix author links
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-06-10 18:33:17 +09:00
Paul O'Leary McCann
d176afd32f
Add note about multiple patterns ( #10826 )
...
* Add note about multiple patterns
* Move note to the top of method docs
* Remove EntityRuler note
2022-06-08 16:24:14 +02:00
Sofie Van Landeghem
763dcbf885
Fix version in SpanRuler docs ( #10925 )
...
* SpanRuler is new since 3.3.1
* update SpanRuler version since 3.3.1
2022-06-08 14:45:04 +02:00
Ilya Nikitin
c323789721
`token.md`: Fix documentation of `Token.ancestors` (#10917)
2022-06-06 14:32:36 +09:00
vincent d warmerdam
e7d2b26966
Add spacy-report to universe ( #10910 )
...
* Add spacy-report to universe
* Remove extra comma
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-06-05 18:57:58 +09:00
Raphael Mitsch
8387ce4c01
Add Doc.from_json() ( #10688 )
...
* Implement Doc.from_json: rough draft.
* Implement Doc.from_json: first draft with tests.
* Implement Doc.from_json: added documentation on website for Doc.to_json(), Doc.from_json().
* Implement Doc.from_json: formatting changes.
* Implement Doc.to_json(): reverting unrelated formatting changes.
* Implement Doc.to_json(): fixing entity and span conversion. Moving fixture and doc <-> json conversion tests into single file.
* Implement Doc.from_json(): replaced entity/span converters with doc.char_span() calls.
* Implement Doc.from_json(): handling sentence boundaries in spans.
* Implementing Doc.from_json(): added parser-free sentence boundaries transfer.
* Implementing Doc.from_json(): added parser-free sentence boundaries transfer.
* Implementing Doc.from_json(): incorporated various PR feedback.
* Renaming fixture for document without dependencies.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implementing Doc.from_json(): using two sent_starts instead of one.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implementing Doc.from_json(): doc_without_dependency_parser() -> doc_without_deps.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implementing Doc.from_json(): incorporating various PR feedback. Rebased on latest master.
* Implementing Doc.from_json(): refactored Doc.from_json() to work with annotation IDs instead of their string representations.
* Implement Doc.from_json(): reverting unwanted formatting/rebasing changes.
* Implement Doc.from_json(): added check for char_span() calculation for entities.
* Update spacy/tokens/doc.pyx
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): minor refactoring, additional check for token attribute consistency with corresponding test.
* Implement Doc.from_json(): removed redundancy in annotation type key naming.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): Simplifying setting annotation values.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement doc.from_json(): renaming annot_types to token_attrs.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): adjustments for renaming of annot_types to token_attrs.
* Implement Doc.from_json(): removing default categories.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): simplifying lexeme initialization.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): simplifying lexeme initialization.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): refactoring to only have keys for present annotations.
* Implement Doc.from_json(): fix check for tokens' HEAD attributes.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): refactoring Doc.from_json().
* Implement Doc.from_json(): fixing span_group retrieval.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): fixing span retrieval.
* Implement Doc.from_json(): added schema for Doc JSON format. Minor refactoring in Doc.from_json().
* Implement Doc.from_json(): added comment regarding Token and Span extension support.
* Implement Doc.from_json(): renaming inconsistent_props to partial_attrs.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): adjusting error message.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): extending E1038 message.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): added params to E1038 raises.
* Implement Doc.from_json(): combined attribute collection with partial attributes check.
* Implement Doc.from_json(): added optional schema validation.
* Implement Doc.from_json(): fixed optional fields in schema, tests.
* Implement Doc.from_json(): removed redundant None check for DEP.
* Implement Doc.from_json(): added passing of schema validation message to E1037.
* Implement Doc.from_json(): removing redundant error E1040.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): changing message for E1037.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): adjusted website docs and docstring of Doc.from_json().
* Update spacy/tests/doc/test_json_doc_conversion.py
* Implement Doc.from_json(): docstring update.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): docstring update.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): website docs update.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): docstring formatting.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): docstring formatting.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): fixing Doc reference in website docs.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): reformatted website/docs/api/doc.md.
* Implement Doc.from_json(): bumped IDs of new errors to avoid merge conflicts.
* Implement Doc.from_json(): fixing bug in tests.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Implement Doc.from_json(): fix setting of sentence starts for docs without DEP.
* Implement Doc.from_json(): add check for valid char spans when manually setting sentence boundaries. Refactor sentence boundary setting slightly. Move error message for lack of support for partial token annotations to errors.py.
* Implement Doc.from_json(): simplify token sentence start manipulation.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Combine related error messages
* Update spacy/tests/doc/test_json_doc_conversion.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-02 14:03:47 +02:00