Mirror of https://github.com/explosion/spaCy.git, synced 2025-10-26 13:41:21 +03:00

7b36e7c9ec

3 Commits

| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  | d1474fdd91 | add explanation about overwriting behaviour (#12464) * add explanation about overwriting behaviour * Update website/docs/api/spancategorizer.mdx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/spancategorizer.mdx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/spancategorizer.mdx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * format --------- Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> | ||
|  | 913d74f509 | Add spancat_singlelabel pipeline for multiclass and non-overlapping span labelling tasks (#11365) * [wip] Update
* [wip] Update
* Add initial port
* [wip] Update
* Fix all imports
* Add spancat_exclusive to pipeline
* [WIP] Update
* [ci skip] Add breakpoint for debugging
* Use spacy.SpanCategorizer.v1 as default archi
* Update spacy/pipeline/spancat_exclusive.py
Co-authored-by: kadarakos <kadar.akos@gmail.com>
* [ci skip] Small updates
* Use Softmax v2 directly from thinc
* Cache the label map
* Fix mypy errors
However, I ignored line 370 because it opened up a bunch of type errors
that might be trickier to solve and might lead to a more complicated
codebase.
* avoid multiplication with 1.0
Co-authored-by: kadarakos <kadar.akos@gmail.com>
* Update spacy/pipeline/spancat_exclusive.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update component versions to v2
* Add scorer to docstring
* Add _n_labels property to SpanCategorizer
Instead of using len(self.labels) in initialize() I am using a private
property self._n_labels. This achieves implementation parity and allows
me to delete the whole initialize() method for spancat_exclusive (since
it's now the same with spancat).
* Inherit from SpanCat instead of TrainablePipe
This commit changes the inheritance structure of Exclusive_Spancat:
it now inherits from SpanCategorizer rather than TrainablePipe. This
allows me to remove duplicate methods that are already present in
the parent class.
* Revert documentation link to spancat
* Fix init call for exclusive spancat
* Update spacy/pipeline/spancat_exclusive.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Import Suggester from spancat
* Include zero_init.v1 for spancat
* Implement _allow_extra_label to use _n_labels
To ensure that spancat / spancat_exclusive cannot be resized after
initialization, I inherited the _allow_extra_label() method from
spacy/pipeline/trainable_pipe.pyx and used self._n_labels instead
of len(self.labels) for checking.
I think that changing it locally is a better solution rather than
forcing each class that inherits TrainablePipe to use the self._n_labels
attribute.
Also note that I turned-off black formatting in this block of code
because it reads better without the overhang.
* Extend existing tests to spancat_exclusive
In this commit, I extended the existing tests for spancat to include
spancat_exclusive. I parametrized the test functions with 'name'
(similar var name with textcat and textcat_multilabel) for each
applicable test.
TODO: Add overfitting tests for spancat_exclusive
* Update documentation for spancat
* Turn on formatting for allow_extra_label
* Remove initializers in default config
* Use DEFAULT_EXCL_SPANCAT_MODEL
I also renamed spancat_exclusive_default_config into
spancat_excl_default_config because black does some not pretty
formatting changes.
* Update documentation
Update grammar and usage
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Clarify docstring for Exclusive_SpanCategorizer
* Remove mypy ignore and typecast labels to list
* Fix documentation API
* Use a single variable for tests
* Update defaults for number of rows
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Put back initializers in spancat config
Whenever I remove model.scorer.init_w and model.scorer.init_b,
I encounter an error in the test:
    SystemError: <method '__getitem__' of 'dict' objects> returned a result
    with an error set.
My Thinc version is 8.1.5, but I can't seem to check what's causing the
error.
* Update spancat_exclusive docstring
* Remove init_W and init_B parameters
This commit is expected to fail until the new Thinc release.
* Require thinc>=8.1.6 for serializable Softmax defaults
* Handle zero suggestions to make tests pass
I'm not sure if this is the most elegant solution. But what should
happen is that the _make_span_group function MUST return an empty
SpanGroup if there are no suggestions.
The error happens when the 'scores' variable is empty. We cannot
get the 'predicted' and other downstream vars.
* Better approach for handling zero suggestions
* Update website/docs/api/spancategorizer.md
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spancategorizer headers
* Apply suggestions from code review
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Add default value in negative_weight in docs
* Add default value in allow_overlap in docs
* Update how spancat_exclusive is constructed
In this commit, I added the following:
- Put the default values of negative_weight and allow_overlap
    in the default_config dictionary.
- Rename make_spancat -> make_exclusive_spancat
* Run prettier on spancategorizer.mdx
* Change exactly one -> at most one
* Add suggester documentation in Exclusive_SpanCategorizer
* Add suggester to spancat docstrings
* merge multilabel and singlelabel spancat
* rename spancat_exclusive to singlelabel
* wire up different make_spangroups for single and multilabel
* black
* black
* add docstrings
* more docstring and fix negative_label
* don't rely on default arguments
* black
* remove spancat exclusive
* replace single_label with add_negative_label and adjust inference
* mypy
* logical bug in configuration check
* add spans.attrs[scores]
* single label make_spangroup test
* bugfix
* black
* tests for make_span_group with negative labels
* refactor make_span_group
* black
* Update spacy/tests/pipeline/test_spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* remove duplicate declaration
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* raise error instead of just print
* make label mapper private
* update docs
* run prettier
* Update website/docs/api/spancategorizer.mdx
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update website/docs/api/spancategorizer.mdx
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* don't keep recomputing self._label_map for each span
* typo in docs
* Intervals to private and document 'name' param
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/pipeline/spancat.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* add Tag to new features
* replace tags
* revert
* revert
* revert
* revert
* Update website/docs/api/spancategorizer.mdx
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update website/docs/api/spancategorizer.mdx
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* prettier
* Fix merge
* Update website/docs/api/spancategorizer.mdx
* remove references to 'single_label'
* remove old paragraph
* Add spancat_singlelabel to config template
* Format
* Extend init config tests
---------
Co-authored-by: kadarakos <kadar.akos@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> | ||
|  | 554df9ef20 | Website migration from Gatsby to Next (#12058) * Rename all MDX files to `.mdx`
* Lock current node version (#11885)
* Apply Prettier (#11996)
* Minor website fixes (#11974) [ci skip]
* fix table
* Migrate to Next WEB-17 (#12005)
* Initial commit
* Run `npx create-next-app@13 next-blog`
* Install MDX packages
Following:  |