spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-01-08 09:41:11 +03:00

Author	SHA1	Message	Date
Lj Miranda	9a35b24b48	Implement _allow_extra_label to use _n_labels To ensure that spancat / spancat_exclusive cannot be resized after initialization, I inherited the _allow_extra_label() method from spacy/pipeline/trainable_pipe.pyx and used self._n_labels instead of len(self.labels) for checking. I think that changing it locally is a better solution than forcing each class that inherits TrainablePipe to use the self._n_labels attribute. Also note that I turned off black formatting in this block of code because it reads better without the overhang.	2022-11-18 13:48:18 +08:00
Lj Miranda	c9036a6d79	Include zero_init.v1 for spancat	2022-11-18 13:16:33 +08:00
Lj Miranda	e23034365a	Import Suggester from spancat	2022-11-18 12:34:44 +08:00
Lj Miranda	b667ab56a0	Update spacy/pipeline/spancat_exclusive.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-11-18 12:31:09 +08:00
Lj Miranda	7ac46058a2	Fix init call for exclusive spancat	2022-11-02 13:05:56 +08:00
Lj Miranda	7021dbaff3	Revert documentation link to spancat	2022-11-02 12:43:26 +08:00
Lj Miranda	8548e2c311	Inherit from SpanCat instead of TrainablePipe This commit changes the inheritance structure of Exclusive_Spancat: it now inherits from SpanCategorizer rather than TrainablePipe. This allows me to remove duplicate methods that are already present in the parent class.	2022-11-02 12:30:41 +08:00
Lj Miranda	bdf2a1d1fe	Add _n_labels property to SpanCategorizer Instead of using len(self.labels) in initialize() I am using a private property self._n_labels. This achieves implementation parity and allows me to delete the whole initialize() method for spancat_exclusive (since it's now the same as in spancat).	2022-11-02 12:27:54 +08:00
Lj Miranda	023a1a6c04	Add scorer to docstring	2022-11-02 12:10:49 +08:00
Lj Miranda	60a8df7c5f	Merge branch 'add/exclusive-spancat' of github.com:ljvmiranda921/spaCy into add/exclusive-spancat	2022-10-26 11:09:03 +08:00
Lj Miranda	1533a4ef5a	Update component versions to v2	2022-10-26 11:08:49 +08:00
Lj Miranda	1b1afd2251	Update spacy/pipeline/spancat_exclusive.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-10-26 11:07:57 +08:00
Sofie Van Landeghem	95c5bfcc78	avoid multiplication with 1.0 Co-authored-by: kadarakos <kadar.akos@gmail.com>	2022-10-03 17:05:55 +02:00
Lj Miranda	2b7eb85e36	Fix mypy errors However, I ignored line 370 because it opened up a bunch of type errors that might be trickier to solve and might lead to a more complicated codebase.	2022-09-05 15:42:34 +08:00
Lj Miranda	dbfb3a7739	Cache the label map	2022-09-05 14:34:49 +08:00
Lj Miranda	2bbab641e9	Use Softmax v2 directly from thinc	2022-09-05 11:28:30 +08:00
Lj Miranda	43bf05275f	[ci skip] Small updates	2022-08-25 16:26:03 +08:00
Lj Miranda	b728eaae18	Update spacy/pipeline/spancat_exclusive.py Co-authored-by: kadarakos <kadar.akos@gmail.com>	2022-08-25 16:08:15 +08:00
Lj Miranda	826c1d3ca3	Use spacy.SpanCategorizer.v1 as default archi	2022-08-25 13:31:36 +08:00
Lj Miranda	d6e56b62b9	[ci skip] Add breakpoint for debugging	2022-08-25 13:23:15 +08:00
Lj Miranda	5452e71b05	[WIP] Update	2022-08-25 13:08:37 +08:00
Lj Miranda	3d07c05cba	Add spancat_exclusive to pipeline	2022-08-25 12:40:48 +08:00
Lj Miranda	527a1818e5	Fix all imports	2022-08-25 11:24:37 +08:00
Lj Miranda	1db65b8e78	[wip] Update	2022-08-24 17:54:34 +08:00
Lj Miranda	6f08d83731	Add initial port	2022-08-24 16:47:56 +08:00
Lj Miranda	e7e845b5ed	[wip] Update	2022-08-24 11:35:26 +08:00
Lj Miranda	176ef9840e	[wip] Update	2022-08-24 11:20:22 +08:00
Edward	5afa98aabf	Support custom attributes for tokens and spans in json conversion (#11125) * Add token and span custom attributes to to_json() * Change logic for to_json * Add functionality to from_json * Small adjustments * Move token/span attributes to new dict key * Fix test * Fix the same test but much better * Add backwards compatibility tests and adjust logic * Add test to check if attributes not set in underscore are not saved in the json * Add tests for json compatibility * Adjust test names * Fix tests and clean up code * Fix assert json tests * small adjustment * adjust naming and code readability * Adjust naming, added more tests and changed logic * Fix typo * Adjust errors, naming, and small test optimization * Fix byte tests * Fix bytes tests * Change naming and json structure * update schema * Update spacy/schemas.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/tokens/doc.pyx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/tokens/doc.pyx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/schemas.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update schema for underscore attributes * Adjust underscore schema * adjust schema tests Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-08-23 10:05:02 +02:00
Tal Zussman	7e75327893	Fix menu order in linguistic-features.md (#11364) Swap 'Vectors & Similarity' and 'Mappings & Exceptions' in menu to match order in body	2022-08-23 14:40:38 +09:00
Sofie Van Landeghem	6e20842370	dev docs: numeric comparators (#11334) * add section on numeric comparators * edit * prettier * Update extra/DEVELOPER_DOCS/Code Conventions.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * note on typing imports Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-08-22 15:52:53 +02:00
Adriane Boyd	f55bb7470d	Clean up warnings in the test suite (#11331)	2022-08-22 12:04:30 +02:00
Paul O'Leary McCann	0f07defe2c	Remove reference to voting on issue (#11335) Not clear which issue this refers to, we don't suggest this for any other issues, and we don't use votes in general.	2022-08-22 11:29:05 +02:00
Adriane Boyd	04c6e5cb95	Improve floret vectors display in pipeline docs (#11343)	2022-08-22 11:28:13 +02:00
Adriane Boyd	3e4cf1bbe1	Check for . in factory names (#11336)	2022-08-19 09:52:12 +02:00
Adriane Boyd	09b3118b26	Add uk pipelines to website (#11332)	2022-08-18 14:04:57 +02:00
Sofie Van Landeghem	cab263791f	include span_ruler for default warning filter (#11333)	2022-08-17 19:55:54 +02:00
Peter Baumgartner	db7b9938a4	Docs: displaCy documentation - data types, `parse_{deps,ents,spans}`, spans example (#10950) * add in spans example and parse references * rm autoformatter * rm extra ents copy * TypedDict draft * type fixes * restore non-documentation files * docs update * fix spans example * fix hyperlinks * add parse example * example fix + argument fix * fix api arg in docs * fix bad variable replacement * fix spacing in style Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * fix spacing on table * fix spacing on table * rm temp files Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-08-16 11:23:34 -04:00
Adriane Boyd	ed4ad309e6	Fix Dutch noun chunks to skip overlapping spans (#11275) * Add test for overlapping noun chunks * Skip overlapping noun chunks * Update spacy/tests/lang/nl/test_noun_chunks.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-08-10 09:49:08 +02:00
Paul O'Leary McCann	231a17817d	Clean up automated label-based issue handling (#11284) * Clean up automated label-based issue handling 1. upgrade tiangolo/issue-manager to latest 2. move needs-more-info to tiangolo 3. change needs-more-info close time to 7 days 4. delete old needs-more-info config * Use old, longer message * Fix label name	2022-08-09 14:50:50 +02:00
Adriane Boyd	e700358ba0	Add W605 to the errors raised by flake8 in the CI (#11283)	2022-08-09 12:15:13 +02:00
Adriane Boyd	fc4246558b	Fix regex invalid escape sequences (#11276)	2022-08-09 10:59:36 +02:00
stefawolf	23749cfc91	adding spans to doc_annotation in Example.to_dict (#11261) * adding spans to doc_annotation in Example.to_dict * to_dict compatible with from_dict: tuples instead of spans * use strings for label and kb_id * Simplify test * Update data formats docs Co-authored-by: Stefanie Wolf <stefanie.wolf@vitecsoftware.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-08-05 12:26:38 +02:00
Adriane Boyd	b07708d5d0	Support full prerelease versions in the compat table (#11228) * Support full prerelease versions in the compat table * Fix types	2022-08-04 15:14:19 +02:00
Jules Belveze	cd09614ab2	chore: add 'concepCy' to spacy universe (#11255) * chore: add 'concepCy' to spacy universe * docs: add 'slogan' to concepCy	2022-08-04 15:42:38 +09:00
Lj Miranda	d993df41e5	Update docs for pipeline initialize() methods (#11221) * Update documentation for dependency parser * Update documentation for trainable_lemmatizer * Update documentation for entity_linker * Update documentation for ner * Update documentation for morphologizer * Update documentation for senter * Update documentation for spancat * Update documentation for tagger * Update documentation for textcat * Update documentation for tok2vec * Run prettier on edited files * Apply similar changes in transformer docs * Remove need to say annotated example explicitly I removed the need to say "Must contain at least one annotated Example" because it's often a given that Examples will contain some gold-standard annotation. * Run prettier on transformer docs	2022-08-03 16:53:02 +02:00
Adriane Boyd	d0578c2ede	Add scorer to textcat API docs config settings (#11263)	2022-08-03 16:41:20 +02:00
Paul O'Leary McCann	2d89dd9db8	Update natto-py version spec (#11222) * Update natto-py version spec * Update setup.cfg Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-07-28 07:45:02 +02:00
ninjalu	95a1b8aca6	add additional REL_OP (#10371) * add additional REL_OP * change to condition and new rel_op symbols * add operators to docs * add the anchor while we're in here * add tests Co-authored-by: Peter Baumgartner <5107405+pmbaumgartner@users.noreply.github.com>	2022-07-27 13:16:44 +02:00
Madeesh Kannan	1829d7120a	`ExplosionBot`: Add note about case-sensitivity (#11211)	2022-07-27 14:24:22 +09:00
Edward	360a702ecd	Add parent argument (#11210)	2022-07-26 14:35:18 +02:00

15606 Commits