spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-08-08 14:14:57 +03:00

Author	SHA1	Message	Date
Kádár Ákos	e4b4b67ef6	handle empty clusters	2022-03-28 11:29:00 +02:00
Kádár Ákos	4fc40340f9	handle empty head_ids	2022-03-28 11:28:21 +02:00
Kádár Ákos	7304604edd	make sure predicted and reference stay aligned	2022-03-25 18:29:33 +01:00
Kádár Ákos	83ac0477c8	remove useless extra prefix and device from spanpredictor	2022-03-24 16:44:50 +01:00
Kádár Ákos	1c5dabcb47	merge SpanPredictor attributes	2022-03-24 16:23:12 +01:00
Kádár Ákos	a872c69ffb	merge	2022-03-24 16:10:04 +01:00
Kádár Ákos	706b2e6f25	gearing up SpanPredictor for gold-heads	2022-03-24 16:06:20 +01:00
Kádár Ákos	150e7c46d7	conflict	2022-03-23 11:27:02 +01:00
Kádár Ákos	1eaf8fb0cf	span predictor debug start	2022-03-23 11:24:27 +01:00
Paul O'Leary McCann	eec00ce60d	Fix various sizes in SpanPredictor FFNN	2022-03-23 16:20:31 +09:00
Paul O'Leary McCann	2190cbc0e6	Add progress on SpanPredictor component This isn't working. There is a CUDA error in the torch code during initialization and it's not clear why.	2022-03-19 19:39:49 +09:00
Kádár Ákos	db422abf01	remove unnecessary .device	2022-03-18 16:24:26 +01:00
Paul O'Leary McCann	a098849112	Add fake batching The way fake batching works is that the pipeline component calls the model repeatedly in a loop internally. It feels like this should break something, but it worked in testing. Another issue is that this changes the signature of some of the pipeline functions, though I don't think that's an issue. Tested with batch size of 2, so more testing is needed, but this is a start.	2022-03-18 19:46:58 +09:00
Paul O'Leary McCann	1a79d18796	Formatting	2022-03-16 20:10:47 +09:00
Paul O'Leary McCann	6855df0e66	Skeleton for span predictor component This should be moved into its own file, but for now just stubbing out the methods.	2022-03-16 20:09:33 +09:00
Paul O'Leary McCann	0275ae29de	Remove stale comment	2022-03-16 20:09:12 +09:00
Paul O'Leary McCann	6974f55daa	Hack for transformer listener size	2022-03-16 15:15:53 +09:00
Paul O'Leary McCann	7811a1194b	Change architecture	2022-03-16 14:57:15 +09:00
Paul O'Leary McCann	5650853c0f	Remove unused functions	2022-03-16 14:38:11 +09:00
Paul O'Leary McCann	d0ae2590db	Delete all the coref-hoi code	2022-03-15 20:05:24 +09:00
Paul O'Leary McCann	abdc7d87af	Clean up util code Moved everything into coref_util.py, deleted wl-specific file.	2022-03-15 19:59:44 +09:00
Paul O'Leary McCann	55039a66ad	Remove old default config	2022-03-15 19:53:09 +09:00
Paul O'Leary McCann	17d017a177	Remove span2head This doesn't work as a component because it needs to modify gold data, so instead it's a conversion script (in another repo).	2022-03-15 19:52:20 +09:00
Paul O'Leary McCann	0522a43116	Make span2head component	2022-03-15 19:19:15 +09:00
Paul O'Leary McCann	e6917d8dc4	Add util functions for wl-coref	2022-03-14 19:27:55 +09:00
Paul O'Leary McCann	dfec6993d6	Training works now	2022-03-14 19:27:23 +09:00
Paul O'Leary McCann	8eadf3781b	Training runs now Evaluation needs fixing, and code still needs cleanup.	2022-03-14 19:02:17 +09:00
Paul O'Leary McCann	d22a002641	Forward/backward pass works Evaluate does not work - predict hasn't been updated	2022-03-14 17:26:27 +09:00
Paul O'Leary McCann	c4f9c24738	The coref model is able to be loaded The span predictor component is initialized but not used at all now. Plan is to work on it after the word level clustering part is trainable end-to-end.	2022-03-09 19:31:11 +09:00
Paul O'Leary McCann	35cc2b138f	Add span predictor code Accidentally omitted before	2022-03-08 18:13:26 +09:00
Paul O'Leary McCann	1c697b4011	Remove references to config Replaced with model arguments	2022-03-08 18:13:09 +09:00
Paul O'Leary McCann	c0cd5025e3	Start bringing in wl-coref This absolutely does not work. First step here is getting over most of the code in roughly the files we want it in. After the code has been pulled over it can be restructured to match spaCy and cleaned up.	2022-03-06 20:00:15 +09:00
svlandeg	0c15ab7ca1	remove irrelevant unit test (behaviour clarified by new error msgs around doc.spans)	2022-02-07 12:17:18 +01:00
Paul O'Leary McCann	c7f586c4ba	Merge branch 'master' into feature/coref This brings coref up to date, in particular giving access to 3.2 features.	2022-02-03 19:01:18 +09:00
Lj Miranda	345e7f6bc4	Clarify Span.ents documentation (#10154 ) * Clarify Span.ents documentation Ref: #10135 Retain current behaviour. Span.ents will only include entities within said span. You can't get tokens outside of the original span. * Reword docstrings Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update API docs in the website Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-01-31 08:41:42 +01:00
Marek Šuppa	f09c799a96	fix: Add missing comma to `_eleven_to_beyond` (#10166 ) * This comma has most probably been left out unintentionally, leading to string concatenation between the two consecutive lines. This issue has been found automatically using a regular expression.	2022-01-30 16:45:06 +09:00
Marek Šuppa	67ecac633f	fix: Add missing comma to `examples.py` (#10167 ) * This comma has most probably been left out unintentionally, leading to string concatenation between the two consecutive lines. This issue has been found automatically using a regular expression.	2022-01-30 16:43:29 +09:00
Adriane Boyd	4f441dfa24	Fix infix as prefix in Tokenizer.explain (#10140 ) * Fix infix as prefix in Tokenizer.explain Update `Tokenizer.explain` to align with the `Tokenizer` algorithm: * skip infix matches that are prefixes in the current substring * Update tokenizer pseudocode in docs	2022-01-28 17:00:54 +01:00
Eduard Zorita	30cf9d6a05	Update typing hints (#10109 ) * Improve typing hints for Matcher.__call__ * Add typing hints for DependencyMatcher * Add typing hints to underscore extensions * Update Doc.tensor type (requires numpy 1.21) * Fix typing hints for Language.component decorator * Use generic np.ndarray type in Doc to avoid numpy version update * Fix mypy errors * Fix cyclic import caused by Underscore typing hints * Use Literal type from spacy.compat * Update matcher.pyi import format Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-01-28 16:59:54 +01:00
Adriane Boyd	09734c56fc	Use simple suggester for spancat initialization (#10143 ) Instead of running the actual suggester, which may require annotation from annotating components that is not necessarily present in the reference docs, use the built-in 1-gram suggester.	2022-01-28 09:34:23 +01:00
github-actions[bot]	6d4db5c3c7	Auto-format code with black (#10106 ) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2022-01-21 10:01:10 +01:00
Ines Montani	34ed93ef68	Support version tags in universe and add note about reporting (#10093 ) * Support version tags in universe and add note about reporting * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-01-20 23:21:26 +01:00
Peter Baumgartner	a69005037a	Docker Image for Website Dev (#10098 ) * add docker instructions * Update website/README.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/README.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * clarifying language on docker image * fix markdown formatting Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-01-20 23:02:13 +01:00
Duygu Altinok	47a2916801	Intify IOB (#9738 ) * added iob to int * added tests * added iob strings * added error * blacked attrs * Update spacy/tests/lang/test_attrs.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/attrs.pyx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * added iob strings as global * minor refinement with iob * removed iob strings from token * changed to uppercase * cleaned and went back to master version * imported iob from attrs * Update and format errors * Support and test both str and int ENT_IOB key Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-01-20 13:19:38 +01:00
Duygu Altinok	268ddf8a06	Add ENT_IOB key to Matcher (#9649 ) * added new field * added exception for IOb strings * minor refinement to schema * removed field * fixed typo * imported numeriacla val * changed the code bit * cosmetics * added test for matcher * set ents of moc docs * added invalid pattern * minor update to documentation * blacked matcher * added pattern validation * add IOB vals to schema * changed into test * mypy compat * cleaned left over * added compat import * changed type * added compat import * changed literal a bit * went back to old * made explicit type * Update spacy/schemas.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/schemas.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/schemas.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-01-20 13:18:39 +01:00
Paul O'Leary McCann	32bd3856b3	Rename FACILITY to FAC in color list (#10067 ) This matches the English models	2022-01-20 12:00:28 +01:00
Adriane Boyd	a55212fca0	Determine labels by factory name in debug data (#10079 ) * Determine labels by factory name in debug data For all components, return labels for all components with the corresponding factory name rather than for only the default name. For `spancat`, return labels as a dict keyed by `spans_key`. * Refactor for typing * Add test * Use assert instead of cast, removed unneeded arg * Mark test as slow	2022-01-20 11:42:52 +01:00
Richard Hudson	e9c6314539	Bugfix for similarity return types (#10051 )	2022-01-20 11:40:46 +01:00
Adriane Boyd	7d528e607c	Update quickstart install steps (#10092 ) * For conda: * Use conda environment rather than venv * Install `spacy-transformers` as a conda package * For pip: * Add quotes if extras are included	2022-01-20 10:53:40 +01:00
Paul O'Leary McCann	2ff53834bb	Add link to pattern file info in EntityRuler.initialize docs (#10091 ) * Add link to pattern file info in EntityRuler.initialize docs * Update website/docs/api/entityruler.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-01-19 10:45:11 +01:00

15360 Commits