spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-01-24 01:04:03 +03:00

Author	SHA1	Message	Date
mylibrar	d621df6422	Update example code of forte (#9175) Co-authored-by: Suqi Sun <suqi.sun@petuum.com>	2021-09-11 13:25:17 +09:00
Sofie Van Landeghem	721f4554c8	matcher doc corrections (#9115) * update error message to current UX * clarify uppercase effect * fix docstring	2021-09-02 09:29:44 +02:00
Paul O'Leary McCann	752696f134	Document Assigned Attributes of Pipeline Components (#9041) * Add textcat docs * Add NER docs * Add Entity Linker docs * Add assigned fields docs for the tagger This also adds a preamble, since there wasn't one. * Add morphologizer docs * Add dependency parser docs * Update entityrecognizer docs This is a little weird because `Doc.ents` is the only thing assigned to, but it's actually a bidirectional property. * Add token fields for entityrecognizer * Fix section name * Add entity ruler docs * Add lemmatizer docs * Add sentencizer/recognizer docs * Update website/docs/api/entityrecognizer.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/entityruler.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/tagger.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/entityruler.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update type for Doc.ents This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be correct. * Run prettier * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier * Add transformers section This basically just moves and renames the "custom attributes" section from the bottom of the page to be consistent with "assigned attributes" on other pages. I looked at moving the paragraph just above the section into the section, but it includes the unrelated registry additions, so it seemed better to leave it unchanged. * Make table header consistent Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-09-01 12:10:52 +02:00
svlandeg	3f16c45281	Merge branch 'spacy.io' of https://github.com/explosion/spaCy into spacy.io	2021-08-31 10:58:40 +02:00
Davide Fiocco	5c88998b9d	Fix point typo on docbin docs (#9097)	2021-08-31 10:58:31 +02:00
Ines Montani	753149bc88	Update references to contributor agreement [ci skip]	2021-08-31 10:58:22 +02:00
Meenal Jhajharia	db42ba5240	benepar usage example has deprecated imports	2021-08-29 14:44:18 +09:00
Sofie Van Landeghem	689535c264	config is not Optional (#9024)	2021-08-27 11:53:54 +02:00
Sofie Van Landeghem	8c1d86ea92	Document use-case of freezing tok2vec (#8992) * update error msg * add sentence to docs * expand note on frozen components	2021-08-26 09:53:29 +02:00
Sofie Van Landeghem	31c0a75e6d	fix docs for Span constructor arguments (#9023)	2021-08-26 09:52:59 +02:00
svlandeg	fb8c2f794a	Merge remote-tracking branch 'upstream/master' into spacy.io	2021-08-20 14:49:51 +02:00
Sofie Van Landeghem	e1f88de729	bump to 3.1.2 (#9008)	2021-08-20 12:41:09 +02:00
Sofie Van Landeghem	4d52d7051c	Fix spancat training on nested entities (#9007) * overfitting test on non-overlapping entities * add failing overfitting test for overlapping entities * failing test for list comprehension * remove test that was put in separate PR * bugfix * cleanup	2021-08-20 12:37:50 +02:00
Paul O'Leary McCann	9cc3dc2b67	Add glossary entry for _SP (#8983)	2021-08-20 12:04:02 +02:00
Sofie Van Landeghem	de025beb5f	Warn and document spangroup.doc weakref (#8980) * test for error after Doc has been garbage collected * warn about using a SpanGroup when the Doc has been garbage collected * add warning to the docs * rephrase slightly * raise error instead of warning * update * move warning to doc property	2021-08-20 11:06:19 +02:00
Paul O'Leary McCann	0e4da8ed70	Fix type annotation in docs	2021-08-20 15:35:41 +09:00
Paul O'Leary McCann	37fe847af4	Fix type annotation in docs	2021-08-20 15:34:22 +09:00
Ines Montani	8444aa75e2	Fix universe.json [ci skip]	2021-08-20 11:26:46 +10:00
Ines Montani	f2b61b77a5	Fix universe.json [ci skip]	2021-08-20 11:26:29 +10:00
Ines Montani	f2d19e6dc2	Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip]	2021-08-20 11:23:50 +10:00
Ines Montani	894e16f5ca	Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip]	2021-08-20 11:23:30 +10:00
Baltazar	4d85cb88a5	added contribution license	2021-08-19 21:45:18 +02:00
Baltazar	71e65fe943	added spacy api v3 docker	2021-08-19 21:29:25 +02:00
Adriane Boyd	6722dc3dc5	Fix allow_overlap default for spancat scoring (#8970) * Remove irrelevant default options	2021-08-18 09:56:56 +02:00
Steele Farnsworth	b18cb1cd2a	Refactor dependencymatcher.pyx to use list comps and enumerate. (#8956 ) * Refactor to use list comps and enumerate. Replace loops that append to a list with a list comprehensions where this does not change the behavior; replace range(len(...)) loops with enumerate. Correct one typo in a comment. Replace a call to set() with a set literal. * Undo double assignment. Expand `tokens_to_key[j] = k = self._get_matcher_key(key, i, j)` to two statements. Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Sign contributors agreement Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-18 09:55:45 +02:00
Ines Montani	d94ddd5686	Auto-detect package dependencies in spacy package (#8948) * Auto-detect package dependencies in spacy package * Add simple get_third_party_dependencies test * Import packages_distributions explicitly * Inline packages_distributions * Fix docstring [ci skip] * Relax catalogue requirement * Move importlib_metadata to spacy.compat with note * Include license information [ci skip]	2021-08-17 14:05:13 +02:00
Sofie Van Landeghem	0a6b68848f	Fix making span_group (#8975) * fix _make_span_group * fix imports	2021-08-17 10:36:34 +02:00
Ines Montani	593a22cf2d	Add development docs for Language and code conventions (#8745 ) * WIP: add dev docs for Language / config [ci skip] * Add section on initialization [ci skip] * Fix wording [ci skip] * Add code conventions WIP [ci skip] * Update code convention docs [ci skip] * Update contributing guide and conventions [ci skip] * Update Code Conventions.md [ci skip] * Clarify sourced components + vectors * Apply suggestions from code review [ci skip] Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update wording and add link [ci skip] * restructure slightly + extended index * remove paragraph that breaks flow and is repeated in more detail later * fix anchors Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-08-17 09:38:15 +02:00
Paul O'Leary McCann	4ed5d9ad5a	Add notes on preparing training data to docs (#8964 ) * Add training data section Not entirely sure this is in the right location on the page - maybe it should be after quickstart? * Add pointer from binary format to training data section * Minor cleanup * Add to ToC, fix filename * Update website/docs/usage/training.md Co-authored-by: Ines Montani <ines@ines.io> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Move the training data section further down the page * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-16 17:39:19 +02:00
Paul O'Leary McCann	9391998c77	Add notes on preparing training data to docs (#8964 ) * Add training data section Not entirely sure this is in the right location on the page - maybe it should be after quickstart? * Add pointer from binary format to training data section * Minor cleanup * Add to ToC, fix filename * Update website/docs/usage/training.md Co-authored-by: Ines Montani <ines@ines.io> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Move the training data section further down the page * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-16 17:37:21 +02:00
Ines Montani	d65e03adae	Merge pull request #8951 from HLasse/master	2021-08-16 11:41:53 +10:00
Ines Montani	a894fe0440	Merge pull request #8951 from HLasse/master	2021-08-16 11:41:32 +10:00
Lasse	839ea0f987	change tags formatting to match	2021-08-13 14:40:08 +02:00
Lasse	70ab596f61	Merge branch 'master' of https://github.com/HLasse/spaCy	2021-08-13 14:35:21 +02:00
Lasse	195e4e48c3	add textdescriptives to universe	2021-08-13 14:35:18 +02:00
github-actions[bot]	92071326d8	Auto-format code with black (#8950) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-08-13 11:48:38 +02:00
Adriane Boyd	8448c7dbc5	Update da trf recommendation (#8921) Update the da trf recommendation to the same model used in the pretrained pipelines.	2021-08-12 13:54:02 +02:00
Ines Montani	647abe186c	Merge pull request #8938 from explosion/docs/prodigy-v1-11-project [ci skip] Update Prodigy project template for v1.11	2021-08-12 21:17:14 +10:00
Ines Montani	6260f044cc	Merge pull request #8938 from explosion/docs/prodigy-v1-11-project [ci skip] Update Prodigy project template for v1.11	2021-08-12 21:16:49 +10:00
Ines Montani	4f769ff913	Update Prodigy project template for v1.11 [ci skip]	2021-08-12 13:46:20 +10:00
Paul O'Leary McCann	e227d24d43	Allow passing in array vars for speedup (#8882) * Allow passing in array vars for speedup This fixes #8845. Not sure about the docstring changes here... * Update docs Types maybe need more detail? Maybe not? * Run prettier on docs * Update spacy/tokens/span.pyx Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-10 15:13:53 +02:00
Paul O'Leary McCann	6029cfc391	Add scores to output in spancat (#8855) * Add scores to output in spancat This exposes the scores as an attribute on the SpanGroup. Includes a basic test. * Add basic doc note * Vectorize score calcs * Add "annotation format" section * Update website/docs/api/spancategorizer.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Clean up doc section * Ran prettier on docs * Get arrays off the gpu before iterating over them * Remove int() calls Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-08-10 13:47:49 +02:00
Ines Montani	c581848cbb	Merge pull request #8910 from DuyguA/patch-1 [ci skip] updated unv json for new book	2021-08-09 23:13:17 +10:00
Ines Montani	a1e9f19460	Merge pull request #8910 from DuyguA/patch-1 [ci skip] updated unv json for new book	2021-08-09 23:12:50 +10:00
Paul O'Leary McCann	35255786a1	Fix #8902 (bad link in docs) typo fix	2021-08-09 13:59:59 +02:00
Duygu Altinok	380b2817cf	updated unv json for new book	2021-08-09 12:39:22 +02:00
Paul O'Leary McCann	cac298471f	Fix #8902 (bad link in docs) typo fix	2021-08-08 22:04:00 +09:00
Eduard Zorita	439f30faad	Add stub files for main cython classes (#8427) * Add stub files for main API classes * Add contributor agreement for ezorita * Update types for ndarray and hash() * Fix __getitem__ and __iter__ * Add attributes of Doc and Token classes * Overload type hints for Span.__getitem__ * Fix type hint overload for Span.__getitem__ Co-authored-by: Luca Dorigo <dorigoluca@gmail.com>	2021-08-07 12:30:03 +02:00
github-actions[bot]	56d4d87aeb	Auto-format code with black (#8895) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-08-06 13:38:06 +02:00
Kabir Khan	1dfffe5fb4	No output info message in train (#8885) * Add info message that no output directory was provided in train * Update train.py * Fix logging	2021-08-05 09:21:22 +02:00

14946 Commits