spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-04 02:46:40 +03:00

Author	SHA1	Message	Date
Paul O'Leary McCann	ba6a37d358	Document Assigned Attributes of Pipeline Components (#9041) * Add textcat docs * Add NER docs * Add Entity Linker docs * Add assigned fields docs for the tagger This also adds a preamble, since there wasn't one. * Add morphologizer docs * Add dependency parser docs * Update entityrecognizer docs This is a little weird because `Doc.ents` is the only thing assigned to, but it's actually a bidirectional property. * Add token fields for entityrecognizer * Fix section name * Add entity ruler docs * Add lemmatizer docs * Add sentencizer/recognizer docs * Update website/docs/api/entityrecognizer.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/entityruler.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/tagger.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/entityruler.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update type for Doc.ents This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be correct. * Run prettier * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier * Add transformers section This basically just moves and renames the "custom attributes" section from the bottom of the page to be consistent with "assigned attributes" on other pages. I looked at moving the paragraph just above the section into the section, but it includes the unrelated registry additions, so it seemed better to leave it unchanged. * Make table header consistent Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-09-01 12:09:39 +02:00
Paul O'Leary McCann	f803a84571	Fix inference of epoch_resume (#9084) * Fix inference of epoch_resume When an epoch_resume value is not specified individually, it can often be inferred from the filename. The value inference code was there but the value wasn't passed back to the training loop. This also adds a specific error in the case where no epoch_resume value is provided and it can't be inferred from the filename. * Add new error * Always use the epoch resume value if specified Before this the value in the filename was used if found	2021-09-01 14:17:42 +09:00
Sofie Van Landeghem	a17b06d18b	allow typer 0.4 (#9089)	2021-08-31 20:53:51 +10:00
svlandeg	3f16c45281	Merge branch 'spacy.io' of https://github.com/explosion/spaCy into spacy.io	2021-08-31 10:58:40 +02:00
Davide Fiocco	5c88998b9d	Fix point typo on docbin docs (#9097)	2021-08-31 10:58:31 +02:00
Ines Montani	753149bc88	Update references to contributor agreement [ci skip]	2021-08-31 10:58:22 +02:00
Davide Fiocco	1dd69be1f1	Fix point typo on docbin docs (#9097)	2021-08-31 10:55:44 +02:00
Ines Montani	1a86d545af	Update references to contributor agreement [ci skip]	2021-08-31 10:03:38 +10:00
Sofie Van Landeghem	5af88427a2	Dev docs: listeners (#9061) * Start Listeners documentation * intro tabel of different architectures * initialization, linking, dim inference * internal comm (WIP) * expand internal comm section * frozen components and replacing listeners * various small fixes * fix content table * fix link	2021-08-30 14:56:35 +02:00
Adriane Boyd	1e9b4b55ee	Pass overrides to subcommands in workflows (#9059) * Pass overrides to subcommands in workflows * Add missing docstring	2021-08-30 09:23:54 +02:00
Meenal Jhajharia	db42ba5240	benepar usage example has deprecated imports	2021-08-29 14:44:18 +09:00
Paul O'Leary McCann	6ff8d90070	Merge pull request #9081 from mjhajharia/patch-1 benepar usage example has deprecated imports	2021-08-29 14:41:52 +09:00
Meenal Jhajharia	2613f0e98f	benepar usage example has deprecated imports	2021-08-28 16:35:58 +05:30
Sofie Van Landeghem	689535c264	config is not Optional (#9024)	2021-08-27 11:53:54 +02:00
Sofie Van Landeghem	1e974de837	config is not Optional (#9024)	2021-08-27 11:44:31 +02:00
github-actions[bot]	fb9c31fbda	Auto-format code with black (#9065) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-08-27 11:42:27 +02:00
Sofie Van Landeghem	8c1d86ea92	Document use-case of freezing tok2vec (#8992) * update error msg * add sentence to docs * expand note on frozen components	2021-08-26 09:53:29 +02:00
Sofie Van Landeghem	31c0a75e6d	fix docs for Span constructor arguments (#9023)	2021-08-26 09:52:59 +02:00
Sofie Van Landeghem	4d39430b82	Document use-case of freezing tok2vec (#8992) * update error msg * add sentence to docs * expand note on frozen components	2021-08-26 09:50:35 +02:00
Sofie Van Landeghem	94fb840443	fix docs for Span constructor arguments (#9023)	2021-08-25 16:06:22 +02:00
David Strouk	31e9b126a0	Fix verbs list in lang/fr/tokenizer_exceptions.py (#9033)	2021-08-25 15:55:09 +02:00
Ines Montani	4cd052e81d	Include component factories in third-party dependencies resolver (#9009) * Include component factories in third-party dependencies resolver * Increment catalogue and update test	2021-08-25 14:58:01 +02:00
svlandeg	fb8c2f794a	Merge remote-tracking branch 'upstream/master' into spacy.io	2021-08-20 14:49:51 +02:00
Sofie Van Landeghem	e1f88de729	bump to 3.1.2 (#9008)	2021-08-20 12:41:09 +02:00
Sofie Van Landeghem	4d52d7051c	Fix spancat training on nested entities (#9007) * overfitting test on non-overlapping entities * add failing overfitting test for overlapping entities * failing test for list comprehension * remove test that was put in separate PR * bugfix * cleanup	2021-08-20 12:37:50 +02:00
Paul O'Leary McCann	9cc3dc2b67	Add glossary entry for _SP (#8983)	2021-08-20 12:04:02 +02:00
Sofie Van Landeghem	de025beb5f	Warn and document spangroup.doc weakref (#8980) * test for error after Doc has been garbage collected * warn about using a SpanGroup when the Doc has been garbage collected * add warning to the docs * rephrase slightly * raise error instead of warning * update * move warning to doc property	2021-08-20 11:06:19 +02:00
Paul O'Leary McCann	0e4da8ed70	Fix type annotation in docs	2021-08-20 15:35:41 +09:00
Paul O'Leary McCann	37fe847af4	Fix type annotation in docs	2021-08-20 15:34:22 +09:00
Ines Montani	8444aa75e2	Fix universe.json [ci skip]	2021-08-20 11:26:46 +10:00
Ines Montani	f2b61b77a5	Fix universe.json [ci skip]	2021-08-20 11:26:29 +10:00
Ines Montani	f2d19e6dc2	Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip]	2021-08-20 11:23:50 +10:00
Ines Montani	894e16f5ca	Merge pull request #9003 from bbieniek/add-spacy-api-v3 [ci skip]	2021-08-20 11:23:30 +10:00
Baltazar	4d85cb88a5	added contribution license	2021-08-19 21:45:18 +02:00
Baltazar	71e65fe943	added spacy api v3 docker	2021-08-19 21:29:25 +02:00
Adriane Boyd	c5de9b463a	Update custom tokenizer APIs and pickling (#8972) * Fix incorrect pickling of Japanese and Korean pipelines, which led to the entire pipeline being reset if pickled * Enable pickling of Vietnamese tokenizer * Update tokenizer APIs for Chinese, Japanese, Korean, Thai, and Vietnamese so that only the `Vocab` is required for initialization	2021-08-19 14:37:47 +02:00
Adriane Boyd	6722dc3dc5	Fix allow_overlap default for spancat scoring (#8970) * Remove irrelevant default options	2021-08-18 09:56:56 +02:00
Steele Farnsworth	b18cb1cd2a	Refactor dependencymatcher.pyx to use list comps and enumerate. (#8956) * Refactor to use list comps and enumerate. Replace loops that append to a list with a list comprehensions where this does not change the behavior; replace range(len(...)) loops with enumerate. Correct one typo in a comment. Replace a call to set() with a set literal. * Undo double assignment. Expand `tokens_to_key[j] = k = self._get_matcher_key(key, i, j)` to two statements. Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Sign contributors agreement Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-18 09:55:45 +02:00
Ines Montani	d94ddd5686	Auto-detect package dependencies in spacy package (#8948) * Auto-detect package dependencies in spacy package * Add simple get_third_party_dependencies test * Import packages_distributions explicitly * Inline packages_distributions * Fix docstring [ci skip] * Relax catalogue requirement * Move importlib_metadata to spacy.compat with note * Include license information [ci skip]	2021-08-17 14:05:13 +02:00
Sofie Van Landeghem	0a6b68848f	Fix making span_group (#8975) * fix _make_span_group * fix imports	2021-08-17 10:36:34 +02:00
Ines Montani	593a22cf2d	Add development docs for Language and code conventions (#8745) * WIP: add dev docs for Language / config [ci skip] * Add section on initialization [ci skip] * Fix wording [ci skip] * Add code conventions WIP [ci skip] * Update code convention docs [ci skip] * Update contributing guide and conventions [ci skip] * Update Code Conventions.md [ci skip] * Clarify sourced components + vectors * Apply suggestions from code review [ci skip] Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update wording and add link [ci skip] * restructure slightly + extended index * remove paragraph that breaks flow and is repeated in more detail later * fix anchors Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-08-17 09:38:15 +02:00
Paul O'Leary McCann	4ed5d9ad5a	Add notes on preparing training data to docs (#8964) * Add training data section Not entirely sure this is in the right location on the page - maybe it should be after quickstart? * Add pointer from binary format to training data section * Minor cleanup * Add to ToC, fix filename * Update website/docs/usage/training.md Co-authored-by: Ines Montani <ines@ines.io> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Move the training data section further down the page * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-16 17:39:19 +02:00
Paul O'Leary McCann	9391998c77	Add notes on preparing training data to docs (#8964) * Add training data section Not entirely sure this is in the right location on the page - maybe it should be after quickstart? * Add pointer from binary format to training data section * Minor cleanup * Add to ToC, fix filename * Update website/docs/usage/training.md Co-authored-by: Ines Montani <ines@ines.io> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Move the training data section further down the page * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/docs/usage/training.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Run prettier Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-08-16 17:37:21 +02:00
Ines Montani	d65e03adae	Merge pull request #8951 from HLasse/master	2021-08-16 11:41:53 +10:00
Ines Montani	a894fe0440	Merge pull request #8951 from HLasse/master	2021-08-16 11:41:32 +10:00
Lasse	839ea0f987	change tags formatting to match	2021-08-13 14:40:08 +02:00
Lasse	70ab596f61	Merge branch 'master' of https://github.com/HLasse/spaCy	2021-08-13 14:35:21 +02:00
Lasse	195e4e48c3	add textdescriptives to universe	2021-08-13 14:35:18 +02:00
github-actions[bot]	92071326d8	Auto-format code with black (#8950) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-08-13 11:48:38 +02:00
Adriane Boyd	8448c7dbc5	Update da trf recommendation (#8921) Update the da trf recommendation to the same model used in the pretrained pipelines.	2021-08-12 13:54:02 +02:00

15868 Commits