spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-14 05:37:03 +03:00

Author	SHA1	Message	Date
Sofie Van Landeghem	837a4f53c2	Error handling in nlp.pipe (#6817) * add error handler for pipe methods * add unit tests * remove pipe method that are the same as their base class * have Language keep track of a default error handler * cleanup * formatting * small refactor * add documentation	2021-01-29 08:51:21 +08:00
Adriane Boyd	80ac8af1bf	Format	2020-12-09 12:44:01 +01:00
Adriane Boyd	795b5bd049	Update website/docs/api/language.md Co-authored-by: Ines Montani <ines@ines.io>	2020-12-09 12:23:32 +01:00
Adriane Boyd	fa8fa474a3	Add nlp.batch_size setting Add a default `batch_size` setting for `Language.pipe` and `Language.evaluate` as `nlp.batch_size`.	2020-12-09 09:13:26 +01:00
Ines Montani	363ac73c72	Update docs [ci skip]	2020-11-09 12:43:26 +08:00
svlandeg	eaf5c265cb	set_kb method for entity_linker	2020-10-08 10:34:01 +02:00
Ines Montani	df06f7a792	Update docs [ci skip]	2020-10-02 13:24:33 +02:00
Ines Montani	f2627157c8	Update docs [ci skip]	2020-10-01 17:38:17 +02:00
Ines Montani	d3c63b7965	Merge branch 'develop' into feature/prepare	2020-09-29 20:53:05 +02:00
Ines Montani	d7469283c5	Update docs [ci skip]	2020-09-29 16:59:21 +02:00
Ines Montani	ff9a63bfbd	begin_training -> initialize	2020-09-28 21:35:09 +02:00
Ines Montani	b92c8aae78	Merge branch 'develop' into pr/6135	2020-09-24 13:44:56 +02:00
walterhenry	3dd5f409ec	Proofreading Proofread some API docs	2020-09-24 13:15:28 +02:00
Ines Montani	ae51f580c1	Fix handling of score_weights	2020-09-24 10:27:33 +02:00
Ines Montani	f9af7d365c	Update docs [ci skip]	2020-09-22 09:45:41 +02:00
Ines Montani	0edd695bf6	Update docs	2020-09-15 11:41:49 +02:00
Ines Montani	99549a5ace	Fix consistency and update docs	2020-09-15 11:37:37 +02:00
Ines Montani	8b0dabe987	Update docs [ci skip]	2020-09-12 17:05:10 +02:00
svlandeg	9073d99fc9	fix link to shape inference section	2020-09-10 10:22:59 +02:00
svlandeg	a8aa9a8068	document Pipe API details, crossreferences etc	2020-09-09 15:56:27 +02:00
Ines Montani	25a595dc10	Fix typos and wording [ci skip]	2020-09-03 16:37:45 +02:00
Ines Montani	b5a0657fd6	"model" terminology consistency in docs	2020-09-03 13:13:03 +02:00
Ines Montani	66d76f5126	Update docs	2020-08-29 12:36:05 +02:00
Ines Montani	82f0e20318	Update docs and consistency [ci skip]	2020-08-18 14:39:40 +02:00
Ines Montani	1c3bcfb488	Update docs and util consistency	2020-08-18 01:22:59 +02:00
Ines Montani	3ae5e02f4f	Update docs, types and API consistency	2020-08-17 16:45:24 +02:00
Ines Montani	b7ec06e331	Update docs [ci skip]	2020-08-11 20:57:23 +02:00
Ines Montani	c044460823	Update docs [ci skip]	2020-08-10 00:01:38 +02:00
Ines Montani	cdec46493f	Update docs	2020-08-05 15:00:54 +02:00
Ines Montani	b40f44419b	Simplify pipe analysis - remove unused code - don't print by default - integrate attrs info into analysis output	2020-08-01 13:40:06 +02:00
Ines Montani	98c6a85c8b	Update docs [ci skip]	2020-07-31 18:55:38 +02:00
Ines Montani	e9e8fa2466	Update docs and types	2020-07-31 17:02:54 +02:00
Adriane Boyd	9b509aa87f	Move Language.evaluate scorer config to new arg Move `Language.evaluate` scorer config from `component_cfg` to separate argument `scorer_cfg`.	2020-07-31 11:05:16 +02:00
Ines Montani	b0f57a0cac	Update docs and consistency	2020-07-29 15:14:07 +02:00
Ines Montani	e0ffe36e79	Update docstrings, docs and types	2020-07-29 11:36:42 +02:00
Ines Montani	ae4d8a6ffd	Update docstrings, docs and pipe consistency	2020-07-28 13:37:31 +02:00
Ines Montani	0094cb0d04	Remove scores list from config and document	2020-07-28 11:22:24 +02:00
Ines Montani	d8b519c23c	API docs, docstrings and argument consistency	2020-07-27 18:11:45 +02:00
Ines Montani	7adbaf9a5b	Update docs [ci skip]	2020-07-27 00:29:45 +02:00
Ines Montani	c288dba8e7	Update docs [ci skip]	2020-07-25 18:51:12 +02:00
Adriane Boyd	2bcceb80c4	Refactor the Scorer to improve flexibility (#5731) * Refactor the Scorer to improve flexibility Refactor the `Scorer` to improve flexibility for arbitrary pipeline components. * Individual pipeline components provide their own `evaluate` methods that score a list of `Example`s and return a dictionary of scores * `Scorer` is initialized either: * with a provided pipeline containing components to be scored * with a default pipeline containing the built-in statistical components (senter, tagger, morphologizer, parser, ner) * `Scorer.score` evaluates a list of `Example`s and returns a dictionary of scores referring to the scores provided by the components in the pipeline Significant differences: * `tags_acc` is renamed to `tag_acc` to be consistent with `token_acc` and the new `morph_acc`, `pos_acc`, and `lemma_acc` * Scoring is no longer cumulative: `Scorer.score` scores a list of examples rather than a single example and does not retain any state about previously scored examples * PRF values in the returned scores are no longer multiplied by 100 * Add kwargs to Morphologizer.evaluate * Create generalized scoring methods in Scorer * Generalized static scoring methods are added to `Scorer` * Methods require an attribute (either on Token or Doc) that is used to key the returned scores Naming differences: * `uas`, `las`, and `las_per_type` in the scores dict are renamed to `dep_uas`, `dep_las`, and `dep_las_per_type` Scoring differences: * `Doc.sents` is now scored as spans rather than on sentence-initial token positions so that `Doc.sents` and `Doc.ents` can be scored with the same method (this lowers scores since a single incorrect sentence start results in two incorrect spans) * Simplify / extend hasattr check for eval method * Add hasattr check to tokenizer scoring * Simplify to hasattr check for component scoring * Reset Example alignment if docs are set Reset the Example alignment if either doc is set in case the tokenization has changed. * Add PRF tokenization scoring for tokens as spans Add PRF scores for tokens as character spans. The scores are: * token_acc: # correct tokens / # gold tokens * token_p/r/f: PRF for (token.idx, token.idx + len(token)) * Add docstring to Scorer.score_tokenization * Rename component.evaluate() to component.score() * Update Scorer API docs * Update scoring for positive_label in textcat * Fix TextCategorizer.score kwargs * Update Language.evaluate docs * Update score names in default config	2020-07-25 12:53:02 +02:00
svlandeg	c94279ac1b	remove tensors, fix predict, get_loss and set_annotations	2020-07-08 13:11:54 +02:00
svlandeg	90b100c39f	remove component.Model, update constructor, losses is return value of update	2020-07-08 12:14:30 +02:00
svlandeg	2b60e894cb	fix component constructors, update, begin_training, reference to GoldParse	2020-07-07 19:17:19 +02:00
Ines Montani	4498dfe99d	Update docs	2020-07-04 16:25:30 +02:00
Ines Montani	a4cfe9fc33	Remove inline notes on v2 changes [ci skip]	2020-07-01 22:29:22 +02:00
Ines Montani	fe4cfd0632	Start updating website for v3 [ci skip]	2020-07-01 21:26:39 +02:00
Ines Montani	810fce3bb1	Merge branch 'develop' into master-tmp	2020-06-03 14:36:59 +02:00
Ines Montani	262d306eaa	unicode -> str consistency	2020-05-24 17:23:00 +02:00
Ines Montani	24f72c669c	Merge branch 'develop' into master-tmp	2020-05-21 18:39:06 +02:00

66 Commits