spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-11-11 05:19:52 +03:00

Author	SHA1	Message	Date
Adriane Boyd	084fc575aa	Set version to v3.0.0rc3	2020-11-03 17:29:57 +01:00
Adriane Boyd	1c4df8fd09	Replace pytokenizations with internal alignment (#6293) * Replace pytokenizations with internal alignment Replace pytokenizations with internal alignment algorithm that is restricted to only allow differences in whitespace and capitalization. * Rename `spacy.training.align` to `spacy.training.alignment` to contain the `Alignment` dataclass * Implement `get_alignments` in `spacy.training.align` * Refactor trailing whitespace handling * Remove unnecessary exception for empty docs Allow a non-empty whitespace-only doc to be aligned with an empty doc * Remove empty docs exceptions completely	2020-11-03 16:24:38 +01:00
Adriane Boyd	a4b32b9552	Handle missing reference values in scorer (#6286) * Handle missing reference values in scorer Handle missing values in reference doc during scoring where it is possible to detect an unset state for the attribute. If no reference docs contain annotation, `None` is returned instead of a score. `spacy evaluate` displays `-` for missing scores and the missing scores are saved as `None`/`null` in the metrics. Attributes without unset states: * `token.head`: relies on `token.dep` to recognize unset values * `doc.cats`: unable to handle missing annotation Additional changes: * add optional `has_annotation` check to `score_spans` to replace `doc.sents` hack * update `score_token_attr_per_feat` to handle missing and empty morph representations * fix bug in `Doc.has_annotation` for normalization of `IS_SENT_START` vs. `SENT_START` * Fix import * Update return types	2020-11-03 15:47:18 +01:00
Adriane Boyd	5d2cb86c34	Fix on_match callback for DependencyMatcher (#6313) Fix `DependencyMatcher` so that the callback is called only once per match.	2020-10-31 12:20:27 +01:00
Sofie Van Landeghem	2918923541	fix resolving of dot notation (#6326)	2020-10-31 12:17:06 +01:00
Adriane Boyd	dc816bba9d	Fix node name typo in dependency matcher example (#6311)	2020-10-28 16:32:46 +01:00
Sofie Van Landeghem	ace6ae435b	set pydantic upper pin to 1.7 for now (#6308)	2020-10-26 23:31:08 +01:00
Ines Montani	2c9804038d	Fix success message [ci skip]	2020-10-23 16:11:54 +02:00
Ines Montani	270c836bd6	Merge pull request #6276 from adrianeboyd/chore/add-jinja2	2020-10-20 10:05:53 +02:00
Ines Montani	6523f2daac	Merge pull request #6273 from adrianeboyd/bugfix/detailed-scores-in-evaluate2	2020-10-20 10:03:09 +02:00
Adriane Boyd	3629296757	Fix requirements, remove version pins	2020-10-19 19:04:42 +02:00
Adriane Boyd	56077e7e64	Add dependency for jinja2	2020-10-19 18:58:15 +02:00
Adriane Boyd	fbe65b257b	Convert accuracy numbers on website models page	2020-10-19 18:55:55 +02:00
Ines Montani	b6b1c1e23c	Merge pull request #6271 from walterhenry/develop-proof [ci skip]	2020-10-19 16:31:43 +02:00
Adriane Boyd	563a21834e	Save raw scores in evaluate output	2020-10-19 15:49:09 +02:00
Adriane Boyd	dd207ca6d0	Add dep_las_per_type and more generic PRF printer	2020-10-19 15:49:02 +02:00
Adriane Boyd	4300858ecb	Include per-type/feat scores in evaluate output	2020-10-19 15:48:55 +02:00
walterhenry	db24dc5614	Proofread remarks I think these may the last remarks for the nightly docs. Only two minor things actually.	2020-10-19 11:11:32 +02:00
Sofie Van Landeghem	75a202ce65	TextCat updates and fixes (#6263) * small fix in example imports * throw error when train_corpus or dev_corpus is not a string * small fix in custom logger example * limit macro_auc to labels with 2 annotations * fix typo * also create parents of output_dir if need be * update documentation of textcat scores * refactor TextCatEnsemble * fix tests for new AUC definition * bump to 3.0.0a42 * update docs * rename to spacy.TextCatEnsemble.v2 * spacy.TextCatEnsemble.v1 in legacy * cleanup * small fix * update to 3.0.0rc2 * fix import that got lost in merge * cursed IDE * fix two typos	2020-10-18 14:50:41 +02:00
Ines Montani	e2f3c4e12d	Fix robots [ci skip]	2020-10-16 17:44:13 +02:00
Ines Montani	a9d2293661	Merge pull request #6264 from adrianeboyd/docs/license-links [ci skip]	2020-10-16 17:05:11 +02:00
Adriane Boyd	e896803792	Add and update website license links	2020-10-16 17:01:52 +02:00
Ines Montani	c655742b8b	Remove docs references to starters for now (see #6262) [ci skip]	2020-10-16 15:46:34 +02:00
Ines Montani	5a6ed01ce0	Merge pull request #6262 from adrianeboyd/bugfix/template-en-vectors	2020-10-16 15:38:08 +02:00
Ines Montani	7904285991	Merge pull request #6259 from jmargeta/fix-empty-list-validation	2020-10-16 15:35:32 +02:00
Ines Montani	c968d1560f	Fix docs example [ci skip]	2020-10-16 11:33:20 +02:00
Adriane Boyd	c8d04b79e2	Sort and add vectors for langs without transformers	2020-10-16 08:25:16 +02:00
Adriane Boyd	2fbd43c603	Use core lg models as vectors models in quickstart	2020-10-16 08:17:53 +02:00
Jan Margeta	1ad2213349	Fix TokenPatternSchema pattern field validation Empty pattern field should be considered invalid This is fixed by replacing minItems with min_items as described in Pydantic docs: https://pydantic-docs.helpmanual.io/usage/schema/	2020-10-16 00:41:21 +02:00
Jan Margeta	ed1c37189a	Add contributor agreement for jmargeta	2020-10-16 00:38:42 +02:00
Ines Montani	ba1e004049	Fix typo [ci skip]	2020-10-15 23:39:04 +02:00
Ines Montani	32dc4f4796	Sort models sidebar alphabetically [ci skip]	2020-10-15 22:47:16 +02:00
Ines Montani	20f80587d6	Merge pull request #6257 from walterhenry/develop-proof A few tiny typo fixes to push through with release of nightly	2020-10-15 18:17:30 +02:00
walterhenry	75b7f86383	Three small typos Some little typos since v3.0 is out.	2020-10-15 18:06:37 +02:00
Ines Montani	09dbbe75d7	Update docs [ci skip]	2020-10-15 17:27:24 +02:00
Ines Montani	ff4267d181	Fix success message [ci skip]	2020-10-15 14:42:08 +02:00
Ines Montani	10611bf56a	Increment version [ci skip]	2020-10-15 13:30:11 +02:00
Ines Montani	7f05ccc170	Update docs [ci skip]	2020-10-15 12:35:30 +02:00
Ines Montani	4fa869e6f7	Update docs [ci skip]	2020-10-15 11:16:06 +02:00
Ines Montani	4e17ddf75e	Merge pull request #6256 from adrianeboyd/bugfix/docs-to-json-raw	2020-10-15 10:35:01 +02:00
Ines Montani	b1d568a4df	Tidy up tests	2020-10-15 10:20:21 +02:00
Ines Montani	d165af26be	Auto-format [ci skip]	2020-10-15 10:08:53 +02:00
Ines Montani	db16059f9b	Merge pull request #6255 from explosion/master-tmp	2020-10-15 10:04:07 +02:00
Adriane Boyd	a93d42861d	Use null raw for has_unknown_spaces in docs_to_json	2020-10-15 09:57:54 +02:00
Ines Montani	5665a21517	Tidy up	2020-10-15 09:30:32 +02:00
Ines Montani	5d62499266	Fix tests	2020-10-15 09:29:15 +02:00
Ines Montani	178760855f	Merge branch 'develop' into master-tmp	2020-10-15 09:06:03 +02:00
Ines Montani	abeafcbc08	Update docs [ci skip]	2020-10-15 08:58:30 +02:00
Ines Montani	bc85b12e6d	Merge pull request #6249 from svlandeg/feature/batch-tests	2020-10-15 08:57:56 +02:00
Ines Montani	050aa1e0e2	Update languages.json [ci skip]	2020-10-14 20:51:50 +02:00

13759 Commits