spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-03-06 04:41:32 +03:00

Patrick J. Burns 5ae63b1fbd Add Latin language support (#11349)	2022-08-30 14:04:54 +02:00

* Add lang folder for la (Latin)
* Add Latin lang classes
* Add minimal tokenizer exceptions
* Add minimal stopwords
* Add minimal lex_attrs
* Update stopwords, tokenizer exceptions
* Add la tests; register la_tokenizer in conftest.py
* Update spacy/lang/la/lex_attrs.py
  Remove duplicate form in Latin lex_attrs
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update natto-py version spec (#11222)
* Update natto-py version spec
* Update setup.cfg
  Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Add scorer to textcat API docs config settings (#11263)
* Update docs for pipeline initialize() methods (#11221)
* Update documentation for dependency parser
* Update documentation for trainable_lemmatizer
* Update documentation for entity_linker
* Update documentation for ner
* Update documentation for morphologizer
* Update documentation for senter
* Update documentation for spancat
* Update documentation for tagger
* Update documentation for textcat
* Update documentation for tok2vec
* Run prettier on edited files
* Apply similar changes in transformer docs
* Remove need to say annotated example explicitly
  I removed the need to say "Must contain at least one annotated Example" because it's often a given that Examples will contain some gold-standard annotation.
* Run prettier on transformer docs
* chore: add 'concepCy' to spacy universe (#11255)
* chore: add 'concepCy' to spacy universe
* docs: add 'slogan' to concepCy
* Support full prerelease versions in the compat table (#11228)
* Support full prerelease versions in the compat table
* Fix types
* adding spans to doc_annotation in Example.to_dict (#11261)
* adding spans to doc_annotation in Example.to_dict
* to_dict compatible with from_dict: tuples instead of spans
* use strings for label and kb_id
* Simplify test
* Update data formats docs
  Co-authored-by: Stefanie Wolf <stefanie.wolf@vitecsoftware.com>
  Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Fix regex invalid escape sequences (#11276)
* Add W605 to the errors raised by flake8 in the CI (#11283)
* Clean up automated label-based issue handling (#11284)
* Clean up automated label-based issue handline
  1. upgrade tiangolo/issue-manager to latest
  2. move needs-more-info to tiangolo
  3. change needs-more-info close time to 7 days
  4. delete old needs-more-info config
* Use old, longer message
* Fix label name
* Fix Dutch noun chunks to skip overlapping spans (#11275)
* Add test for overlapping noun chunks
* Skip overlapping noun chunks
* Update spacy/tests/lang/nl/test_noun_chunks.py
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Docs: displaCy documentation - data types, `parse_{deps,ents,spans}`, spans example (#10950)
* add in spans example and parse references
* rm autoformatter
* rm extra ents copy
* TypedDict draft
* type fixes
* restore non-documentation files
* docs update
* fix spans example
* fix hyperlinks
* add parse example
* example fix + argument fix
* fix api arg in docs
* fix bad variable replacement
* fix spacing in style
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* fix spacing on table
* fix spacing on table
* rm temp files
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* include span_ruler for default warning filter (#11333)
* Add uk pipelines to website (#11332)
* Check for . in factory names (#11336)
* Make fixes for PR #11349
* Fix roman numeral coverage in #11349

Co-authored-by: Patrick J. Burns <patricks@diyclassics.org>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Lj Miranda <12949683+ljvmiranda921@users.noreply.github.com>
Co-authored-by: Jules Belveze <32683010+JulesBelveze@users.noreply.github.com>
Co-authored-by: stefawolf <wlf.ste@gmail.com>
Co-authored-by: Stefanie Wolf <stefanie.wolf@vitecsoftware.com>
Co-authored-by: Peter Baumgartner <5107405+pmbaumgartner@users.noreply.github.com>

architectures.md	Remove `simply` (#11017)	2022-06-27 09:45:22 +02:00
attributeruler.md	Document scorers in registry and components from #8766 (#8929)	2021-08-12 12:50:03 +02:00
attributes.md	Add API docs for token attribute symbols (#10836)	2022-06-23 08:16:38 +02:00
cli.md	Remove NBSP's across tables in the docs (#10842)	2022-05-25 09:48:39 +02:00
corpus.md	Remove NBSP's across tables in the docs (#10842)	2022-05-25 09:48:39 +02:00
cython-classes.md	Update docs, types and API consistency	2020-08-17 16:45:24 +02:00
cython-structs.md	Update docs, types and API consistency	2020-08-17 16:45:24 +02:00
cython.md	Update docs [ci skip]	2020-09-12 17:05:10 +02:00
data-formats.md	adding spans to doc_annotation in Example.to_dict (#11261)	2022-08-05 12:26:38 +02:00
dependencymatcher.md	add additional REL_OP (#10371)	2022-07-27 13:16:44 +02:00
dependencyparser.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
doc.md	Add Doc.from_json() (#10688)	2022-06-02 14:03:47 +02:00
docbin.md	Fix point typo on docbin docs (#9097)	2021-08-31 10:55:44 +02:00
edittreelemmatizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
entitylinker.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
entityrecognizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
entityruler.md	Add SpanRuler component (#9880)	2022-06-02 13:12:53 +02:00
example.md	Extend score_spans for overlapping & non-labeled spans (#7209)	2021-04-08 12:19:17 +02:00
index.md	Update v3 docs	2020-07-03 16:48:21 +02:00
kb.md	Tidy up docs	2021-06-28 12:08:15 +02:00
language.md	Remove NBSP's across tables in the docs (#10842)	2022-05-25 09:48:39 +02:00
legacy.md	Add ConsoleLogger.v2 (#11214)	2022-08-29 10:23:05 +02:00
lemmatizer.md	Minor fix in Lemmatizer docs	2022-07-01 14:28:03 +09:00
lexeme.md	fix 's typo's across code base (#8384)	2021-06-15 10:57:08 +02:00
lookups.md	Update docs, types and API consistency	2020-08-17 16:45:24 +02:00
matcher.md	fix docs (#11123)	2022-07-24 17:16:36 +09:00
morphologizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
morphology.md	Document Assigned Attributes of Pipeline Components (#9041)	2021-09-01 12:09:39 +02:00
phrasematcher.md	🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)	2021-10-14 15:21:40 +02:00
pipe.md	Document scorers in registry and components from #8766 (#8929)	2021-08-12 12:50:03 +02:00
pipeline-functions.md	add doc cleaner to menu (#10862)	2022-05-30 08:51:19 +02:00
scorer.md	Add micro PRF for morph scoring (#9546)	2021-10-29 10:29:29 +02:00
sentencerecognizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
sentencizer.md	Update overwrite and scorer in API docs (#9384)	2021-10-11 10:35:07 +02:00
span.md	Add SpanRuler component (#9880)	2022-06-02 13:12:53 +02:00
spancategorizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
spangroup.md	Override SpanGroups.setdefault to provide default SpanGroup (#10772)	2022-05-12 10:06:25 +02:00
spanruler.md	the 'new' indicator wants a 'number' (#10997)	2022-06-21 22:01:16 +02:00
stringstore.md	Fix misspelt keyword in StringStore example	2022-05-29 10:49:19 +01:00
tagger.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
textcategorizer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
tok2vec.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
token.md	`token.md`: Fix documentation of `Token.ancestors` (#10917)	2022-06-06 14:32:36 +09:00
tokenizer.md	Add tokenizer option to allow Matcher handling for all rules (#10452)	2022-03-24 13:21:32 +01:00
top-level.md	Add Latin language support (#11349)	2022-08-30 14:04:54 +02:00
transformer.md	Update docs for pipeline initialize() methods (#11221)	2022-08-03 16:53:02 +02:00
vectors.md	Docs for v3.3 (#10628)	2022-04-28 14:09:35 +02:00
vocab.md	Add vector deduplication (#10551)	2022-03-30 08:54:23 +02:00