spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-26 18:06:29 +03:00

Author	SHA1	Message	Date
broaddeep	ee159b8543	Support match alignments (#7321) * Support match alignments * change naming from match_alignments to with_alignments, add conditional flow if with_alignments is given, validate with_alignments, add related test case * remove added errors, utilize bint type, cleanup whitespace * fix no new line in end of file * Minor formatting * Skip alignments processing if as_spans is set * Add with_alignments to Matcher API docs * Update website/docs/api/matcher.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-04-08 18:10:14 +10:00
Sam Edwardes	f6ad4684bd	Updates to universe.json for spaCyTextBlob (#7647) * Updates to universe.json for spaCyTextBlob Updated the documentation for spaCy 3.0. * SamEdwardes.md * Update SamEdwardes.md	2021-04-04 20:17:57 +02:00
Ayush Chaurasia	3c2ce41dd8	W&B integration: Optional support for dataset and model checkpoint logging and versioning (#7429) * Add optional artifacts logging * Update docs * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Bump WandbLogger Version * Add documentation of v1 to legacy docs * bump spacy-legacy to 3.0.2 (to be released) Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-04-01 19:36:23 +02:00
bsweileh	61472e7cb3	Update _training.md - Fix broken link on backpropagation (#7431) * Update _training.md Fix broken link on backpropagation * Add agreement add spacy contributor agreement	2021-03-15 09:21:35 +01:00
Ines Montani	37fc495f5d	Merge pull request #7353 from jankrepl/fix_entity_rules_labels	2021-03-09 15:09:24 +01:00
Ines Montani	4f32e3dedb	Update issue templates [ci skip]	2021-03-10 01:08:05 +11:00
Jan Krepl	0e1d579f0c	Add agreement	2021-03-09 10:57:32 +01:00
Boian Tzonev	cca8651fc8	Bulgarian tokenizer exceptions (#7114) * [Bulgarian] Add tokenizer exceptions and like_num for Bulgarian * [Bulgarian] Add tokenizer exceptions and like_num for Bulgarian	2021-02-19 19:19:19 +01:00
Peter Baumann	61b04a70d5	Run PhraseMatcher on Spans (#6918) * Add regression test * Run PhraseMatcher on Spans * Add test for PhraseMatcher on Spans and Docs * Add SCA * Add test with 3 matches in Doc, 1 match in Span * Update docs * Use doc.length for find_matches in tokenizer Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-10 23:43:32 +11:00
René Octavio Queiroz Dias	999ff03b19	fix: Fix textcat labels to expect a Optional[Iterable[str]] instead of Optional[Dict] (#6911) * docs: Add agreement * bug: Regression test Issue #6908 * fix: Changed from Dict to Iterable[str] Fix #6908 * Update test to use make_tempdir * fix: Fix WindowsPath error Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-04 23:37:13 +01:00
Helio Machado	20a97cda38	Create 0x2b3bfa0.md (#6916)	2021-02-04 23:25:11 +01:00
Ines Montani	30765674d0	Merge branch 'master' into develop	2021-01-30 12:20:28 +11:00
Pamphile ROY	e496b8623f	SCA tupui	2021-01-29 15:46:53 +01:00
Ines Montani	230e651ad6	Merge branch 'develop' into master-tmp	2021-01-27 13:26:29 +11:00
Ines Montani	d5ef245bb1	Merge pull request #6822 from jganseman/master [ci skip]	2021-01-27 13:04:30 +11:00
jganseman	c9103d60fa	Create jganseman.md	2021-01-26 11:02:31 +01:00
Dhruv Naik	e7db07a0b9	Fix Span.char_span bug (#6816) * Create dhruvrnaik.md * add test for issue #6815 * bugfix for issue #6815 * update dhruvrnaik.md * add span.vector test for #6815	2021-01-26 15:50:37 +08:00
muratjumashev	79327197d1	Add contributor agreement	2021-01-25 00:34:12 +06:00
KeshavG-lb	0a86d833d7	Spacy Cli info method causing backward compatibility issues (#6793) * Spacy Cli info method causing backward compatibility issues #6791 fix backward compatibility by setting default value to exclude in info method. * setting empty list as default argument is dangerous. so setting default to None and then setting it to emptylist, if None. Reference : https://nikos7am.com/posts/mutable-default-arguments/	2021-01-23 11:21:43 +01:00
Luigi Coniglio	e83c818a78	DependencyMatcher improvements (fix #6678) (#6744) * Adding contributor agreement for user werew * [DependencyMatcher] Comment and clean code * [DependencyMatcher] Use defaultdicts * [DependencyMatcher] Simplify _retrieve_tree method * [DependencyMatcher] Remove prepended underscores * [DependencyMatcher] Address TODO and move grouping of token's positions out of the loop * [DependencyMatcher] Remove _nodes attribute * [DependencyMatcher] Use enumerate in _retrieve_tree method * [DependencyMatcher] Clean unused vars and use camel_case naming * [DependencyMatcher] Memoize node+operator map * Add root property to Token * [DependencyMatcher] Groups matches by root * [DependencyMatcher] Remove unused _keys_to_token attribute * [DependencyMatcher] Use a list to map tokens to matcher's keys * [DependencyMatcher] Remove recursion * [DependencyMatcher] Use a generator to retrieve matches * [DependencyMatcher] Remove unused memory pool * [DependencyMatcher] Hide private methods and attributes * [DependencyMatcher] Improvements to the matches validation * Apply suggestions from code review Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> * [DependencyMatcher] Fix keys_to_position_maps * Remove Token.root property * [DependencyMatcher] Remove functools' lru_cache Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>	2021-01-22 11:20:08 +11:00
Adriane Boyd	0c936004d1	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3	2021-01-14 11:49:58 +01:00
Antonio Miras	b4bd8f347a	spaCy Universe: New project; SpacyDotNet (#6702) * Universe: SpacyDotNet a .NET Core spaCy wrapper * Signed contributor agreement Co-authored-by: Antonio Miras <antonio@amiras.net>	2021-01-13 12:47:30 +11:00
Alex Combessie	9cc880014c	Remove questionable French stopwords (#6310) * Remove questionable French stopwords * Create alexcombessie.md	2021-01-08 11:36:22 +11:00
Cristiana S Parada	7a0222f260	Update stop_words.py in Portuguese (a,o,e) (#6345) * Update stop_words.py Added three additional stopwords: "a" and "o" that means "the", and "e" that means "and" * Create cristianasp.md * zero edit to push CI Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-01-08 11:35:38 +11:00
Lorena Ciutacu	f11002f1f1	add new Romanian stopwords (#6621) * add contributor agreement * update ro stopwords list * add new stopwords	2021-01-08 11:34:47 +11:00
ophelielacroix	e3222fdec9	Add (noun chunks) syntax iterators for Danish (#6246) * add syntax iterators for danish * add test noun chunks for danish syntax iterators * add contributor agreement * update da syntax iterators to remove nested chunks * add tests for da noun chunks * Fix test * add missing import * fix example * Prevent overlapping noun chunks Prevent overlapping noun chunks by tracking the end index of the previous noun chunk span. Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-01-07 16:33:00 +11:00
Bruno	1a77607036	spaCy v3 is not saving the best version in training loop (#6629) * Save best only if is the best and also respect the average config * Create bratao.md * Update loop.py * Remove average check * Keep before_to_disk	2021-01-06 12:51:30 +11:00
Yosi	cf52510631	Add Amharic አማርኛ Language support (#6583) * Add Amharic to space * clean up * Add some PRON_LEMMA * add Tigrinya support * remove text_noun_chunks * Tigrinya Support * added some more details for ti * fix unit test * add amharic char range * changes from review * amharic and tigrinya share same unicode block * get rid of _amharic/_tigrinya in char_classes Co-authored-by: Josiah Solomon <jsolomon@meteorcomm.com>	2020-12-22 16:50:34 +01:00
Ines Montani	d8aa113d16	Merge pull request #6566 from rafguns/cite-zenodo [ci skip]	2020-12-16 16:40:50 +11:00
Thomas Bird	f6e4378942	Add SCA for @thomasbird (#6576)	2020-12-15 20:59:47 +01:00
Raf Guns	ec876c9713	Merge branch 'master' of https://github.com/explosion/spaCy into cite-zenodo	2020-12-14 22:03:58 +01:00
Raf Guns	a90ca0e1fb	Add contributor agreement	2020-12-14 22:01:14 +01:00
Ines Montani	85ca8c2bdd	Merge branch 'master' into develop	2020-12-11 13:44:41 +11:00
Ines Montani	1d4b1dea25	Update contributing guide and issue template [ci skip]	2020-12-11 13:39:26 +11:00
Ines Montani	c9b67b02f8	Update issue templates	2020-12-11 10:05:47 +11:00
svlandeg	4afcd9567e	refer to GH discussions	2020-12-10 20:56:12 +01:00
Adriane Boyd	724831b066	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master * Update Macedonian for v3 * Update Turkish for v3	2020-11-25 11:49:34 +01:00
Jacob Bortell	992723dfac	Add jabortell to the contributors (#6422) * Add jabortell to the contributors * Update jabortell.md Added tick to applicable statement	2020-11-24 16:15:31 +01:00
Yusuke Mori	e3ac90b035	Avoid a SyntaxError in self-attentive-parser (#6428) * Avoid a SyntaxError in self-attentive-parser Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser * Create forest1988.md Fill in the spaCy contributor agreement	2020-11-22 21:59:37 +01:00
M. Revuelta Espinosa	51232ffb9e	Update universe.json (include PatternOmatic) (#6399) Request to include PatternOmatic in spaCy Universe Adds @revuel to contributors	2020-11-19 13:15:50 +01:00
Daniel Vasic	20d72de986	Added Multext-East V5 tagset for Croatian language (#6248) * Added Multext-East V5 tagset for Croatian language * Create danielvasic.md * Update danielvasic.md * Update danielvasic.md * Add tag map to CroatianDefaults Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2020-11-05 12:19:22 +01:00
Vu Ha	6d465ec52c	add oprd to the list of accepted deps for noun chunking (#6302) * add oprd to the list of accepted deps for noun chunking * add SCA	2020-11-05 09:17:35 +01:00
Ines Montani	1e4d7e059f	Revert "Test FUNDING.yml [ci skip]" This reverts commit `287be48ad0`.	2020-10-28 17:42:23 +01:00
Ines Montani	287be48ad0	Test FUNDING.yml [ci skip]	2020-10-28 17:36:25 +01:00
Robert Šípek	260c29794a	Fill contributor agreement by robertsipek (#6285) * Fill contributor agreement by robertsipek * Fill contributor agreement by robertsipek	2020-10-22 22:13:17 +02:00
Kunal Sharma	01aec7a313	Adding MindMeld to Universe JSON (#6275) * Adding Mindmeld to Universe JSON Mindmeld is a conversational AI platform for deep-domain voice interfaces and chatbots. https://www.mindmeld.com/ * Signing contribution agreement. Co-authored-by: kunshar2 <kunshar2@cisco.com>	2020-10-21 18:42:11 +02:00
walterhenry	ff82644746	User contributor agreement Here it is!	2020-10-19 16:25:09 +02:00
Jan Margeta	ed1c37189a	Add contributor agreement for jmargeta	2020-10-16 00:38:42 +02:00
Borijan Georgievski	2311192ba1	Include Macedonian language (#6230) * Include Macedonian language * Fix indentation at char_classes.py * Fix indentation at char_classes.py * Add Macedonian tests, update lex_attrs and char_classes * Import unicode literals for python 2	2020-10-15 15:55:01 +02:00
Ines Montani	178760855f	Merge branch 'develop' into master-tmp	2020-10-15 09:06:03 +02:00

467 Commits