spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-14 13:47:13 +03:00

Author	SHA1	Message	Date
Ines Montani	dd34a3a433	Try simpler approach [ci skip]	2021-07-02 17:40:49 +10:00
Ines Montani	2898331494	Improve logic [ci skip]	2021-07-02 17:37:35 +10:00
Ines Montani	519a9e29be	Fix git login [ci skip]	2021-07-02 17:30:59 +10:00
Ines Montani	8961f36415	Commit manually in workflow [ci skip]	2021-07-02 17:27:48 +10:00
Ines Montani	2a5cbf1b0c	Test different workflow trigger [ci skip]	2021-07-02 17:22:43 +10:00
Ines Montani	bbbaae0b5e	Update triggers [ci skip]	2021-07-02 17:10:24 +10:00
Ines Montani	cdefb8cf1b	Experimental: add autoblack.yml action [ci skip]	2021-07-02 17:07:05 +10:00
julien-talkair	6b1f9a5be0	add spacy contributor agreement	2021-07-01 17:41:12 +02:00
Ines Montani	88ad41316c	Update issue template [ci skip]	2021-06-28 03:11:37 +02:00
Ines Montani	db6361ab6e	Update issue template [ci skip]	2021-06-28 03:10:52 +02:00
Ines Montani	2e453bda92	Update issue links [ci skip]	2021-06-28 03:09:48 +02:00
Paul O'Leary McCann	0d3caa52a6	Update New Issue choices This uses some new features related to Issue Templates to help direct more people to Discussions. 1. Change the Discussions option to link to Discussions 2. Add a link to the FAQ 3. Disable blank issues	2021-06-27 14:41:33 +09:00
Adrian Zuber	f5aee0bbdf	Raise custom error in EntityLinker when KB is not set (#8442 ) * Raise custom error in EntityLinker when KB is not set * add contributor agreement * Update E1018 error message	2021-06-25 23:04:00 +02:00
Adriane Boyd	172dfec4f2	Test download in CI with ca_core_news_sm (#8493 )	2021-06-24 09:26:30 +02:00
Giovanni Toffoli	19521d525b	Added Italian POS-aware lemmatizer. (#8079 ) * Added Italian POS-aware lemmatizer. Also added the code used to build the lookup tables by POS. * Create gtoffoli.md * Add imports and format * Remove helper script * Use lemma_lookup instead of lemma_lookup_legacy Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-06-16 11:14:45 +02:00
Adriane Boyd	33240ed2c5	Temporarily skip model download test	2021-06-16 10:14:42 +02:00
Adriane Boyd	d52ab13b5f	Update CI: update ubuntu image, add download test (#8298 ) * Update CI: update ubuntu image, add download test * Switch instances to `ubuntu-18.04` * Add model download test, currently only for one job with python 3.8 * Fix variable name * Set variables explicitly	2021-06-07 14:46:07 +02:00
Vito De Tullio	3672464e25	applying suggestion to avoid mypy errors (#8265 ) * applying suggestion to avoid mypy errors * sign contributor agreement	2021-06-02 19:25:30 +10:00
Kristian Boda	dc8d8d15d2	Add hmrb to spaCy Universe (#8129 ) * docs: add hmrb to spacy universe * docs: add sentence on spacy versions * docs: update description and images * misc: add spaCy Contributor Agreement	2021-05-31 18:40:48 +10:00
Narayan Acharya	6b79714080	Address missing config overrides post load of models (#8208 )	2021-05-31 18:36:52 +10:00
Julien Salinas	a176d2209a	Sign contributors agreement.	2021-05-14 11:00:27 +02:00
Sevdimali	49aed683cc	Azerbaijani language added (#7911 )	2021-04-28 14:42:02 +02:00
Adriane Boyd	f4080983ea	Extend to cupy 9.0.0 (#7914 )	2021-04-28 10:18:24 +02:00
Janis Klaise	1690595e4d	Update load_lookups return type and docstring (#7907 ) * Update load_lookups return type and docstring * Add contributor agreement	2021-04-27 09:13:39 +02:00
Adriane Boyd	36ecba224e	Set up GPU CI testing (#7293 ) * Set up CI for tests with GPU agent * Update tests for enabled GPU * Fix steps filename * Add parallel build jobs as a setting * Fix test requirements * Fix install test requirements condition * Fix pipeline models test * Reset current ops in prefer/require testing * Fix more tests * Remove separate test_models test * Fix regression 5551 * fix StaticVectors for GPU use * fix vocab tests * Fix regression test 5082 * Move azure steps to .github and reenable default pool jobs * Consolidate/rename azure steps Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-04-22 14:58:29 +02:00
meghanabhange	49ff1126bf	Project Idea : denomme | Multilingual Name Detection (#7845 ) * Add denomme * spaCy contributor agreement Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-04-22 08:48:17 +02:00
Pierre Lison	2f0ef2c9cc	adding skweak to the SpaCy universe	2021-04-22 01:16:34 +02:00
Shantam Raj	6017fcf693	Default code for Setting Entity annotations on the website errors (#7738 ) * the default example for "Setting entity annotations" errors on Binder * updating contributer info * using a new variable to store original entities	2021-04-21 09:16:32 +02:00
broaddeep	ee159b8543	Support match alignments (#7321 ) * Support match alignments * change naming from match_alignments to with_alignments, add conditional flow if with_alignments is given, validate with_alignments, add related test case * remove added errors, utilize bint type, cleanup whitespace * fix no new line in end of file * Minor formatting * Skip alignments processing if as_spans is set * Add with_alignments to Matcher API docs * Update website/docs/api/matcher.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-04-08 18:10:14 +10:00
Sam Edwardes	f6ad4684bd	Updates to universe.json for spaCyTextBlob (#7647 ) * Updates to universe.json for spaCyTextBlob Updated the documentation for spaCy 3.0. * SamEdwardes.md * Update SamEdwardes.md	2021-04-04 20:17:57 +02:00
Ayush Chaurasia	3c2ce41dd8	W&B integration: Optional support for dataset and model checkpoint logging and versioning (#7429 ) * Add optional artifacts logging * Update docs * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Bump WandbLogger Version * Add documentation of v1 to legacy docs * bump spacy-legacy to 3.0.2 (to be released) Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-04-01 19:36:23 +02:00
bsweileh	61472e7cb3	Update _training.md - Fix broken link on backpropagation (#7431 ) * Update _training.md Fix broken link on backpropagation * Add agreement add spacy contributor agreement	2021-03-15 09:21:35 +01:00
Ines Montani	37fc495f5d	Merge pull request #7353 from jankrepl/fix_entity_rules_labels	2021-03-09 15:09:24 +01:00
Ines Montani	4f32e3dedb	Update issue templates [ci skip]	2021-03-10 01:08:05 +11:00
Jan Krepl	0e1d579f0c	Add agreement	2021-03-09 10:57:32 +01:00
Boian Tzonev	cca8651fc8	Bulgarian tokenizer exceptions (#7114 ) * [Bulgarian] Add tokenizer exceptions and like_num for Bulgarian * [Bulgarian] Add tokenizer exceptions and like_num for Bulgarian	2021-02-19 19:19:19 +01:00
Peter Baumann	61b04a70d5	Run PhraseMatcher on Spans (#6918 ) * Add regression test * Run PhraseMatcher on Spans * Add test for PhraseMatcher on Spans and Docs * Add SCA * Add test with 3 matches in Doc, 1 match in Span * Update docs * Use doc.length for find_matches in tokenizer Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-10 23:43:32 +11:00
René Octavio Queiroz Dias	999ff03b19	fix: Fix textcat labels to expect a Optional[Iterable[str]] instead of Optional[Dict] (#6911 ) * docs: Add agreement * bug: Regression test Issue #6908 * fix: Changed from Dict to Iterable[str] Fix #6908 * Update test to use make_tempdir * fix: Fix WindowsPath error Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-04 23:37:13 +01:00
Helio Machado	20a97cda38	Create 0x2b3bfa0.md (#6916 )	2021-02-04 23:25:11 +01:00
Ines Montani	30765674d0	Merge branch 'master' into develop	2021-01-30 12:20:28 +11:00
Pamphile ROY	e496b8623f	SCA tupui	2021-01-29 15:46:53 +01:00
Ines Montani	230e651ad6	Merge branch 'develop' into master-tmp	2021-01-27 13:26:29 +11:00
Ines Montani	d5ef245bb1	Merge pull request #6822 from jganseman/master [ci skip]	2021-01-27 13:04:30 +11:00
jganseman	c9103d60fa	Create jganseman.md	2021-01-26 11:02:31 +01:00
Dhruv Naik	e7db07a0b9	Fix Span.char_span bug (#6816 ) * Create dhruvrnaik.md * add test for issue #6815 * bugfix for issue #6815 * update dhruvrnaik.md * add span.vector test for #6815	2021-01-26 15:50:37 +08:00
muratjumashev	79327197d1	Add contributor agreement	2021-01-25 00:34:12 +06:00
KeshavG-lb	0a86d833d7	Spacy Cli info method causing backward compatibility issues (#6793 ) * Spacy Cli info method causing backward compatibility issues #6791 fix backward compatibility by setting default value to exclude in info method. * setting empty list as default argument is dangerous. so setting default to None and then setting it to emptylist, if None. Reference : https://nikos7am.com/posts/mutable-default-arguments/	2021-01-23 11:21:43 +01:00
Luigi Coniglio	e83c818a78	DependencyMatcher improvements (fix #6678 ) (#6744 ) * Adding contributor agreement for user werew * [DependencyMatcher] Comment and clean code * [DependencyMatcher] Use defaultdicts * [DependencyMatcher] Simplify _retrieve_tree method * [DependencyMatcher] Remove prepended underscores * [DependencyMatcher] Address TODO and move grouping of token's positions out of the loop * [DependencyMatcher] Remove _nodes attribute * [DependencyMatcher] Use enumerate in _retrieve_tree method * [DependencyMatcher] Clean unused vars and use camel_case naming * [DependencyMatcher] Memoize node+operator map * Add root property to Token * [DependencyMatcher] Groups matches by root * [DependencyMatcher] Remove unused _keys_to_token attribute * [DependencyMatcher] Use a list to map tokens to matcher's keys * [DependencyMatcher] Remove recursion * [DependencyMatcher] Use a generator to retrieve matches * [DependencyMatcher] Remove unused memory pool * [DependencyMatcher] Hide private methods and attributes * [DependencyMatcher] Improvements to the matches validation * Apply suggestions from code review Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> * [DependencyMatcher] Fix keys_to_position_maps * Remove Token.root property * [DependencyMatcher] Remove functools' lru_cache Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>	2021-01-22 11:20:08 +11:00
Adriane Boyd	0c936004d1	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3	2021-01-14 11:49:58 +01:00
Antonio Miras	b4bd8f347a	spaCy Universe: New project; SpacyDotNet (#6702 ) * Universe: SpacyDotNet a .NET Core spaCy wrapper * Signed contributor agreement Co-authored-by: Antonio Miras <antonio@amiras.net>	2021-01-13 12:47:30 +11:00
Alex Combessie	9cc880014c	Remove questionable French stopwords (#6310 ) * Remove questionable French stopwords * Create alexcombessie.md	2021-01-08 11:36:22 +11:00
Cristiana S Parada	7a0222f260	Update stop_words.py in Portuguese (a,o,e) (#6345 ) * Update stop_words.py Added three aditional stopwords: "a" and "o" that means "the", and "e" that means "and" * Create cristianasp.md * zero edit to push CI Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-01-08 11:35:38 +11:00
Lorena Ciutacu	f11002f1f1	add new Romanian stopwords (#6621 ) * add contributor agreement * update ro stopwords list * add new stopwords	2021-01-08 11:34:47 +11:00
ophelielacroix	e3222fdec9	Add (noun chunks) syntax iterators for Danish (#6246 ) * add syntax iterators for danish * add test noun chunks for danish syntax iterators * add contributor agreement * update da syntax iterators to remove nested chunks * add tests for da noun chunks * Fix test * add missing import * fix example * Prevent overlapping noun chunks Prevent overlapping noun chunks by tracking the end index of the previous noun chunk span. Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-01-07 16:33:00 +11:00
Bruno	1a77607036	spaCy v3 is not saving the best version in training loop (#6629 ) * Save best only if is the best and also respect the average config * Create bratao.md * Update loop.py * Remove average check * Keep before_to_disk	2021-01-06 12:51:30 +11:00
Yosi	cf52510631	Add Amharic አማርኛ Language support (#6583 ) * Add Amharic to space * clean up * Add some PRON_LEMMA * add Tigrinya support * remove text_noun_chunks * Tigrinya Support * added some more details for ti * fix unit test * add amharic char range * changes from review * amharic and tigrinya share same unicode block * get rid of _amharic/_tigrinya in char_classes Co-authored-by: Josiah Solomon <jsolomon@meteorcomm.com>	2020-12-22 16:50:34 +01:00
Ines Montani	d8aa113d16	Merge pull request #6566 from rafguns/cite-zenodo [ci skip]	2020-12-16 16:40:50 +11:00
Thomas Bird	f6e4378942	Add SCA for @thomasbird (#6576 )	2020-12-15 20:59:47 +01:00
Raf Guns	ec876c9713	Merge branch 'master' of https://github.com/explosion/spaCy into cite-zenodo	2020-12-14 22:03:58 +01:00
Raf Guns	a90ca0e1fb	Add contributor agreement	2020-12-14 22:01:14 +01:00
Ines Montani	85ca8c2bdd	Merge branch 'master' into develop	2020-12-11 13:44:41 +11:00
Ines Montani	1d4b1dea25	Update contributing guide and issue template [ci skip]	2020-12-11 13:39:26 +11:00
Ines Montani	c9b67b02f8	Update issue templates	2020-12-11 10:05:47 +11:00
svlandeg	4afcd9567e	refer to GH discussions	2020-12-10 20:56:12 +01:00
Adriane Boyd	724831b066	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master * Update Macedonian for v3 * Update Turkish for v3	2020-11-25 11:49:34 +01:00
Jacob Bortell	992723dfac	Add jabortell to the contributors (#6422 ) * Add jabortell to the contributors * Update jabortell.md Added tick to applicable statement	2020-11-24 16:15:31 +01:00
Yusuke Mori	e3ac90b035	Avoid a SyntaxError in self-attentive-parser (#6428 ) * Avoid a SyntaxError in self-attentive-parser Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser * Create forest1988.md Fill in the spaCy contributor agreement	2020-11-22 21:59:37 +01:00
M. Revuelta Espinosa	51232ffb9e	Update universe.json (include PatternOmatic) (#6399 ) Request to include PatternOmatic in spaCy Universe Adds @revuel to contributors	2020-11-19 13:15:50 +01:00
Daniel Vasic	20d72de986	Added Multext-East V5 tagset for Croatian language (#6248 ) * Added Multext-East V5 tagset for Croatian language * Create danielvasic.md * Update danielvasic.md * Update danielvasic.md * Add tag map to CroatianDefaults Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2020-11-05 12:19:22 +01:00
Vu Ha	6d465ec52c	add oprd to the list of accepted deps for noun chunking (#6302 ) * add oprd to the list of accepted deps for noun chunking * add SCA	2020-11-05 09:17:35 +01:00
Ines Montani	1e4d7e059f	Revert "Test FUNDING.yml [ci skip]" This reverts commit `287be48ad0`.	2020-10-28 17:42:23 +01:00
Ines Montani	287be48ad0	Test FUNDING.yml [ci skip]	2020-10-28 17:36:25 +01:00
Robert Šípek	260c29794a	Fill contributor agreement by robertsipek (#6285 ) * Fill contributor agreement by robertsipek * Fill contributor agreement by robertsipek	2020-10-22 22:13:17 +02:00
Kunal Sharma	01aec7a313	Adding MindMeld to Universe JSON (#6275 ) * Adding Mindmeld to Universe JSON Mindmeld is a conversational AI platform for deep-domain voice interfaces and chatbots. https://www.mindmeld.com/ * Signing contribution agreement. Co-authored-by: kunshar2 <kunshar2@cisco.com>	2020-10-21 18:42:11 +02:00
walterhenry	ff82644746	User contributor agreement Here it is!	2020-10-19 16:25:09 +02:00
Jan Margeta	ed1c37189a	Add contributor agreement for jmargeta	2020-10-16 00:38:42 +02:00
Borijan Georgievski	2311192ba1	Include Macedonian language (#6230 ) * Include Macedonian language * Fix indentation at char_classes.py * Fix indentation at char_classes.py * Add Macedonian tests, update lex_attrs and char_classes * Import unicode literals for python 2	2020-10-15 15:55:01 +02:00
Ines Montani	178760855f	Merge branch 'develop' into master-tmp	2020-10-15 09:06:03 +02:00
Florijan Stamenković	18f5c309dc	Fix Issue 6207 (#6208 ) * Regression test for issue 6207 * Fix issue 6207 * Sign contributor agreement * Minor adjustments to test Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2020-10-09 10:14:40 +02:00
Šarūnas Navickas	287ba94a2f	Website (Universe): An entry for rita-dsl (#6138 ) * Create zaibacu.md * Add RITA-DSL entry * Update agreement * Fix formatting	2020-10-09 10:14:40 +02:00
delzac	668507be1b	Reflect on usage doc that IS_SENT_START attribute exist (#6114 ) * Reflect on usage doc that IS_SENT_START attribute exist * Create delzac.md	2020-10-09 10:14:40 +02:00
Rahul Gupta	1a00bff06d	Hindi: Adds tests for lexical attributes (norm and like_num) (#5829 ) * Hindi: Adds tests for lexical attributes (norm and like_num) * Signs and sdds the contributor agreement * Add ordinal numbers to be tagged as like_num * Adds alternate pronunciation for 31 and 39	2020-10-07 10:23:32 +02:00
Nuccy90	c809b2c8e7	Update morph_rules.py (#6102 ) * Update morph_rules.py Added "dig" and "dej" ("you" in accusative form) * Create Nuccy90.md * Update Nuccy90.md	2020-10-06 15:14:47 +02:00
delzac	15ea401b39	Reflect on usage doc that IS_SENT_START attribute exist (#6114 ) * Reflect on usage doc that IS_SENT_START attribute exist * Create delzac.md	2020-10-06 15:11:01 +02:00
Šarūnas Navickas	047fb9f8b8	Website (Universe): An entry for rita-dsl (#6138 ) * Create zaibacu.md * Add RITA-DSL entry * Update agreement * Fix formatting	2020-10-06 11:19:36 +02:00
Florijan Stamenković	9db670b996	Fix Issue 6207 (#6208 ) * Regression test for issue 6207 * Fix issue 6207 * Sign contributor agreement * Minor adjustments to test Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2020-10-06 11:17:37 +02:00
Ines Montani	59deeb7da6	Merge branch 'develop' into master-tmp	2020-10-04 14:52:20 +02:00
Stanislav Schmidt	3589a64d44	Change type of texts argument in pipe to iterable (#6186 ) * Change type of texts argument in pipe to iterable * Add contributor agreement	2020-10-02 21:00:11 +02:00
Muhammad Fahmi Rasyid	7489d02dea	Update Indonesian Example Phrases (#6124 ) * create contributor agreement * Update Indonesian example. (see #1107) Update Indonesian examples with more proper phrases. the current phrases contains sensitive and violent words.	2020-09-23 14:02:26 +02:00
Ines Montani	864a697e63	Merge branch 'develop' into master-tmp	2020-09-04 13:15:36 +02:00
Juan Gutiérrez	9002bea29f	Update suffixes example (#5989 ) * Update suffixes example The current example will throw `TypeError: can only concatenate list (not "tuple") to list` * Signing Contributor Agreement	2020-08-31 12:44:56 +02:00
Shashank	450720aca2	Added support for Sanskrit language (#5956 ) * Added support for Sanskrit language * Added tests for lexical attribute like_num	2020-08-25 10:56:29 +02:00
idoshr	b10c7bc56e	Hebrew like num (#5952 ) * Update stop_words.py Hebrew STOP WORDS * Update stop_words.py * contributor * contributor * add some common domain extentions support human number 1K/1M.... * support human number 1K/1M.... * hebrew number tokenize 1K/1M implement in EN * test human tokenize fix * test * heb like num revert human number change * heb like num	2020-08-24 14:30:05 +02:00
Attila Szász	669dc70822	Create tilusnet.md (#5914 )	2020-08-12 22:46:08 +02:00
Adam Bittlingmayer	7b33b2854f	Add Armenian sentence-final verchaket, Greek question mark and Arabic question mark to default punct (#5910 ) * Add Armenian sentence-final verchaket * Add Greek and Arabic question marks, and contributor agreement * Check box	2020-08-12 15:36:14 +02:00
graue70	49e690bde1	Fix typos in comments (#5904 ) * Fix typo in comment * Fix typo * Add spaCy Contributor Agreement	2020-08-12 15:35:25 +02:00
holubvl3	d16c0f2c3a	Create holubvl3 (#5845 ) * Create holubvl3 * Rename holubvl3 to holubvl3.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2020-07-30 17:40:31 +02:00
Gustavo Zadrozny Leyendecker	90b958fd01	Fix on EntityRendered to support break lines (after last entity) (closes #5838 )	2020-07-29 18:48:39 +02:00
Li Zhe	a69eb445dc	fix the wrong hash url in adding-languages.md file (#5810 ) * fix the wrong hash url in adding-languages.md file change the #101 url hash path to #language-data * filled in the spaCy Contributor Agreement filled in the spaCy Contributor Agreement	2020-07-25 13:13:38 +02:00
Joshua Olson	6d4d5c074c	Mark Japanese documents as tagged. (#5803 ) Mark the document as tagged before returning it to the user from the JapaneseTokenizer. Fixes #5802	2020-07-23 08:57:01 +02:00
Ines Montani	644074b954	Merge branch 'develop' into master-tmp	2020-07-20 14:58:04 +02:00
Alec Chapman	a8978ca285	Add VA COVID-19 NLP project to spaCy Universe (#5777 ) * Update universe.json Add cov-bsv to "resources" * Update universe.json * add contributor agreement	2020-07-19 13:35:31 +02:00
gandersen101	9097549227	Adding spaczz package to universe.json (#5717 ) * Adding spaczz package to universe.json * Adding contributor agreement.	2020-07-07 20:55:24 +02:00
Jonathan Besomi	546f3d10d4	Add texthero to universe.json (#5716 ) * Add texthero to universe.json * Add spaCy contributor Agreement	2020-07-07 20:54:22 +02:00
Mike Izbicki	7a2ca00794	fix bug in Korean language, resulting in 100x speedup by reducing overhead of mecab (#5701 ) * speed up Korean nlp 100x by stopping mecab from reloading on each doc * add contributor agreement * rename variables to improve code readability	2020-07-06 17:03:33 +02:00
Sebastián Ramírez	b985cc4025	📄 Add spaCy Contributor Agreement	2020-07-01 20:57:21 +02:00
Ines Montani	414dc7ace1	Merge branch 'spacy.io' into spacy.io-develop	2020-07-01 11:47:47 +02:00
Matthias Hertel	305221f3e5	Website: fixed the token span in the text about the rule-based matching example (#5669 ) * fixed token span in pattern matcher example * contributor agreement	2020-06-30 19:58:55 +02:00
Matthias Hertel	8b0f749606	Website: fixed the token span in the text about the rule-based matching example (#5669 ) * fixed token span in pattern matcher example * contributor agreement	2020-06-30 19:58:23 +02:00
PluieElectrique	90c7eb0e2f	Reduce memory usage of Lookup's BloomFilter (#5606 ) * Reduce memory usage of Lookup's BloomFilter * Remove extra Table update	2020-06-26 14:09:10 +02:00
Richard Liaw	0ef78bad93	contribute (#5632 )	2020-06-23 08:53:58 +02:00
Rameshh	c34420794a	Add Nepali Language (#5622 ) * added support for nepali lang * added examples and test files * added spacy contributor agreement	2020-06-22 10:25:46 +02:00
Karen Hambardzumyan	ff6a084e9c	Create mahnerak.md (#5615 )	2020-06-20 11:14:26 +02:00
Marat M. Yavrumyan	ccd7edf04b	Create myavrum.md (#5612 )	2020-06-19 18:34:27 +02:00
Arvind Srinivasan	aa5b40fa64	Added Tamil Example Sentences (#5583 ) * Added Examples for Tamil Sentences #### Description This PR add example sentences for the Tamil language which were missing as per issue #1107 #### Type of Change This is an enhancement. * Accepting spaCy Contributor Agreement * Signed on my behalf as an individual	2020-06-13 15:56:26 +02:00
theudas	fa46e0bef2	Added Parameter to NEL to take n sentences into account (#5548 ) * added setting for neighbour sentence in NEL * added spaCy contributor agreement * added multi sentence also for training * made the try-except block smaller	2020-06-12 02:03:23 +02:00
Sofie Van Landeghem	18c6dc8093	removing label both on comment and on close	2020-06-11 14:09:40 +02:00
Jones Martins	28db7dd5d9	Add missing pronoums/determiners (#5569 ) * Add missing pronoums/determiners * Add test for missing pronoums * Add contributor file	2020-06-10 18:47:04 +02:00
Sofie Van Landeghem	12c1965070	set delay to 7 days	2020-06-10 10:46:12 +02:00
Sofie Van Landeghem	86112d2168	update issue manager's version	2020-06-09 08:57:38 +02:00
Martino Mensio	de00f967ce	adding spacy-universal-sentence-encoder (#5534 ) * adding spacy-universal-sentence-encoder * update affiliation * updated code example	2020-06-08 20:26:30 +02:00
Sofie Van Landeghem	d1799da200	bot for answered issues (#5563 ) * add tiangolo's issue manager * fix formatting * spaces, tabs, who knows * formatting * I'll get this right at some point * maybe one more space ?	2020-06-08 19:47:32 +02:00
Hiroshi Matsuda	456bf47f51	fix a bug causing mis-alignments (#5560 )	2020-06-08 15:49:34 +02:00
Leo	7d5a89661e	contributor agreement signed (#5525 )	2020-05-31 20:13:39 +02:00
Rajat	8b8efa1b42	update spacy universe with my project (#5497 ) * added contextualSpellCheck in spacy universe meta * removed extra formatting by code * updated with permanent links * run json linter used by spacy * filled SCA * updated the description	2020-05-25 11:30:23 +02:00
Jannis	aa53ce6996	Documentation Typo Fix (#5492 ) * Fix typo Change 'realize' to 'realise' * Add contributer agreement	2020-05-22 19:50:26 +02:00
Matthew Honnibal	93c4d13588	Merge pull request #5264 from lfiedler/issue-5230 Fix ResourceWarnings during unittest	2020-05-22 00:31:07 +02:00
Kevin Lu	291b9ad7b9	Update CONTRIBUTOR_AGREEMENT.md	2020-05-19 20:29:53 -07:00
Kevin Lu	9a1a535215	Create kevinlu1248.md	2020-05-19 20:25:45 -07:00
Kevin Lu	a23b3a5a50	Update CONTRIBUTOR_AGREEMENT.md	2020-05-19 20:24:24 -07:00
Ines Montani	a41e28ceba	Merge pull request #5436 from ilivans/fix_errors_with_codes	2020-05-18 10:45:56 +02:00
Ilkyu Ju	72a25c9cef	Very minor issues in Korean example sentences (#5446 ) * Add contributor agreement * Improve ko translation of example sentences I fixed unnatural translations and word spacing errors. * Update osori.md	2020-05-17 13:43:34 +02:00
Ilia Ivanov	ee8fe37474	Add ilivans' contributor agreement	2020-05-14 15:59:06 +02:00
Vishnu Priya VR	9ce059dd06	Limiting noun_chunks for specific languages (#5396 ) * Limiting noun_chunks for specific langauges * Limiting noun_chunks for specific languages Contributor Agreement * Addressing review comments * Removed unused fixtures and imports * Add fa_tokenizer in test suite * Use fa_tokenizer in test * Undo extraneous reformatting Co-authored-by: adrianeboyd <adrianeboyd@gmail.com>	2020-05-14 12:58:06 +02:00
Travis Hoppe	d4cc18b746	Added author information for NLPre (#5414 ) * Add author links for NLPre and update category * Add contributor statement	2020-05-08 11:28:54 +02:00
Samuel Rodríguez Medina	8602daba85	Swedish like_num (#5371 ) * Sign contributor agreement. * Add like_num functionality to Swedish. * Update spacy/tests/lang/sv/test_lex_attrs.py Co-Authored-By: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update contributor agreement Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2020-04-29 21:25:22 +02:00
adrianeboyd	a6e521cd79	Add is_sent_end token property (#5375 ) Reconstruction of the original PR #4697 by @MiniLau. Removes unused `SENT_END` symbol and `IS_SENT_END` from `Matcher` schema because the Matcher is only going to be able to support `IS_SENT_START`.	2020-04-29 12:53:16 +02:00
Louis Guitton	a27c4014f5	Add mlflow to spaCy universe (#5352 ) * Add mlflow to universe * Use mlflow black logo	2020-04-29 10:18:03 +02:00
Michael	5b5528ff2e	Add `!=3.4.*` to python_requires (#5344 ) Missed in `80d554f2e2`	2020-04-27 22:02:09 +02:00
Punitvara	b2b7e1f37a	This PR adds Gujarati Language class along with (#5355 ) * This PR adds Gujarati Language class along with - stop words * Add test for gu tokenizer	2020-04-27 11:07:37 +02:00
sabiqueqb	fc91660aa2	Gh 5339 language class for malayalam (#5342 ) * Initialize Malayalam Language class * Add lex_attrs and examples for Malayalam * Add spaCy Contributor Agreement * Add test for ml tokenizer	2020-04-27 09:45:08 +02:00
Mike	481574cbc8	[minor doc change] embedding vis. link is broken in `website/docs/usage/examples.md` (#5325 ) * The embedding vis. link is broken The first link seems to be reasonable for now unless someone has an updated embedding vis they want to share? * contributor agreement * Update Mlawrence95.md * Update website/docs/usage/examples.md Co-Authored-By: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2020-04-21 20:35:12 +02:00
laszabine	fb73d4943a	Amend documentation to Language.evaluate (#5319 ) * Specified usage of arguments to Language.evaluate * Created contributor agreement	2020-04-16 20:00:18 +02:00
Jakob Jul Elben	663333c3b2	Fixes #5413 (#5315 ) * Fix 5314 * Add contributor * Resolve requested changes Co-authored-by: Jakob Jul Elben <jakob@datamaga.com>	2020-04-16 13:29:02 +02:00
Sébastien Harinck	dac70f29eb	contrib: add contributor agreement for user sebastienharinck (#5316 )	2020-04-16 11:32:09 +02:00
Paolo Arduin	1ca32d8f9c	Matcher support for Span as well as Doc (#5113 ) * Matcher support for Span, as well as Doc #5056 * Removes an import unused * Signed contributors agreement * Code optimization and better test * Add error message for bad Matcher call argument * Fix merging	2020-04-15 13:51:33 +02:00
Thomas Thiebaud	1eef60c658	Add spacy_fastlang to universe (#5271 ) * Add spacy_fastlang to universe * Sign SCA	2020-04-15 13:50:46 +02:00
Paolo Arduin	8ce408d2e1	Comparison predicate handling for `!=` (#5282 ) * Fix #5281 * Optim test	2020-04-14 19:14:15 +02:00
Marek Grzenkowicz	6a8a52650f	[Closes #5292 ] Fix typo in option name "--n-save_every" (#5293 ) * Sign contributor agreement for chopeen * Fix typo in option name and close #5292	2020-04-11 23:35:01 +02:00
Umar Butler	8952effcc4	Fixed Typo in Warning (#5284 ) * Fixed typo in cli warning Fixed a typo in the warning for the provision of exactly two labels, which have not been designated as binary, to textcat. * Create and signed contributor form	2020-04-09 15:46:15 +02:00

595 Commits