spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-16 03:02:41 +03:00

Author	SHA1	Message	Date
Ines Montani	f4f46b617f	Preserve sourced components in fill-config (fixes #7055) (#7058)	2021-02-14 14:02:14 +11:00
Matthew Honnibal	0fb8d437c0	Fix sentence fragments bug (#7056, #7035) (#7057) * Add test for #7035 * Update test for issue 7056 * Fix test * Fix transitions method used in testing * Fix state eol detection when rebuffer * Clean up redundant fix	2021-02-14 13:38:13 +11:00
Ines Montani	660642902a	Increment version [ci skip]	2021-02-14 13:36:13 +11:00
Matthew Honnibal	b31471b5b8	Set version to v3.0.2	2021-02-13 23:50:00 +11:00
Ines Montani	9ba715ed16	Tidy up and auto-format	2021-02-13 12:55:56 +11:00
Ines Montani	34ee0fbd70	Merge pull request #7011 from Shumie82/master	2021-02-13 12:30:42 +11:00
Ines Montani	e583050547	Merge pull request #7039 from svlandeg/debug	2021-02-13 11:53:41 +11:00
Ines Montani	6c450decfc	Fix punctuation settings and add to initialize tests	2021-02-13 11:51:21 +11:00
Ines Montani	f4712a634e	Merge pull request #7046 from adrianeboyd/bugfix/vocab-pickle-noun-chunks-6891 Include noun chunks method when pickling Vocab	2021-02-13 11:43:03 +11:00
Adriane Boyd	0ee2ae86bf	Update trf quickstart recommendations Add/update trf recommendations for Bengali, Hindi, Sinhala, and Tamil based on #7044.	2021-02-12 15:55:17 +01:00
svlandeg	03b4ec7d7f	fix typo	2021-02-12 14:30:16 +01:00
Adriane Boyd	5e47a54d29	Include noun chunks method when pickling Vocab	2021-02-12 13:27:46 +01:00
svlandeg	aa3ad8825d	loop instead of any	2021-02-12 13:14:30 +01:00
svlandeg	278e9eaa14	remove ner	2021-02-11 21:08:04 +01:00
svlandeg	ebeedfc70b	regression test for 7029	2021-02-11 20:56:48 +01:00
svlandeg	a52d466bfc	any instead of all	2021-02-11 20:50:55 +01:00
Shumi	4e514f1ea8	Update stop_words.py I have deleted line 1 to 5 and the statement print(STOP_WORDS)	2021-02-11 21:30:34 +02:00
Shumi	0d57e84b7b	Update lex_attrs.py I have removed line 1 to 4	2021-02-11 21:28:23 +02:00
Shumi	37ec67f868	Update examples.py I have removed two lines: # coding: utf8 from __future__ import unicode_literals And updated: >>> from spacy.lang.tn.examples import sentences	2021-02-11 21:25:58 +02:00
Shumi	39eeba6760	Update __init__.py Added infixes = TOKENIZER_INFIXES	2021-02-11 21:20:46 +02:00
Ines Montani	26bf642afd	Fix issue #7019: Handle None scores in evaluate printer (#7026)	2021-02-11 16:45:23 +11:00
Ines Montani	6b9026a219	Merge pull request #7000 from explosion/feature/project-yml-overrides Support env vars and CLI overrides for project.yml	2021-02-11 12:31:45 +11:00
Ines Montani	ad9ce3c8f6	Fix issue #6950: allow pickling Tok2Vec with listeners	2021-02-11 11:37:39 +11:00
Shumi	ed3397727e	Delete tag_map.py Tag map file is deleted. I will add it later because it was failing validations	2021-02-10 20:41:18 +02:00
Shumi	7c8721b1bd	Update tag_map.py Updated tag_map	2021-02-10 20:21:22 +02:00
Shumi	f6be28cfb2	Added files to Setswana Language Add South African Setswana Language	2021-02-10 20:15:13 +02:00
Shumi	24046fef17	South African Setswana language Please accept the additional of Setswana language	2021-02-10 20:12:33 +02:00
Peter Baumann	61b04a70d5	Run PhraseMatcher on Spans (#6918) * Add regression test * Run PhraseMatcher on Spans * Add test for PhraseMatcher on Spans and Docs * Add SCA * Add test with 3 matches in Doc, 1 match in Span * Update docs * Use doc.length for find_matches in tokenizer Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-10 23:43:32 +11:00
Ines Montani	21176c69b0	Update and add test	2021-02-10 14:12:00 +11:00
Ines Montani	c08b3f294c	Support env vars and CLI overrides for project.yml	2021-02-10 13:45:27 +11:00
Koichi Yasuoka	8ed788660b	Several callable objects do not have __qualname__	2021-02-09 14:43:02 +09:00
Adriane Boyd	6108dabdc8	Rephrase error related to sample data initialization Now that the initialize step is fully implemented, the source of E923 is typically missing or improperly converted/formatted data rather than a bug in spaCy, so rephrase the error and message and remove the prompt to open an issue.	2021-02-08 09:21:36 +01:00
Sofie Van Landeghem	6ed423c16c	reduce memory load when reading all vectors from file (#6945) * reduce memory load when reading all vectors from file * one more small typo fix	2021-02-07 08:05:43 +08:00
Sofie Van Landeghem	a323ef90df	ensure the loss value is cast as float (#6928)	2021-02-07 07:51:56 +08:00
melonwater211	a7977b5143	The test `spacy/tests/vocab_vectors/test_lexeme.py::test_vocab_lexeme_add_flag_auto_id` seems to fail occasionally when the test suite is run in a random order. (#6956) ```python def test_vocab_lexeme_add_flag_auto_id(en_vocab): is_len4 = en_vocab.add_flag(lambda string: len(string) == 4) assert en_vocab["1999"].check_flag(is_len4) is True assert en_vocab["1999"].check_flag(IS_DIGIT) is True assert en_vocab["199"].check_flag(is_len4) is False > assert en_vocab["199"].check_flag(IS_DIGIT) is True E assert False is True E + where False = <built-in method check_flag of spacy.lexeme.Lexeme object at 0x7fa155c36840>(3) E + where <built-in method check_flag of spacy.lexeme.Lexeme object at 0x7fa155c36840> = <spacy.lexeme.Lexeme object at 0x7fa155c36840>.check_flag spacy/tests/vocab_vectors/test_lexeme.py:49: AssertionError ``` > `pytest==6.1.1` > > `numpy==1.19.2` > > `Python version: 3.8.3` To reproduce the error, run `pytest --random-order-bucket=global --random-order-seed=170158 -v spacy/tests` If `test_vocab_lexeme_add_flag_auto_id` is run after `test_vocab_lexeme_add_flag_provided_id`, it fails. It seems like `test_vocab_lexeme_add_flag_provided_id` uses the `IS_DIGIT` bit for testing purposes but does not reset the bit. This solution seems to work but, if anyone has a better fix, please let me know and I will integrate it.	2021-02-07 07:51:34 +08:00
René Octavio Queiroz Dias	59271e887a	fix: TransformerListener with TextCatEnsemble (#6951) * bug: Regression test Issue #6946 * fix: Fix issue #6946 * chore: Remove regression test	2021-02-06 13:44:51 +01:00
René Octavio Queiroz Dias	999ff03b19	fix: Fix textcat labels to expect a Optional[Iterable[str]] instead of Optional[Dict] (#6911) * docs: Add agreement * bug: Regression test Issue #6908 * fix: Changed from Dict to Iterable[str] Fix #6908 * Update test to use make_tempdir * fix: Fix WindowsPath error Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-02-04 23:37:13 +01:00
Adriane Boyd	b903de3fcb	Pass on vocab arg in spacy.blank() (#6924)	2021-02-04 15:09:01 +01:00
svlandeg	f852af2acf	add capture arg	2021-02-02 19:47:12 +01:00
Matthew Honnibal	b6a198481b	Set version to v3.0.0	2021-02-02 20:26:17 +11:00
Sofie Van Landeghem	f319d2765f	Add capture argument to project_run (#6878) * add capture argument to project_run and run_commands * git bump to 3.0.1 * Set version to 3.0.1.dev0 Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>	2021-02-02 10:11:15 +08:00
Sofie Van Landeghem	f638306598	remove link_components flag again (#6883)	2021-02-02 10:08:40 +08:00
Ines Montani	a59f3fcf5d	Make wheel the default format and update docs [ci skip]	2021-02-01 23:18:43 +11:00
Ines Montani	b9573e9e22	Fix pip args	2021-02-01 23:15:00 +11:00
Ines Montani	b46073234a	Fix default clone branch and error handling [ci skip]	2021-02-01 22:29:04 +11:00
Sofie Van Landeghem	acabb284dd	Fix linking resumed components (#6859) * link components across enabled, resumed and frozen * revert renaming * revert renaming, the sequel	2021-02-01 22:19:58 +11:00
Adriane Boyd	35a863cd27	Remove nlp.tokenizer from quickstart template Remove `nlp.tokenizer` from quickstart template so that the default language-specific tokenizer settings are filled instead.	2021-02-01 11:20:12 +01:00
svlandeg	91e72c031e	reformatting	2021-01-30 17:29:33 +01:00
svlandeg	a8d84188f0	add stop words Co-authored-by: tewodrosm <tedmaam2006@gmail.com>	2021-01-30 17:26:49 +01:00
Ines Montani	f058cbd751	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2021-01-30 21:03:25 +11:00

8520 Commits