spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-12-11 12:14:30 +03:00

Author	SHA1	Message	Date
ICLR&D	87e40b17a0	Add entry for Blackstone in universe.json (#4101 ) * Add entry for Blackstone in universe.json Add an entry for the Blackstone project. Checked JSON is valid. * Create ICLRandD.md * Fix indentation (tabs to spaces) It looks like during validation, the JSON file automatically changed spaces to tabs. This caused the diff to show everything as changed, which is obviously not true. This hopefully fixes that. * Try to fix formatting for diff * Fix diff Co-authored-by: Ines Montani <ines@ines.io>	2019-08-09 17:16:51 +02:00
Sofie Van Landeghem	963ea5e8d0	Update lemma and vector information after splitting a token (#4097 ) * fixing vector and lemma attributes after retokenizer.split * fixing unit test with mockup tensor * xp instead of numpy	2019-08-08 15:09:44 +02:00
Ines Montani	a2ac2e873f	Update Binder version [ci skip]	2019-08-08 13:03:45 +02:00
Matthew Honnibal	04113a844d	Set version to v2.1.8	2019-08-07 13:53:58 +02:00
Ines Montani	36ac044937	Update README.md [ci skip]	2019-08-07 13:38:59 +02:00
Ines Montani	3e60afacf9	Add Serbian to languages [ci skip]	2019-08-07 13:38:25 +02:00
Ines Montani	1dc28a9ecb	Update Binder version [ci skip]	2019-08-07 13:38:12 +02:00
Ines Montani	6bec24cdd0	Require downloaded model in pkg_resources (#4090 )	2019-08-07 13:18:11 +02:00
Ines Montani	8b4a0fabbb	Adjust docs example [ci skip]	2019-08-07 00:46:47 +02:00
adrianeboyd	69aca7d839	Add validate option to EntityRuler (#4089 ) * Add validate option to EntityRuler * Add validate to EntityRuler, passed to Matcher and PhraseMatcher * Add validate to usage and API docs * Update website/docs/usage/rule-based-matching.md Co-Authored-By: Ines Montani <ines@ines.io> * Update website/docs/usage/rule-based-matching.md Co-Authored-By: Ines Montani <ines@ines.io>	2019-08-07 00:40:53 +02:00
Ines Montani	4ae320e5c2	Use consistent casing for entity ruler patterns (see #4063 ) [ci skip]	2019-08-06 12:20:22 +02:00
Ines Montani	223bde5cf6	Improve docs on matcher attributes [ci skip] (closes #4063 )	2019-08-06 12:13:42 +02:00
Ines Montani	2bfae0b167	Auto-format	2019-08-06 12:13:31 +02:00
Jeno	15be09ceb0	Raise error if annotation dict in simple training style has unexpected keys #4074 (#4079 ) * adding enhancement #4074. * modified behavior to strictly require top level dictionary keys - issue #4074 * pass expected keys to error message and add links as expected top level key	2019-08-06 11:01:25 +02:00
Sofie Van Landeghem	ad09b0d6f3	fetch norm from lex if necessary for matching (#4080 )	2019-08-05 23:51:04 +02:00
Ines Montani	7f3212e2f5	💫 Sync branches (#4084 ) [ci skip] * Update from master * Re-added Universe readme (#3688) (closes #3680) * Fix typo * Add version tag to `--base-model` argument (closes #3720) * fixing regex matcher examples (#3708) (#3719) * Improve Token.prob and Lexeme.prob docs (resolves #3701) * Fix DependencyParser.predict docs (resolves #3561) * Update languages.json Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Aaron Kub <aaronkub@gmail.com>	2019-08-05 14:32:54 +02:00
Ines Montani	0f740fad1a	Update universe.json [ci skip]	2019-08-05 14:30:07 +02:00
Pavle Vidanović	e1a935d71c	Stopwords for Serbian language. (#4078 ) * Serbian stopwords added. (cyrillic alphabet) * spaCy Contribution agreement included. * Test initialize updated	2019-08-05 10:22:27 +02:00
Sebastian Jordan	878302a55d	Fix typo in requirements section of pyproject.toml (#4081 )	2019-08-05 10:21:14 +02:00
veer-bains	874bd8c8dd	Fixed syntax error in lang/ko when using python 2 (#4082 ) (closes #4068 ) * fixed syntax error in declaring variables with python 2.7 in spacy/lang/ko/__init__.py * fixed syntax error in declaring variables with python 2.7 in spacy/lang/ko/__init__.py * Update __init__.py * Create veer-bains.md * Update __init__.py fixed syntax errors in variable datatype assignment when calling spacy.blank("ko") with python 2.7	2019-08-05 10:19:32 +02:00
Ines Montani	87ddbdc33e	Fix handling of kwargs in Language.evaluate Makes it consistent with other methods	2019-08-04 13:44:21 +02:00
Muhammad Irfan	d1d30b0442	added missing punctuation following conventions. (#4066 )	2019-08-04 13:41:18 +02:00
Anastassia	33b14724a5	Update gold corpus code to properly ingest a directory of jsonl… (#4067 ) * Update gold corpus code to properly ingest a directory of jsonlines files In response to: https://github.com/explosion/spaCy/issues/3975 * Update spacy/gold.pyx Co-Authored-By: Ines Montani <ines@ines.io>	2019-08-02 09:58:51 +02:00
Ines Montani	0f76e0022d	Update .tensor docs [ci skip]	2019-08-01 18:37:09 +02:00
Ines Montani	3072eb28c2	Support and render Markdown in model meta [ci skip]	2019-08-01 18:33:10 +02:00
Matthew Honnibal	944a66c326	Add span.tensor and token.tensor attributes	2019-08-01 18:30:50 +02:00
Matthew Honnibal	d3071ecdbc	Set version to v2.1.7	2019-08-01 18:09:19 +02:00
Matthew Honnibal	97c51ef93b	Set version to v2.1.7.dev1	2019-08-01 17:29:25 +02:00
Matthew Honnibal	4632c597e7	Fix Pipe base class	2019-08-01 17:29:01 +02:00
Ines Montani	8718ca8b1f	Fix init_model if there's no vocab (closes #4048 ) (#4049 )	2019-08-01 17:26:09 +02:00
adrianeboyd	925a852bb6	Improve NER per type scoring (#4052 ) * Improve NER per type scoring * include all gold labels in per type scoring, not only when recall > 0 * improve efficiency of per type scoring * Create Scorer tests, initially with NER tests * move regression test #3968 (per type NER scoring) to Scorer tests * add new test for per type NER scoring with imperfect P/R/F and per type P/R/F including a case where R == 0.0	2019-08-01 17:15:36 +02:00
Sofie Van Landeghem	f7d950de6d	ensure the lang of vocab and nlp stay consistent (#4057 ) * ensure the language of vocab and nlp stay consistent across serialization * equality with =	2019-08-01 17:13:01 +02:00
Björn Böing	a83c0add2e	Add links to tokenizer API docs to refer relevant information. (#4064 ) * Add links to tokenizer API docs to refer relevant information. * Add suggested changes Co-Authored-By: Ines Montani <ines@ines.io>	2019-08-01 14:28:38 +02:00
Ejar	2cdf7d39e7	Corrected imported function (#4062) The example showed an incorrect import	2019-08-01 12:43:36 +02:00
Mohammed Daudali	23ec07debd	Correct typo for AllenAI url on homepage (#4050 ) * Typo fix for AllenAI url Changed incorrect home page url for AllenAI from appenai.org to allenai.org * Sign contributor agreement * Change date format	2019-07-31 00:16:33 +02:00
Sofie Van Landeghem	7de3b129ab	Resolve edge case when calling textcat.predict with empty doc (#4035 ) * resolve edge case where no doc has tokens when calling textcat.predict * more explicit value test	2019-07-30 14:58:01 +02:00
Ines Montani	fcd2f7f656	Fix version introducing Span.ents (closes #4045 ) [ci skip]	2019-07-30 10:32:33 +02:00
Matthew Honnibal	89c92c65fb	Update version	2019-07-28 17:56:38 +02:00
Matthew Honnibal	06eb428ed1	Make pipe base class a bit less presumptuous	2019-07-28 17:56:11 +02:00
Matthew Honnibal	16b5144095	Don't raise NotImplemented in Pipe.update	2019-07-28 17:54:11 +02:00
Ines Montani	fc69da0acb	💫 Support simple training format in nlp.evaluate and add tests (#4033 ) * Support simple training format in nlp.evaluate and add tests * Update docs [ci skip]	2019-07-27 17:30:18 +02:00
Ines Montani	a3723f439c	Fix formatting [ci skip]	2019-07-27 16:35:42 +02:00
Ines Montani	d5bce35fb1	Fix bug in Span.similarity when called via hook	2019-07-27 15:33:27 +02:00
Ines Montani	109b5e1798	Fix bug in Token.similarity when called via hook	2019-07-27 15:26:01 +02:00
Ines Montani	e000b5ed82	Also support "requirements" in model.json	2019-07-27 13:34:57 +02:00
Ines Montani	307ffe472d	Support custom language factory setting in meta.json (#4031 )	2019-07-27 13:17:43 +02:00
Ines Montani	b7cd58c736	Tidy up and auto-format [ci skip]	2019-07-27 12:19:35 +02:00
Bae Yong-Ju	05fbf5d976	Fix error when Korean text contains regexp special characters. (#4022 )	2019-07-25 17:53:33 +02:00
Ines Montani	bd39e5e630	Add "Processing text" section [ci skip]	2019-07-25 17:38:03 +02:00
Ines Montani	a5e3d2f318	Improve section on disabling pipes [ci skip]	2019-07-25 14:25:34 +02:00

10467 Commits