spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-18 18:12:45 +03:00

Author	SHA1	Message	Date
Ryn Daniels	f64e39fa49	Install explosionbot as a github action (#9420)	2021-10-11 15:43:27 +02:00
Paul O'Leary McCann	efe5beefe0	Add test for case where parser overwrite annotations (#9406) * Add test for case where parser overwrite annotations * Move test to its own file Also add note about how other tokens modify results. * Fix xfail decorator	2021-10-11 14:57:45 +02:00
Ines Montani	1fa7c4e73b	Support issue marker via pytest	2021-10-11 13:56:24 +02:00
Paul O'Leary McCann	3b429619a8	Fix UD POS docs links (fix #9013) (#9407) * Fix UD POS docs links (fix #9013) The previous link seems to have been for UD v1. * Fix link	2021-10-11 11:51:59 +02:00
Paul O'Leary McCann	b53e39455e	Fix UD POS docs links (fix #9013) (#9407) * Fix UD POS docs links (fix #9013) The previous link seems to have been for UD v1. * Fix link	2021-10-11 11:51:19 +02:00
Paul O'Leary McCann	fd759a881b	Fix inconsistent lemmas (#9405) * Add util function to unique lists and preserve order * Use unique function instead of list(set()) list(set()) has the issue that it's not consistent between runs of the Python interpreter, so order can vary. list(set()) calls were left in a few places where they were behind calls to sorted(). I think in this case the calls to list() can be removed, but this commit doesn't do that. * Use the existing pattern for this	2021-10-11 11:38:45 +02:00
Adriane Boyd	fd91e6a33c	Fix types descriptions of sm and sent models (#9401)	2021-10-11 11:18:10 +02:00
Adriane Boyd	fd7edbc645	Fix types descriptions of sm and sent models (#9401)	2021-10-11 11:17:18 +02:00
Adriane Boyd	bbe4d3300a	Remove traces of lexemes from vocab serialization (#9400)	2021-10-11 11:15:51 +02:00
Sofie Van Landeghem	a6ac36bcb3	Doc fixes in convert API (#9350) * add more info on the spacy debug command * formatting	2021-10-11 11:15:20 +02:00
Adriane Boyd	a5231cb044	Remove traces of lexemes from vocab serialization (#9400)	2021-10-11 11:13:35 +02:00
Jette16	3b144a3a51	Add universe test (#9278) * Added test for universe.json * Added contributor agreement * Ran black on test_universe_json.py	2021-10-11 11:08:46 +02:00
Ines Montani	5003a9c3c7	Move core training logic in CLI into standalone function (#9398)	2021-10-11 10:56:14 +02:00
Paul O'Leary McCann	2a7e327310	Fix Dependency Matcher Ordering Issue (#9337) * Fix inconsistency This makes the failing test pass, so that behavior is consistent whether patterns are added in one call or two. The issue is that the hash for patterns depended on the index of the pattern in the list of current patterns, not the list of total patterns, so a second call would get identical match ids. * Add illustrative test case * Add failing test for remove case Patterns are not removed from the internal matcher on calls to remove, which causes spurious weird matches (or misses). * Fix removal issue Remove patterns from the internal matcher. * Check that the single add call also gets no matches	2021-10-11 10:26:13 +02:00
Paul O'Leary McCann	5dbe4e8392	Update new issue config with Python 3.10 info Also adds note that Install issues go to Discussions.	2021-10-11 15:41:32 +09:00
Paul O'Leary McCann	48ba4e60f4	Add new style citation file (#9388)	2021-10-07 17:47:39 +02:00
Sofie Van Landeghem	f87ae3cb7d	Doc fixes in convert API (#9350) * add more info on the spacy debug command * formatting	2021-10-06 13:13:18 +09:00
Adriane Boyd	4192e71599	Sync vocab in vectors and components sourced in configs (#9335) Since a component may reference anything in the vocab, share the full vocab when loading source components and vectors (which will include `strings` as of #8909). When loading a source component from a config, save and restore the vocab state after loading source pipelines, in particular to preserve the original state without vectors, since `[initialize.vectors] = null` skips rather than resets the vectors. The vocab references are not synced for components loaded with `Language.add_pipe(source=)` because the pipelines are already loaded and not necessarily with the same vocab. A warning could be added in `Language.create_pipe_from_source` that it may be necessary to save and reload before training, but it's a rare enough case that this kind of warning may be too noisy overall.	2021-10-04 12:19:02 +02:00
Paul O'Leary McCann	23badbd55c	Updating Troubleshooting Docs (#9329) * Add link to Discussions FAQ * Remove old FAQ entries I think these are no longer relevant. - no-cache-dir: affected pip versions are very old now - narrow unicode: not an issue from py3.3+ - utf-8 osx: upstream bug closed in 2019 Some of the other issues are also maybe not frequent.	2021-10-01 12:31:41 +02:00
Paul O'Leary McCann	6e833b617a	Updating Troubleshooting Docs (#9329) * Add link to Discussions FAQ * Remove old FAQ entries I think these are no longer relevant. - no-cache-dir: affected pip versions are very old now - narrow unicode: not an issue from py3.3+ - utf-8 osx: upstream bug closed in 2019 Some of the other issues are also maybe not frequent.	2021-10-01 12:28:22 +02:00
github-actions[bot]	42a76c758f	Auto-format code with black (#9346) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-10-01 11:17:11 +02:00
Adriane Boyd	b3192ddea3	Sync thinc install dep in setup, fix test packaging (#9336) * Sync thinc install dep in setup * Add __init__.py to include package tests in package * Include *.toml in package	2021-09-30 19:02:10 +02:00
Paul O'Leary McCann	0508795d67	Fix invalid json	2021-09-30 15:24:47 +09:00
Paul O'Leary McCann	78a88f7de7	Fix invalid json	2021-09-30 15:23:55 +09:00
Martin Vallone	f15bb40941	Adding PhruzzMatcher to spaCy universe (#9321) * Adding PhruzzMatcher to spaCy universe * Fixes to make the package work properly	2021-09-30 14:26:40 +09:00
Martin Vallone	a14ab7e882	Adding PhruzzMatcher to spaCy universe (#9321) * Adding PhruzzMatcher to spaCy universe * Fixes to make the package work properly	2021-09-30 13:46:53 +09:00
Adriane Boyd	e750c1760c	Restore tokenization timing in Language.evaluate (#9305) Restore tokenization timing steps that were accidentally removed in #6765.	2021-09-27 20:44:14 +02:00
Sofie Van Landeghem	a361df00cd	Raise E983 early on in docbin init (#9247) * raise E983 early on in docbin init * catch situation before error is raised * add more info on the spacy debug command	2021-09-27 20:43:03 +02:00
Adriane Boyd	effae12cbd	Update slow readers test to use textcat_multilabel (#9300)	2021-09-27 20:04:02 +02:00
github-actions[bot]	4da2af4e0e	Auto-format code with black (#9284) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-09-24 10:46:43 +02:00
Ines Montani	6bb0324b81	Adjust kb_id visualizer templating and docs	2021-09-23 11:59:02 +02:00
Ines Montani	beb4a8c524	Merge pull request #9199 from shigapov/master (resolves #9129)	2021-09-23 19:41:53 +10:00
Philip Vollet	d2adfe1efa	Add projects to spaCy Universe (#9269) * Added spaCy Universe projects * Added user license agreement Philip Vollet * Update website/meta/universe.json Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/meta/universe.json Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update website/meta/universe.json Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-09-23 10:56:45 +02:00
Ines Montani	57b5fc1995	Apply suggestions from code review Co-authored-by: Renat Shigapov <57352291+shigapov@users.noreply.github.com>	2021-09-23 17:58:32 +10:00
Sofie Van Landeghem	3fc3b7a13a	avoid crash when unicode in title (#9254)	2021-09-22 21:01:34 +02:00
Daniël de Kok	17802836be	Allow overriding vars in the project assets subcommand (#9248) This change makes the `project assets` subcommand accept variables to override as well, making the interface more similar to `project run`.	2021-09-21 10:49:45 +02:00
Adriane Boyd	00bdb31150	Fix vector for 0-length span (#9244)	2021-09-20 20:22:49 +02:00
svlandeg	ec621e6853	Merge remote-tracking branch 'upstream/master' into spacy.io	2021-09-20 15:54:00 +02:00
svlandeg	e0e3e9653b	Revert "raise E983 early on in docbin init" This reverts commit `f3f7afa21f`.	2021-09-20 15:52:02 +02:00
svlandeg	f3f7afa21f	raise E983 early on in docbin init	2021-09-20 15:49:31 +02:00
github-actions[bot]	015d439eb6	Auto-format code with black (#9234) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2021-09-20 08:49:19 +02:00
Edward	79c7c62970	Update Hammurabi example code to v3 (#9218) * Update Hammurabi example code * Fix typo	2021-09-16 13:35:00 +02:00
Edward	8bda39f088	Update Hammurabi example code to v3 (#9218) * Update Hammurabi example code * Fix typo	2021-09-16 13:32:44 +02:00
Paul O'Leary McCann	c4f0800fb8	Validate pos values when creating Doc (#9148) * Validate pos values when creating Doc * Add clear error when setting invalid pos This also changes the error language slightly. * Fix variable name * Update spacy/tokens/doc.pyx * Test that setting invalid pos raises an error Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-09-16 13:28:05 +02:00
Jozef Harag	865cfbc903	feat: add `spacy.WandbLogger.v3` with optional `run_name` and `entity` parameters (#9202) * feat: add `spacy.WandbLogger.v3` with optional `run_name` and `entity` parameters * update versioning in docs Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-09-16 12:26:41 +02:00
Sofie Van Landeghem	00836c2d7d	Update spacy/displacy/templates.py	2021-09-16 09:23:21 +02:00
Sofie Van Landeghem	4bf2606adf	Update spacy/displacy/render.py Co-authored-by: Renat Shigapov <57352291+shigapov@users.noreply.github.com>	2021-09-16 09:22:38 +02:00
Paul O'Leary McCann	fd99438fb2	Make docs consistent (fix #9126)	2021-09-16 15:56:19 +09:00
Paul O'Leary McCann	1d57d78758	Make docs consistent (fix #9126)	2021-09-16 15:54:12 +09:00
Paul O'Leary McCann	9ceb8f413c	StringStore/Vocab dev docs (#9142) * First take at StringStore/Vocab docs Things to check: 1. The mysterious vocab members 2. How to make table of contents? Is it autogenerated? 3. Anything I missed / needs more detail? * Update docs * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Updates based on review feedback * Minor fix * Move example code down Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2021-09-16 12:50:22 +09:00
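
Note on commit fd759a881b (Fix inconsistent lemmas): the message says that list(set()) does not give a consistent order between runs of the Python interpreter, and that a util function was added to unique lists while preserving order. A minimal sketch of that idea follows. The helper name `unique_preserving_order` and the sample lemmas are hypothetical and chosen for illustration; this is not necessarily how the spaCy util is named or implemented.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def unique_preserving_order(items: Iterable[T]) -> List[T]:
    """Drop duplicates while keeping the first occurrence of each item.

    Hypothetical helper for illustration only. It relies on dict keys
    keeping insertion order, which is guaranteed from Python 3.7 on.
    """
    return list(dict.fromkeys(items))


# Made-up sample data: duplicate candidate lemmas in a fixed order.
lemmas = ["run", "running", "run", "ran", "running"]

# Order depends only on the input order, not on string hashing.
deduped = unique_preserving_order(lemmas)

# By contrast, list(set(lemmas)) holds the same three items, but their
# order can differ between interpreter runs because string hashes are
# randomized per process unless PYTHONHASHSEED is fixed.
unordered = list(set(lemmas))
```

Where the result is passed straight to sorted(), as the commit message mentions, the order of the intermediate collection does not matter, which is why those call sites could be left alone.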

15083 Commits