Richard Hudson
8e55efcbd9
Check SUPPORTS_ANSI when rendering
2021-12-29 09:30:35 +01:00
Richard Hudson
08370604d3
Change order of imports
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-29 09:22:06 +01:00
Richard Hudson
678bc61086
Apply suggestions from code review
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-12-29 09:21:23 +01:00
Richard Hudson
e3e8495b41
Updated requirements.txt
2021-12-29 08:47:56 +01:00
Richard Hudson
92943f8a23
Removed unused import
2021-12-23 17:47:56 +01:00
Richard Hudson
2cae470180
More type corrections
2021-12-23 17:35:47 +01:00
Richard Hudson
106fb53509
More type corrections
2021-12-23 17:24:28 +01:00
Richard Hudson
5c850b2ac3
Corrected types
2021-12-23 17:01:43 +01:00
Richard Hudson
e713aa0938
Add surrounding tokens functionality
2021-12-23 16:13:40 +01:00
Richard Hudson
ed788c5def
Add render_instances function
2021-12-08 19:24:32 +01:00
Richard Hudson
bd00611259
Add render_text
2021-12-08 17:47:29 +01:00
Richard Hudson
49f3fd39b9
Refactoring
2021-12-08 16:42:39 +01:00
Richard Hudson
183d535ef4
Add permitted values
2021-12-08 14:58:02 +01:00
Richard Hudson
9f7f234b0f
Added tabular view
2021-12-08 14:30:38 +01:00
Richard Hudson
e04950ef3c
Fixed problems with non-projective trees
2021-12-07 12:04:41 +01:00
Richard Hudson
06a9939eb5
Add dependency tree tests
2021-11-30 17:06:55 +01:00
Richard Hudson
9a1d291191
Render sentences rather than documents
2021-11-30 16:01:05 +01:00
Richard Hudson
b4265eccf9
render_dependency_trees complete
2021-11-30 15:25:25 +01:00
Richard Hudson
a660a7d347
Work in progress
2021-11-29 21:12:11 +01:00
Richard Hudson
2d0e916220
Work in progress
2021-11-28 15:39:07 +01:00
Richard Hudson
93a4905b25
Work in progress
2021-11-26 22:14:49 +01:00
Richard Hudson
9d97a0ff0c
Work in progress
2021-11-26 22:01:35 +01:00
Richard Hudson
c446d56ce8
Working version
2021-11-26 21:36:48 +01:00
Natalia Rodnova
a4c43e5c57
Allow Matcher to match on ENT_ID and ENT_KB_ID ( #9688 )
...
* Added ENT_ID and ENT_KB_ID into the list of the attributes that Matcher matches on
* Added ENT_ID and ENT_KB_ID to TEST_PATTERNS in test_pattern_validation.py. Disabled tests that I added before
* Update website/docs/api/matcher.md
* Format
* Remove skipped tests
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-11-24 10:37:10 +01:00
Richard Hudson
7fec5fd647
Merge pull request #9737 from Pantalaymon/patch-1
...
Create Pantalaymon.md
2021-11-24 09:56:43 +01:00
Valentin-Gabriel Soumah
0bbf86bba8
Create Pantalaymon.md
...
Submitting agreement to spaCy in order to contribute to the Coreferee project.
2021-11-23 17:29:23 +01:00
Duygu Altinok
a7d7e80adb
EntityRuler improve disk load error message ( #9658 )
...
* added error string
* added serialization test
* added more to if statements
* wrote file to tempdir
* added tempdir
* changed parameter a bit
* Update spacy/tests/pipeline/test_entity_ruler.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-11-23 16:26:05 +01:00
Adriane Boyd
9ac6d4991e
Add doc_cleaner component ( #9659 )
...
* Add doc_cleaner component
* Fix types
* Fix loop
* Rephrase method description
2021-11-23 15:33:33 +01:00
Adriane Boyd
a77f50baa4
Allow Scorer.score_spans to handle pred docs with missing annotation ( #9701 )
...
If the predicted docs are missing annotation according to
`has_annotation`, treat the docs as having no predictions rather than
raising errors when the annotation is missing.
The motivation for this is a combined tokenization+sents scorer for a
component where the sents annotation is optional. To provide a single
scorer in the component factory, it needs to be possible for the scorer
to continue despite missing sents annotation in the case where the
component is not annotating sents.
2021-11-23 15:17:19 +01:00
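The behavior described in the commit above (a predicted doc with missing annotation counts as having no predictions instead of raising an error) can be illustrated with a minimal, library-free sketch. This is not spaCy's `Scorer.score_spans` implementation: the function name `score_spans_sketch`, the `(start, end, label)` tuple representation, and the use of `None` to stand in for "annotation missing according to `has_annotation`" are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical span representation: (start, end, label).
Span = Tuple[int, int, str]


def score_spans_sketch(
    examples: Iterable[Tuple[List[Span], Optional[List[Span]]]]
) -> Dict[str, float]:
    """Compute precision/recall/F over (gold, predicted) span lists.

    A predicted value of None stands in for "annotation missing"; it is
    treated as an empty prediction set rather than as an error.
    """
    tp = fp = fn = 0
    for gold, pred in examples:
        gold_set = set(gold)
        # Missing annotation -> no predictions, so every gold span is a miss.
        pred_set = set() if pred is None else set(pred)
        tp += len(gold_set & pred_set)
        fp += len(pred_set - gold_set)
        fn += len(gold_set - pred_set)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


scores = score_spans_sketch(
    [
        ([(0, 2, "S")], [(0, 2, "S")]),  # annotated and correct
        ([(0, 3, "S")], None),           # prediction has no annotation
    ]
)
```

In this sketch the second example lowers recall but does not stop scoring, which is the property a combined tokenization+sents scorer needs when sentence annotation is optional.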
Adriane Boyd
36c7047946
Use reference parse to initialize parser moves ( #9722 )
2021-11-23 14:55:55 +01:00
Paul O'Leary McCann
52b8c2d2e0
Add note on batch contract for listeners ( #9691 )
...
* Add note on batch contract
Using listeners requires batches to be consistent. This is obvious if
you understand how the listener works, but it wasn't clearly stated in
the Docs, and was subtle enough that the EntityLinker missed it.
There is probably a clearer way to explain what the actual requirement
is, but I figure this is a good start.
* Rewrite to clarify role of caching
2021-11-22 11:06:07 +01:00
Sofie Van Landeghem
13645dcbf5
add note that annotating components is new since 3.1 ( #9678 )
2021-11-22 14:43:11 +09:00
Adriane Boyd
0e93b315f3
Convert labels to strings for README in package CLI ( #9694 )
2021-11-19 08:51:46 +01:00
Adriane Boyd
ea450d652c
Exclude strings from v3.2+ source vector checks ( #9697 )
...
Exclude strings from `Vector.to_bytes()` comparisons for v3.2+ `Vectors`
that now include the string store so that the source vector comparison
is only comparing the vectors and not the strings.
2021-11-19 08:51:19 +01:00
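The idea in the commit above, comparing only the vector data and ignoring the bundled string store, can be sketched without spaCy. The dictionary layout, the key names `"vectors"` and `"strings"`, and the helper `same_vectors` are hypothetical stand-ins for serialized vector data, not spaCy's actual serialization format.

```python
from typing import Dict, Iterable


def same_vectors(
    a: Dict[str, bytes],
    b: Dict[str, bytes],
    exclude: Iterable[str] = ("strings",),
) -> bool:
    """Compare two serialized payloads while ignoring the excluded fields."""
    excluded = set(exclude)

    def strip(payload: Dict[str, bytes]) -> Dict[str, bytes]:
        # Drop the fields that should not take part in the comparison.
        return {k: v for k, v in payload.items() if k not in excluded}

    return strip(a) == strip(b)


# Two hypothetical payloads with identical vector data but different strings.
source = {"vectors": b"\x00\x01\x02", "strings": b"apple,pear"}
target = {"vectors": b"\x00\x01\x02", "strings": b"apple,pear,plum"}
match_without_strings = same_vectors(source, target)
match_with_strings = same_vectors(source, target, exclude=())
```

Comparing the full payloads would report a mismatch purely because the string stores differ, even though the vectors themselves are identical.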
Paul O'Leary McCann
f3981bd0c8
Clarify how to fill in init_tok2vec after pretraining ( #9639 )
...
* Clarify how to fill in init_tok2vec after pretraining
* Ignore init_tok2vec arg in pretraining
* Update docs, config setting
* Remove obsolete note about not filling init_tok2vec early
This seems to have also caught some lines that needed cleanup.
2021-11-18 15:38:30 +01:00
Vishnu Nandakumar
86fa37e8ba
Update universe.json with new library eng_spacysentiment ( #9679 )
...
* Update universe.json
* Update universe.json
* Cleanup fields
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-11-16 14:06:19 +09:00
Adriane Boyd
c9baf9d196
Fix spancat for empty docs and zero suggestions ( #9654 )
...
* Fix spancat for empty docs and zero suggestions
* Use ops.xp.zeros in test
2021-11-15 12:40:55 +01:00
github-actions[bot]
67d8c8a081
Auto-format code with black ( #9664 )
...
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2021-11-12 10:00:03 +01:00
Sofie Van Landeghem
24cdd4c88e
Merge pull request #9638 from polm/fix/optional-pretrain-path
...
Make Jsonl Corpus reader path optional again
2021-11-09 10:45:14 +01:00
Paul O'Leary McCann
8aa2d32ca9
Update jsonlcorpus constructor types
2021-11-09 16:20:19 +09:00
Paul O'Leary McCann
71fb00ed95
Update spacy/training/corpus.py
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-11-08 10:02:29 +00:00
Sofie Van Landeghem
c97f29c593
Merge pull request #9629 from ljvmiranda921/chore/migrate-regressions
...
Migrate regression and other tests to the new pytest marker
2021-11-08 09:07:38 +01:00
Paul O'Leary McCann
141f12b92e
Make Jsonl Corpus reader optional again
2021-11-07 18:56:23 +09:00
Lj Miranda
909177589d
Remove utility script
2021-11-06 06:35:58 +08:00
Ines Montani
86af0234ab
Update version [ci skip]
2021-11-05 19:02:35 +01:00
Adriane Boyd
216ed231a9
What's new in v3.2 ( #9633 )
...
* What's new in v3.2
* Fix formatting
* Fix typo
* Redo thanks
* Formatting
* Fix typo
* Fix project links
* Fix typo
* Minimal intro, floret python module
* Rephrase
* Rephrase, extend
* Rephrase
* Update links and formatting [ci skip]
* Minor correction
* Fix typo
Co-authored-by: Ines Montani <ines@ines.io>
2021-11-05 16:31:14 +01:00
Adriane Boyd
0fc3dee772
Merge pull request #9596 from adrianeboyd/tests/reenable-v3.2.0-tests
...
Reenable tests for v3.2.0
2021-11-05 10:54:30 +01:00
github-actions[bot]
5cdb7eb5c2
Auto-format code with black ( #9631 )
...
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-11-05 09:58:36 +01:00
Adriane Boyd
e6f91b6f27
Format ( #9630 )
2021-11-05 09:56:26 +01:00
Lj Miranda
8e7deaf210
Add missing imports in some regression tests
...
- test_issue7001-8000.py
- test_issue8190.py
2021-11-05 11:47:59 +08:00