Commit Graph

17 Commits

Author SHA1 Message Date
svlandeg
2f88c4ef09 remove unnecessary fmt:off instructions 2023-07-26 13:47:55 +02:00
svlandeg
84e7d05a0b remove unused import 2023-07-26 13:24:29 +02:00
svlandeg
0278eecabf Merge branch 'upstream_master' into test-cli-app-init-config 2023-07-26 13:23:30 +02:00
Daniël de Kok
e2b70df012
Configure isort to use the Black profile, recursively isort the spacy module (#12721)
* Use isort with Black profile

* isort all the things

* Fix import cycles as a result of import sorting

* Add DOCBIN_ALL_ATTRS type definition

* Add isort to requirements

* Remove isort from build dependencies check

* Typo
2023-06-14 17:48:41 +02:00
Sofie Van Landeghem
d65e3c31a6
use system-independent commands (#12693) 2023-06-08 11:43:36 +02:00
svlandeg
add6de2fa9 Merge branch 'master' into test-cli-app-init-config 2023-06-01 17:45:36 +02:00
Adriane Boyd
f27bce67fd
Skip project clone tests if git is not available (#12394) 2023-03-09 16:41:21 +01:00
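The idea behind f27bce67fd, skipping clone tests when no `git` executable is present, can be sketched with the standard library alone. This is a minimal illustration, not the project's actual test code: `GIT_AVAILABLE` and `CloneTest` are hypothetical names, and the real tests may use a different test framework or detection mechanism.

```python
import shutil
import subprocess
import unittest

# Assumption: "git is available" means an executable named git is on PATH.
GIT_AVAILABLE = shutil.which("git") is not None


class CloneTest(unittest.TestCase):
    """Hypothetical stand-in for a test that needs to shell out to git."""

    @unittest.skipUnless(GIT_AVAILABLE, "git is not available")
    def test_git_reports_version(self):
        # Only runs when git was found; otherwise the test is reported as skipped.
        result = subprocess.run(
            ["git", "--version"], capture_output=True, text=True, check=True
        )
        self.assertTrue(result.stdout.startswith("git version"))
```

On a machine without git the test shows up as skipped rather than failed, which keeps the suite green in minimal environments.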
Peter Baumgartner
f6108776aa fix formatting on examples 2023-02-23 10:17:23 -05:00
Peter Baumgartner
d37b2094f7 pull out parameter example data 2023-02-23 09:56:07 -05:00
Paul O'Leary McCann
1e8bac99f3
Add tests for projects to master (#12303)
* Add tests for projects to master

* Fix git clone related issues on Windows

* Add stat import
2023-02-23 10:22:57 +01:00
Peter Baumgartner
35f22ba211 add combo test 2023-01-30 14:10:20 -05:00
Adriane Boyd
606273f7e4
Normalize whitespace in evaluate CLI output test (#12157)
* Normalize whitespace in evaluate CLI output test

Depending on terminal settings, lines may be padded to the screen width
so the comparison is too strict with only the command string replacement.

* Move to test util method

* Change to normalization method
2023-01-27 16:13:34 +01:00
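The whitespace normalization described in 606273f7e4 can be sketched as follows. This is a minimal illustration under assumptions: the helper name `normalize_whitespace` is hypothetical, and the project's actual test utility may be implemented differently.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


# A terminal may pad each line to the screen width, so a strict string
# comparison fails even though the visible content is identical.
padded_output = "Results      \nTOK   100.00        \n"
expected_output = "Results\nTOK 100.00"
outputs_match = normalize_whitespace(padded_output) == normalize_whitespace(
    expected_output
)
```

Applying the same normalization to both sides makes the comparison insensitive to padding and line wrapping while still checking the visible content.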
Peter Baumgartner
c68e6b8a96
trainable_lemmatizer in debug data (#11419)
* WIP

* rm ipython embeds

* rm total

* WIP

* cleanup

* cleanup + reword

* rm component function

* remove migration support form

* fix reference dataset for dev data

* additional fixes

- set approach to identifying unique trees
- adjust line length on messages
- add logic for detecting docs without annotations

* use 0 instead of none for no annotation

* partial annotation support

* initial tests for _compile_gold lemma attributes

Using the example data from the edit tree lemmatizer tests for:
- lemmatizer_trees
- partial_lemma_annotations
- n_low_cardinality_lemmas
- no_lemma_annotations

* adds output test for cli app

* switch msg level

* rm unclear uniqueness check

* Revert "rm unclear uniqueness check"

This reverts commit 6ea2b3524b.

* remove good message on uniqueness

* formatting

* use en_vocab fixture

* clarify data set source in messages

* remove unnecessary import

Co-authored-by: svlandeg <svlandeg@github.com>
2023-01-26 17:36:50 +01:00
Peter Baumgartner
e4183ca354 add fixture 2023-01-24 14:06:56 -05:00
Peter Baumgartner
17c4bfc181 initial test commit 2023-01-23 15:28:42 -05:00
Daniël de Kok
319eb508b5
Add a spacy benchmark speed subcommand (#11902)
* Add a `spacy evaluate speed` subcommand

This subcommand reports the mean batch performance of a model on a data set with
a 95% confidence interval. For reliability, it first performs some warmup
rounds. Then it will measure performance on batches with randomly shuffled
documents.

To avoid having too many spaCy commands, `speed` is a subcommand of `evaluate`
and accuracy evaluation is moved to its own `evaluate accuracy` subcommand.

* Fix import cycle

* Restore `spacy evaluate`, make `spacy benchmark speed` an alias

* Add documentation for `spacy benchmark`

* CREATES -> PRINTS

* WPS -> words/s

* Disable formatting of benchmark speed arguments

* Fail with an error message when trying to speed bench empty corpus

* Make it clearer that `benchmark accuracy` is a replacement for `evaluate`

* Fix docstring webpage reference

* tests: check `evaluate` output against `benchmark accuracy`
2023-01-12 11:55:21 +01:00
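The measurement procedure described in 319eb508b5 (warmup rounds, then timing batches of randomly shuffled documents and reporting a mean with a 95% confidence interval) can be sketched roughly as below. Everything here is an assumption made for illustration: the function name, its parameters, counting words by whitespace splitting, and the normal-approximation interval are not taken from the commit, and the actual subcommand may compute its interval differently.

```python
import random
import statistics
import time
from typing import Callable, List, Sequence, Tuple


def benchmark_words_per_second(
    process_batch: Callable[[Sequence[str]], None],
    docs: List[str],
    batch_size: int = 4,
    warmup_rounds: int = 2,
    measured_rounds: int = 5,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (mean, ci_low, ci_high) in words/s over all timed batches."""
    if not docs:
        raise ValueError("Cannot benchmark speed on an empty corpus.")
    if batch_size < 1 or measured_rounds < 2:
        raise ValueError("Need batch_size >= 1 and measured_rounds >= 2.")

    rng = random.Random(seed)

    # Warmup: exercise the pipeline without recording any timings.
    for _ in range(warmup_rounds):
        process_batch(docs[:batch_size])

    samples: List[float] = []
    for _ in range(measured_rounds):
        # Reshuffle the documents each round so batch composition varies.
        shuffled = rng.sample(docs, k=len(docs))
        for offset in range(0, len(shuffled), batch_size):
            batch = shuffled[offset : offset + batch_size]
            # Assumption: words are approximated by whitespace-separated tokens.
            n_words = sum(len(doc.split()) for doc in batch)
            started = time.perf_counter()
            process_batch(batch)
            elapsed = time.perf_counter() - started
            samples.append(n_words / max(elapsed, 1e-9))

    mean = statistics.fmean(samples)
    # Assumption: normal-approximation 95% interval around the mean.
    half_width = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
    return mean, mean - half_width, mean + half_width
```

A caller would pass any function that processes a batch of texts, for example a wrapper around a loaded pipeline. The empty-corpus check mirrors the commit's note that the benchmark fails with an error message on an empty corpus.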
Sofie Van Landeghem
7f6c638c3a
fix processing of "auto" in convert (#12050)
* fix processing of "auto" in walk_directory

* add check for None

* move AUTO check to convert and fix verification of args

* add specific CLI test with CliRunner

* cleanup

* more cleanup

* update docstring
2023-01-05 10:21:00 +01:00