Commit Graph

446 Commits

Author SHA1 Message Date
Adriane Boyd
e9f7f9a4bc Fix is_cython_func for additional imported code
* Fix `is_cython_func` for imported code loaded under `python_code`
module name
* Add `make_named_tempfile` context manager to test utils to test
loading of imported code
* Add test for validation of `initialize` params in custom module
2021-03-01 16:37:39 +01:00
Adriane Boyd
a3293efc48 Add time and level to default logging formatter 2021-02-15 14:19:20 +01:00
Ines Montani
21176c69b0 Update and add test 2021-02-10 14:12:00 +11:00
Koichi Yasuoka
8ed788660b Several callable objects do not have __qualname__ 2021-02-09 14:43:02 +09:00
Sofie Van Landeghem
f319d2765f Add capture argument to project_run (#6878)
* add capture argument to project_run and run_commands

* git bump to 3.0.1

* Set version to 3.0.1.dev0

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2021-02-02 10:11:15 +08:00
Ines Montani
325f47500d Move replacement logic to Language.from_config 2021-01-29 19:37:04 +11:00
Ines Montani
01ecfbcc45 Merge branch 'develop' into feature/replace-listeners 2021-01-29 15:57:32 +11:00
Ines Montani
911dfcccfc Add option to replace listeners for sourced components 2021-01-29 15:57:04 +11:00
Sofie Van Landeghem
837a4f53c2 Error handling in nlp.pipe (#6817)
* add error handler for pipe methods

* add unit tests

* remove pipe methods that are the same as their base class

* have Language keep track of a default error handler

* cleanup

* formatting

* small refactor

* add documentation
2021-01-29 08:51:21 +08:00
Ines Montani
f4d547b73c Fix error code 2021-01-18 11:43:45 +11:00
Ines Montani
a552db2819 Include available registry names in error 2021-01-16 14:35:03 +11:00
Ines Montani
d12be459f6 Raise RegistryError 2021-01-16 12:57:13 +11:00
Ines Montani
a203e3dbb8 Support spacy-legacy via the registry 2021-01-15 21:42:40 +11:00
Ines Montani
991669c934 Tidy up and auto-format 2021-01-05 13:41:53 +11:00
Thomas Bird
cbb8c66da3 prevent the root logger from initialising 2020-12-15 19:50:34 +00:00
Ines Montani
1980203229 Merge branch 'master' into pr/6444 2020-12-09 11:09:40 +11:00
Koichi Yasuoka
0afb54ac93 JapaneseTokenizer.pipe added (#6515)
* JapaneseTokenizer.pipe added

For [spacymoji](https://spacy.io/universe/project/spacymoji) with `Japanese()`.

* DummyTokenizer.pipe added instead
2020-12-08 20:02:23 +01:00
Ines Montani
d25b1606d6 Allow reading config from stdin in spacy train 2020-12-08 18:01:40 +11:00
svlandeg
1f465bea18 if-else 2020-10-13 09:27:19 +02:00
Ines Montani
bfa3931c9d Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
svlandeg
040c7c0541 fix get_dim calls in build_simple_cnn_text_classifier 2020-10-09 15:40:58 +02:00
Florijan Stamenković
18f5c309dc Fix Issue 6207 (#6208)
* Regression test for issue 6207

* Fix issue 6207

* Sign contributor agreement

* Minor adjustments to test

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-10-09 10:14:40 +02:00
Sofie Van Landeghem
d093d6343b TrainablePipe (#6213)
* rename Pipe to TrainablePipe

* split functionality between Pipe and TrainablePipe

* remove unnecessary methods from certain components

* cleanup

* hasattr(component, "pipe") should be sufficient again

* remove serialization and vocab/cfg from Pipe

* unify _ensure_examples and validate_examples

* small fixes

* hasattr checks for self.cfg and self.vocab

* make is_resizable and is_trainable properties

* serialize strings.json instead of vocab

* fix KB IO + tests

* fix typos

* more typos

* _added_strings as a set

* few more tests specifically for _added_strings field

* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Florijan Stamenković
9db670b996 Fix Issue 6207 (#6208)
* Regression test for issue 6207

* Fix issue 6207

* Sign contributor agreement

* Minor adjustments to test

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-10-06 11:17:37 +02:00
Ines Montani
0135f6ed95 Enable commit check via env var 2020-10-05 20:51:15 +02:00
Ines Montani
6958510bda Include spaCy version check in project CLI 2020-10-05 13:53:07 +02:00
Ines Montani
f758804401 Save one line of code 2020-10-03 11:41:28 +02:00
svlandeg
02247cccaf Merge remote-tracking branch 'upstream/develop' into feature/small-fixes 2020-10-02 20:48:11 +02:00
svlandeg
acc391c2a8 remove redundant str() call 2020-10-02 11:05:59 +02:00
Ines Montani
01c1538c72 Integrate file readers 2020-10-02 01:36:06 +02:00
Ines Montani
23c63eefaf Tidy up env vars [ci skip] 2020-09-30 15:15:11 +02:00
Ines Montani
a5debb356d Tidy up and adjust logging [ci skip] 2020-09-30 01:22:08 +02:00
Ines Montani
da30bae8a6 Use __pyx_vtable__ instead of __reduce_cython__ 2020-09-29 22:04:17 +02:00
Ines Montani
fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Ines Montani
d3c63b7965 Merge branch 'develop' into feature/prepare 2020-09-29 20:53:05 +02:00
Ines Montani
2be80379ec Fix small issues, resolve_dot_names and debug model 2020-09-29 20:38:35 +02:00
Ines Montani
dba26186ef Handle None default args in Cython methods 2020-09-29 18:08:02 +02:00
Ines Montani
9353a82076 Auto-format 2020-09-29 18:07:48 +02:00
Matthew Honnibal
4ad26f4a2f Move reader 2020-09-29 16:54:53 +02:00
Ines Montani
2e9c9e74af Fix config resolution and interpolation
TODO: auto-interpolate in Thinc if config is dict (i.e. likely subsection)
2020-09-28 15:34:00 +02:00
Ines Montani
02838a1d47 Fix resolve_dot_names 2020-09-28 15:27:10 +02:00
Ines Montani
822ea4ef61 Refactor CLI 2020-09-28 15:09:59 +02:00
Ines Montani
e44a7519cd Update CLI and add [initialize] block 2020-09-28 11:56:14 +02:00
Ines Montani
d5155376fd Update vocab init 2020-09-28 11:30:18 +02:00
Matthew Honnibal
65448b2e34 Remove schema=None until Optional 2020-09-28 03:42:58 +02:00
Matthew Honnibal
a023cf3ecc Add (untested) resolve_dot_names util 2020-09-28 03:06:12 +02:00
Matthew Honnibal
a976da168c Support data augmentation in Corpus (#6155)
* Support data augmentation in Corpus

* Note initial docs for data augmentation

* Add augmenter to quickstart

* Fix flake8

* Format

* Fix test

* Update spacy/tests/training/test_training.py

* Improve data augmentation arguments

* Update templates

* Move randomization out into caller

* Refactor

* Update spacy/training/augment.py

* Update spacy/tests/training/test_training.py

* Fix augment

* Fix test
2020-09-28 03:03:27 +02:00
Ines Montani
9016d23cc5 Fix exclude and add test 2020-09-27 23:34:03 +02:00
Ines Montani
7e938ed63e Update config resolution to use new Thinc 2020-09-27 22:21:31 +02:00
Ines Montani
26e28ed413 Fix combined scores if multiple components report it 2020-09-24 17:11:13 +02:00