Sofie Van Landeghem
2c27093c5f
require_cpu functionality ( #6336 )
* add require_cpu from Thinc 8.0.0rc2
* add docs
* fix test if cupy is not installed
2020-12-08 14:42:40 +08:00
Ines Montani
d7950c5ada
Merge pull request #6297 from adrianeboyd/docs/nightly-conda-install [ci skip]
2020-11-10 02:45:52 +01:00
Adriane Boyd
1c4df8fd09
Replace pytokenizations with internal alignment ( #6293 )
* Replace pytokenizations with internal alignment
Replace pytokenizations with internal alignment algorithm that is
restricted to only allow differences in whitespace and capitalization.
* Rename `spacy.training.align` to `spacy.training.alignment` to contain
the `Alignment` dataclass
* Implement `get_alignments` in `spacy.training.align`
* Refactor trailing whitespace handling
* Remove unnecessary exception for empty docs
Allow a non-empty whitespace-only doc to be aligned with an empty doc
* Remove empty docs exceptions completely
2020-11-03 16:24:38 +01:00
Sofie Van Landeghem
ace6ae435b
set pydantic upper pin to 1.7 for now ( #6308 )
2020-10-26 23:31:08 +01:00
Adriane Boyd
4299a7f654
Setup / install / quickstart updates
* Add `cuda110` to setup.cfg and quickstart dropdown
* Switch to `pip` for pip-only packages in conda quickstart instructions
* Update zh pkuseg install message with version range and conda
* Remove `zh` from `extras_require` because the default doesn't require
additional packages
2020-10-23 11:27:54 +02:00
Adriane Boyd
3629296757
Fix requirements, remove version pins
2020-10-19 19:04:42 +02:00
Adriane Boyd
56077e7e64
Add dependency for jinja2
2020-10-19 18:58:15 +02:00
Ines Montani
2e8dcba379
Update version pins
2020-10-14 14:59:09 +02:00
Ines Montani
74972744e5
Update Thinc
2020-10-10 19:08:57 +02:00
Ines Montani
59558b1b80
Update pin [ci skip]
2020-10-08 23:09:14 +02:00
Ines Montani
1e7560f327
Update pin [ci skip]
2020-10-08 11:10:48 +02:00
Ines Montani
43e59bb22a
Update docs and install extras [ci skip]
2020-10-08 10:58:50 +02:00
Ines Montani
b79a420c20
Adjust version pin [ci skip]
2020-10-07 13:16:56 +02:00
Sofie Van Landeghem
fff3f8ccfa
Fix packaging pin ( #6212 )
* pin packaging to >=20.0
* ignore spacy-pkuseg in requirements unit test
2020-10-06 14:16:05 +02:00
Ines Montani
4cf73d85bc
Add [zh] to extras [ci skip]
2020-10-05 21:37:09 +02:00
Sofie Van Landeghem
f4f49f5877
update blis ( #6198 )
* allow higher blis version
* fix typo
* bump to 3.0.0a34
* fix pins in other files
2020-10-05 14:58:56 +02:00
Ines Montani
52e4586ec1
Add transformers to extras_require [ci skip]
2020-10-03 11:13:00 +02:00
Ines Montani
6d8df081bd
Merge pull request #6180 from adrianeboyd/docs/minor-v3-2 [ci skip]
2020-10-02 11:37:25 +02:00
Adriane Boyd
351f352cdc
Update Japanese docs and pin for sudachipy
2020-10-02 10:12:44 +02:00
Ines Montani
01c1538c72
Integrate file readers
2020-10-02 01:36:06 +02:00
Ines Montani
95b2a448cf
Update lookups data pin [ci skip]
2020-09-30 00:24:42 +02:00
Ines Montani
7d04ba20c0
Update Thinc
2020-09-30 00:05:17 +02:00
Ines Montani
d3c63b7965
Merge branch 'develop' into feature/prepare
2020-09-29 20:53:05 +02:00
svlandeg
cd21eb2485
upgrade pydantic pin for thinc's field.default_factory
2020-09-28 16:45:48 +02:00
Ines Montani
e44a7519cd
Update CLI and add [initialize] block
2020-09-28 11:56:14 +02:00
Ines Montani
c0c842ae5b
Update Thinc version
2020-09-27 23:24:40 +02:00
Ines Montani
7e938ed63e
Update config resolution to use new Thinc
2020-09-27 22:21:31 +02:00
Ines Montani
ca3c997062
Improve CLI config validation with latest Thinc
2020-09-26 13:13:57 +02:00
Sofie Van Landeghem
009ba14aaf
Fix pretraining in train script ( #6143 )
* update pretraining API in train CLI
* bump thinc to 8.0.0a35
* bump to 3.0.0a26
* doc fixes
* small doc fix
2020-09-25 15:47:10 +02:00
Ines Montani
76bbed3466
Use Literal type for nr_feature_tokens
2020-09-23 16:00:03 +02:00
Sofie Van Landeghem
39872de1f6
Introducing the gpu_allocator ( #6091 )
* rename 'use_pytorch_for_gpu_memory' to 'gpu_allocator'
* --code instead of --code-path
* update documentation
* avoid querying the "system" section directly
* add explanation of gpu_allocator to TF/PyTorch section in docs
* fix typo
* fix typo 2
* use set_gpu_allocator from thinc 8.0.0a34
* default null instead of empty string
2020-09-19 01:17:02 +02:00
svlandeg
0dc914b667
bump thinc to 8.0.0a33
2020-09-16 16:42:58 +02:00
Ines Montani
a25bb50e36
Merge pull request #6036 from explosion/chore/update-lookups-data
Update to latest spacy-lookups-data
2020-09-09 21:47:17 +02:00
Sofie Van Landeghem
60f22e1800
Pipe API ( #6034 )
* ensure Language passes on valid examples for initialization
* fix tagger model initialization
* check for valid get_examples across components
* assume labels were added before begin_training
* fix senter initialization
* fix morphologizer initialization
* use methods to check arguments
* test textcat init, requires thinc>=8.0.0a31
* fix tok2vec init
* fix entity linker init
* use islice
* fix simple NER
* cleanup debug model
* fix assert statements
* fix tests
* throw error when adding a label if the output layer can't be resized anymore
* fix test
* add failing test for simple_ner
* UX improvements
* morphologizer UX
* assume begin_training gets a representative set and processes the labels
* remove assumptions for output of untrained NER model
* restore test for original purpose
2020-09-08 22:44:25 +02:00
Ines Montani
40058ee626
Update to latest spacy-lookups-data
2020-09-08 12:23:06 +02:00
Ines Montani
ff4175e839
Add more info to debug config
2020-08-27 18:17:58 +02:00
Ines Montani
3aec98ca38
Update wasabi: new diff_strings and MarkdownRenderer
2020-08-26 15:33:11 +02:00
Ines Montani
e12b03358b
Support removing extra values in fill-config ( #5966 )
* Support removing extra values in fill-config
* Fix test
2020-08-24 22:53:47 +02:00
Matthew Honnibal
463f1c8623
Avoid requiring smart-open directly
2020-08-24 14:49:17 +02:00
Matthew Honnibal
e559867605
Allow spacy project to push and pull to/from remote storage ( #5949 )
* Add utils for working with remote storage
* WIP add remote_cache for project
* WIP add push and pull commands
* Use pathy in remote_cache
* Update util
* Update remote_cache
* Update util
* Update project assets
* Update pull script
* Update push script
* Fix type annotation in util
* Work on remote storage
* Remove site and env hash
* Fix imports
* Fix type annotation
* Require pathy
* Require pathy
* Fix import
* Add a util to handle project variable substitution
* Import push and pull commands
* Fix pull command
* Fix push command
* Fix tarfile in remote_storage
* Improve printing
* Fiddle with status messages
* Set version to v3.0.0a9
* Draft docs for spacy project remote storages
* Update docs [ci skip]
* Use Thinc config to simplify and unify template variables
* Auto-format
* Don't import Pathy globally for now
Causes slow and annoying Google Cloud warning
* Tidy up test
* Tidy up and update tests
* Update to latest Thinc
* Update docs
* variables -> vars
* Update docs [ci skip]
* Update docs [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2020-08-23 18:32:09 +02:00
Ines Montani
6ad59d59fe
Merge branch 'develop' of https://github.com/explosion/spaCy into develop [ci skip]
2020-08-20 11:20:58 +02:00
Ines Montani
daba316930
Update Thinc version
2020-08-14 18:39:51 +02:00
Ines Montani
67cc39af7f
Update Thinc and include section order
2020-08-14 14:06:22 +02:00
Ines Montani
88b0a96801
Update for new Thinc and adjust config
2020-08-13 17:38:30 +02:00
Ines Montani
955d7b1b6b
Update to latest Thinc
2020-08-07 14:41:35 +02:00
Ines Montani
ab5ef37abb
Update to latest Thinc
2020-08-05 15:00:49 +02:00
svlandeg
5fa3235d06
set DATA_VALIDATION to False for debug_model (upgrade thinc)
2020-07-31 15:21:01 +02:00
Matthew Honnibal
520d25cb50
Add smart_open dependency to fetch project assets ( #5812 )
* Use smart_open for project assets
* Fix assets.py
* Update pyproject.toml
2020-07-26 12:15:00 +02:00
Ines Montani
e92df281ce
Tidy up, autoformat, add types
2020-07-25 15:01:15 +02:00
Ines Montani
a063a82c40
Tidy up __init__.py
2020-07-25 12:14:37 +02:00