* Fix surprises when asking for the root of a git repo
In the case of the first asset I wanted to get from git, the data I
wanted was the entire repository. I tried leaving "path" blank, which
gave a less-than-helpful error, and then I tried `path: "/"`, which
started copying my entire filesystem into the project. The path I should
have used was "".
I've made two changes to make this smoother for others:
- The 'path' within a git clone defaults to ""
- If the path points outside of the tmpdir that the git clone goes
into, we fail with an error
Signed-off-by: Elia Robyn Speer <elia@explosion.ai>
* use a descriptive error instead of a default
plus some minor fixes from PR review
Signed-off-by: Elia Robyn Speer <elia@explosion.ai>
* check for None values in assets
Signed-off-by: Elia Robyn Speer <elia@explosion.ai>
Co-authored-by: Elia Robyn Speer <elia@explosion.ai>
* Add training data section
Not entirely sure this is in the right location on the page - maybe it
should be after quickstart?
* Add pointer from binary format to training data section
* Minor cleanup
* Add to ToC, fix filename
* Update website/docs/usage/training.md
Co-authored-by: Ines Montani <ines@ines.io>
* Update website/docs/usage/training.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/usage/training.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Move the training data section further down the page
* Update website/docs/usage/training.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/usage/training.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Run prettier
Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Support list values and IS_INTERSECT in Matcher
* Support list values as token attributes for set operators, not just as
pattern values.
* Add `IS_INTERSECT` operator.
* Fix incorrect `ISSUBSET` and `ISSUPERSET` in schema and docs.
* Rename IS_INTERSECT to INTERSECTS
* implement textcat resizing for TextCatCNN
* resizing textcat in-place
* simplify code
* ensure predictions for old textcat labels remain the same after resizing (WIP)
* fix for softmax
* store softmax as attr
* fix ensemble weight copy and cleanup
* restructure slightly
* adjust documentation, update tests and quickstart templates to use latest versions
* extend unit test slightly
* revert unnecessary edits
* fix typo
* ensemble architecture won't be resizable for now
* use resizable layer (WIP)
* revert using resizable layer
* resizable container while avoid shape inference trouble
* cleanup
* ensure model continues training after resizing
* use fill_b parameter
* use fill_defaults
* resize_layer callback
* format
* bump thinc to 8.0.4
* bump spacy-legacy to 3.0.6
* Minor updates to quickstart settings/instructions
* set default value of textcat exclusive to `false` until the default
checkbox behavior is updated
* add the `morphologizer` to the list of components
* add a note that v3.0.6+ is required
* Switch to warning above quickstart
* Undo changes to textcat default in quickstart
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>