Commit Graph

281 Commits

Author SHA1 Message Date
Gavriel Loria
9a5003d5c8 iob converter: add 'exception' for error 'too many values' (#3159)
* added contributor agreement

* issue #3128 throw exception on bad IOB/2 formatting

* Update spacy/cli/converters/iob2json.py with ValueError

Co-Authored-By: gavrieltal <gtloria@protonmail.com>
2019-01-16 13:44:16 +01:00
Mark Neumann
e599ed9ef8 Allow vectors to be optional in init-model, more robust string counting (#3155)
* more robust init-model

* key not word

* add license agreement
2019-01-14 23:48:30 +01:00
Gavriel Loria
9c8c4287bf Accept iob2 and allow generic whitespace (#2999)
* accept non-pipe whitespace as delimiter; allow iob2 filename

* added small documentation note for IOB2 allowance

* added contributor agreement
2018-12-06 15:50:25 +01:00
Maxim Kupfer
cebe50b5b8 Remove ')' for clarity (#2737)
Sorry, don't mean to be nitpicky, I just noticed this when going through the CLI and thought it was a quick fix. That said, if this was intention than please let me know.
2018-09-10 11:31:49 +02:00
Xiaoquan Kong
f0c9652ed1 New Feature: display more detail when Error E067 (#2639)
* Fix off-by-one error

* Add verbose option

* Update verbose option

* Update documents for verbose option
2018-08-07 10:45:29 +02:00
Kaisa (Katarzyna) Korsak
e531a827db Changed conllu2json to be able to extract NER tags (#2594)
* extract ner tags from conllu file if available

* fixed a bug in regex
2018-07-25 22:21:31 +02:00
Ole Henrik Skogstrøm
6e2930a4a2 Conll(u)-bio converter (#2525)
* Started simple conllxbiluo converter

* Fix missing BIO to BILUO conversion
2018-07-18 18:55:42 +02:00
James Messinger
4515e96e90 Better formatting for spacy train CLI (#2357)
* Better formatting for `spacy train` CLI

Changed to use fixed-spaces rather than tabs to align table headers and data.

### Before:
```
Itn.    P.Loss  N.Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %
0       4618.857        2910.004        76.172  79.645  67.987  88.732  88.261  100.000 4436.9  6376.4
1       4671.972        3764.812        74.481  78.046  62.374  82.680  88.377  100.000 4672.2  6227.1
2       4742.756        3673.473        71.994  77.380  63.966  84.494  90.620  100.000 4298.0  5983.9
```

### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```

* Added contributor file
2018-05-25 13:08:45 +02:00
Ines Montani
d4cc736b7c 💫 Improve model downloads: check for existing install, customise pip and use requests library again (#2346)
* Go back to using requests instead of urllib (closes #2320)

Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey.

* Only download model if not installed (see #1456)

Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience.

* Pass additional options to pip when installing model (resolves #1456)

Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example:

python -m spacy download en --user

* Add CLI option to enable installing model package dependencies

* Revert "Add CLI option to enable installing model package dependencies"

This reverts commit 9336ffe695.

* Update documentation
2018-05-20 20:26:56 +02:00
ines
7a3599c21a Fix formatting and consistency 2018-05-07 23:02:11 +02:00
G.Pruvost
cc8e804648 #2211 - Support for ssl certs config on download command (#2212)
* Add support for SSL/Certs customization on download CLI

* Add a note on SSL options for the 'download' CLI in the README

* Add contributor agreement
2018-05-03 18:37:02 +02:00
ines
3c80f69ff5 Return data in cli.info and add silent option (resolves #2196) 2018-04-29 01:59:44 +02:00
ines
0299d5fac8 Update argument annotations and formatting 2018-04-10 21:45:11 +02:00
ines
49b1e48bf5 Fix syntax error 2018-04-10 21:44:59 +02:00
ines
70052e46e9 Fix formatting [ci skip] 2018-04-10 21:42:46 +02:00
Matthew Honnibal
0ddb152be0 Improve error message when reading vectors 2018-04-10 21:26:50 +02:00
Matthew Honnibal
db50ac524e Support zipped vector files in init-model 2018-04-10 21:21:00 +02:00
ines
270fcfd925 Fix typo in package command message (closes #2200) 2018-04-10 19:14:31 +02:00
ines
24d8bf348d Revert "Add support for .zip to init_model"
This reverts commit 7ee880a0ad.
2018-04-10 19:08:06 +02:00
Matthew Honnibal
7ee880a0ad Add support for .zip to init_model 2018-04-10 14:30:04 +00:00
Ines Montani
3141e04822
💫 New system for error messages and warnings (#2163)
* Add spacy.errors module

* Update deprecation and user warnings

* Replace errors and asserts with new error message system

* Remove redundant asserts

* Fix whitespace

* Add messages for print/util.prints statements

* Fix typo

* Fix typos

* Move CLI messages to spacy.cli._messages

* Add decorator to display error code with message

An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.

* Remove unused link in spacy.about

* Update errors for invalid pipeline components

* Improve error for unknown factories

* Add displaCy warnings

* Update formatting consistency

* Move error message to spacy.errors

* Update errors and check if doc returned by component is None
2018-04-03 15:50:31 +02:00
Ines Montani
a609a1ca29
Merge pull request #2152 from explosion/feature/tidy-up-dependencies
💫 Tidy up dependencies
2018-03-29 14:35:09 +02:00
Matthew Honnibal
b5098079d8 Fix error on urllib 2018-03-29 00:08:16 +02:00
Ines Montani
98e9cda677
Merge pull request #2158 from explosion/feature/fix-multiple-vectors (resolves #1660)
💫 Fix loading of multiple vector models
2018-03-28 23:08:24 +02:00
Matthew Honnibal
17c3e7efa2 Add message noting vectors 2018-03-28 16:33:43 +02:00
ines
7fbc9e5874 Replace requests with urllib 2018-03-28 12:46:07 +02:00
ines
ac88c72c9a Fix ftfy workaround and remove old import 2018-03-28 12:14:28 +02:00
Matthew Honnibal
070b6c6495 Remove dependency on ftfy 2018-03-28 12:07:02 +02:00
Matthew Honnibal
406548b976 Support .gz and .tar.gz files in spacy init-model 2018-03-24 17:18:32 +01:00
Matthew Honnibal
86405e4ad1 Fix CLI for multitask objectives 2018-02-18 10:59:11 +01:00
Matthew Honnibal
a34749b2bf Add multitask objectives options to train CLI 2018-02-17 22:03:54 +01:00
Matthew Honnibal
262d0a3148 Fix overwriting of lexical attributes when loading vectors during training 2018-02-17 18:11:11 +01:00
Johannes Dollinger
bf94c13382 Don't fix random seeds on import 2018-02-13 12:42:23 +01:00
Ali Zarezade
9df9da34a3
Fix init_model issue
Fixing issue #1928
2018-02-03 17:21:34 +03:30
ines
3c1fb9d02d Make validate command fail more gracefully if version not found
Mostly relevant during develoment when working with .dev versions
2018-01-31 22:06:28 +01:00
Adam Binford
1a2c2f7d7f Fixed auto linking after download and added simple test to check 2018-01-29 14:25:21 -05:00
Matthew Honnibal
7ca49c2061
Merge branch 'master' into feature-improve-model-download 2018-01-10 18:21:55 +01:00
Søren Lind Kristiansen
10dab8eef8 Remove dummy variable from function calls 2018-01-05 09:37:05 +01:00
Søren Lind Kristiansen
7f0ab145e9 Don't pass CLI command name as dummy argument 2018-01-04 21:33:47 +01:00
ines
2c656f90fb Exit with 1 if incompatible models found (see #1714) 2018-01-03 21:20:35 +01:00
ines
dacfaa2ca4 Ensure that download command exits properly (resolves #1714) 2018-01-03 21:03:36 +01:00
Søren Lind Kristiansen
a9ff6eadc9 Prefix dummy argument names with underscore 2018-01-03 20:48:12 +01:00
ines
1081e08efb Fix formatting 2018-01-03 20:14:50 +01:00
ines
d8109964d6 Use --no-deps on model install
In general, it's nice for models to specify spaCy as a dependency. However, this tends to cause problems in conda environments, as pip will re-install spaCy and its dependencies (especially Thinc)
2018-01-03 17:40:37 +01:00
ines
319d754309 Fix overwriting of existing symlinks
Check for is_symlink() to also overwrite invalid and outdated symlinks. Also show better error message if link path exists but is not symlink (i.e. file or directory).
2018-01-03 17:39:36 +01:00
ines
8ba0dfd017 Make message on failed linking more clear 2018-01-03 17:38:09 +01:00
Søren Lind Kristiansen
d6327e8495 Fix handling case when vectors not specified 2018-01-03 12:20:49 +01:00
Søren Lind Kristiansen
bcc51d7d8b Fix shifted positional arguments 2018-01-03 12:19:47 +01:00
Søren Lind Kristiansen
5a9d377580 Remove abbreviation for positional plac argument 2017-12-11 11:08:29 +01:00
Isaac Sijaranamual
20ae0c459a Fixes "Error saving model" #1622 2017-12-10 23:07:13 +01:00