Commit Graph

12293 Commits

Author SHA1 Message Date
Álvaro Abella Bascarán
ff0dbe5c64
Fix in docs: pipe(docs) instead of pipe(texts) (#5680)
Very minor fix in docs, specifically in this part:

```
matcher = PhraseMatcher(nlp.vocab)
for doc in matcher.pipe(texts, batch_size=50):
    pass
```

`texts` suggests the input is an iterable of strings, so I replaced it with `docs`.
2020-06-30 20:00:50 +02:00
Matthias Hertel
305221f3e5 Website: fixed the token span in the text about the rule-based matching example (#5669)
* fixed token span in pattern matcher example

* contributor agreement
2020-06-30 19:58:55 +02:00
Matthias Hertel
8b0f749606
Website: fixed the token span in the text about the rule-based matching example (#5669)
* fixed token span in pattern matcher example

* contributor agreement
2020-06-30 19:58:23 +02:00
svlandeg
e7aff9c5fc bugfix exec usage in dvc.yaml 2020-06-30 18:51:20 +02:00
Ines Montani
3383d1c822
Merge pull request #5679 from svlandeg/fix/project-exp 2020-06-30 18:00:11 +02:00
svlandeg
60f97bc519 add custom warning when run_command fails 2020-06-30 17:28:43 +02:00
svlandeg
39953c7c60 fix print_run_help with new arg order 2020-06-30 17:28:09 +02:00
svlandeg
cd632d8ec2 move folder for exec argument one up 2020-06-30 17:19:36 +02:00
svlandeg
1ae6fa2554 move subcommand one place up as project_dir has default 2020-06-30 16:04:53 +02:00
svlandeg
a46b76f188 use current working dir as default throughout 2020-06-30 15:39:24 +02:00
svlandeg
b228111925 fix funny printing 2020-06-30 14:54:45 +02:00
Ines Montani
8e20505970 Resolve within working_dir context manager 2020-06-30 13:29:45 +02:00
Ines Montani
72175b5c60 Update project command 2020-06-30 13:17:26 +02:00
Ines Montani
c5e31acb06 Make working_dir yield absolute cwd path 2020-06-30 13:17:14 +02:00
Ines Montani
3aca404735 Make run_command take string and list 2020-06-30 13:17:00 +02:00
Ines Montani
7584fdafec Fix typo 2020-06-30 12:59:13 +02:00
Ines Montani
5f325b602b
Merge pull request #5674 from svlandeg/fix/small-edits 2020-06-30 12:56:14 +02:00
svlandeg
140c4896a0 split_command util function 2020-06-30 12:54:15 +02:00
Matthw Honnibal
57e09747dc Improve efficiency of get_oracle_sequences 2020-06-30 11:50:48 +02:00
Matthw Honnibal
233945bfe0 Fix init for padding 2020-06-30 11:50:24 +02:00
svlandeg
d23be563eb remove redundant setting of no_args_is_help 2020-06-30 11:23:35 +02:00
svlandeg
b311ce982f Merge remote-tracking branch 'upstream/develop' into fix/small-edits
# Conflicts:
#	spacy/cli/project.py
2020-06-30 11:17:31 +02:00
svlandeg
7e4cbda89a fix project_init for relative path 2020-06-30 11:09:53 +02:00
Matthw Honnibal
85ed5730a2 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-06-30 01:14:16 +02:00
Ines Montani
e8033df81e Also handle python3 and pip3 2020-06-29 20:30:42 +02:00
Ines Montani
c874dde66c Show help on "spacy project" 2020-06-29 20:11:34 +02:00
Ines Montani
1d2c646e57 Fix init and remove .dvc/plots 2020-06-29 20:07:21 +02:00
Matthw Honnibal
5bed6fc431 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-06-29 19:55:24 +02:00
svlandeg
1176783310 fix one more shlex.split 2020-06-29 18:37:42 +02:00
svlandeg
ff233d5743 print details on error msg (e.g. PermissionError on specific file) 2020-06-29 18:22:33 +02:00
svlandeg
894b8e7ff6 throw warning (instead of crashing) when temp dir can't be cleaned 2020-06-29 18:16:39 +02:00
svlandeg
efe7eb71f2 create subfolder in working dir 2020-06-29 17:46:08 +02:00
svlandeg
3487214ba1 fix shlex.split for non-posix 2020-06-29 17:45:47 +02:00
Ines Montani
126050f259 Improve asset fetching
Get all paths first and run dvc add once so it only shows one progress bar and one combined git command (if repo is git repo)
2020-06-29 16:55:24 +02:00
Ines Montani
7c08713baa Improve error messages 2020-06-29 16:54:47 +02:00
Ines Montani
24664efa23 Import project_run_all function 2020-06-29 16:54:19 +02:00
svlandeg
f8dddeda27 print help msg when just calling 'project' without args 2020-06-29 16:38:15 +02:00
svlandeg
bf43ebbf61 fix typos 2020-06-29 16:32:25 +02:00
Matthew Honnibal
67928036f2 Set version to v3.0.0.dev12 2020-06-29 14:45:43 +02:00
Matthew Honnibal
2d715451a2
Revert "Convert custom user_data to token extension format for Japanese tokenizer (#5652)" (#5665)
This reverts commit 1dd38191ec.
2020-06-29 14:34:15 +02:00
Sofie Van Landeghem
8d3c0306e1
refactor fixes (#5664)
* fixes in ud_train, UX for morphs

* update pyproject with new version of thinc

* fixes in debug_data script

* cleanup of old unused error messages

* remove obsolete TempErrors

* move error messages to errors.py

* add ENT_KB_ID to default DocBin serialization

* few fixes to simple_ner

* fix tags
2020-06-29 14:33:00 +02:00
Adriane Boyd
1dd38191ec
Convert custom user_data to token extension format for Japanese tokenizer (#5652)
* Convert custom user_data to token extension format

Convert the user_data values so that they can be loaded as custom token
extensions for `inflection`, `reading_form`, `sub_tokens`, and `lemma`.

* Reset Underscore state in ja tokenizer tests
2020-06-29 14:20:26 +02:00
Adriane Boyd
167df42cb6
Move lemmatizer is_base_form to language settings (#5663)
Move `Lemmatizer.is_base_form` to the language settings so that each
language can provide a language-specific method as
`LanguageDefaults.is_base_form`.

The existing English-specific `Lemmatizer.is_base_form` is moved to
`EnglishDefaults`.
2020-06-29 14:16:57 +02:00
Sofie Van Landeghem
fc3cb1fa9e
NER align tests (#5656)
* one_to_many works better. misalignment doesn't yet.

* fix tests

* restore example

* xfail alignment tests
2020-06-29 13:59:17 +02:00
Matthew Honnibal
2d9604d39c Set version to v3.0.0.dev11 2020-06-29 13:56:46 +02:00
Matthew Honnibal
0a54022138 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-06-29 13:56:20 +02:00
Matthew Honnibal
acbf6345c9 Fix thinc dependency 2020-06-29 13:56:07 +02:00
Matthw Honnibal
da50473701 Tweak efficiency of arc_eager.set_costs 2020-06-29 12:17:41 +02:00
Ines Montani
bac8a8d766 Merge branch 'feature/project-cli' into develop 2020-06-29 10:49:05 +02:00
Sofie Van Landeghem
cfeb2ba4d7
updating thinc also in pyproject.toml 2020-06-29 09:51:20 +02:00