Jan Jessewitsch
e4dcac4a4b
Merging multiple docs into one (#5032)
...
* Add static method to Doc to allow merging of multiple docs.
* Add error description for the error that occurs if docs with different
vocabs (from different languages) are merged in Doc.from_docs().
* Add test for Doc.from_docs() implementation.
* Fix using numpy's concatenate in Doc.from_docs.
* Replace typing's type annotations in from_docs.
* Simply remove type annotations in from_docs.
* Add documentation for Doc.from_docs to api.
* Simplify from_docs, its test and the api doc for codebase consistency.
* Fix merging of Doc objects that end with whitespace (achieved by simply not setting the SPACY attribute on whitespace tokens). Remove two unnecessary imports of attributes.
* Add merging of user data from Doc objects in from_docs. Add user data test case to corresponding test. Add applicable warning messages.
* Fix incorrect setting of tokens idx by using concatenated spaces (again). Add test case to corresponding test.
* Add MORPH to attrs
* Update warnings calls
* Remove out-dated error from merge
* Rename space_delimiter to ensure_whitespace
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-07-03 11:32:42 +02:00
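The merging behaviour described in the commit above (concatenating tokens, optionally ensuring whitespace between documents, and recomputing token offsets from the concatenated spaces) can be illustrated with a small pure-Python sketch. This is not spaCy's implementation: `SimpleDoc` and `merge_docs` are hypothetical names invented for this illustration, and only the `ensure_whitespace` parameter name is taken from the commit message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a tokenized document (not spaCy's Doc)."""

    words: List[str]
    spaces: List[bool]  # True if the token is followed by a single space

    @property
    def text(self) -> str:
        return "".join(w + (" " if s else "") for w, s in zip(self.words, self.spaces))

    def offsets(self) -> List[int]:
        """Character offset of each token, derived from words and spaces."""
        result, pos = [], 0
        for word, space in zip(self.words, self.spaces):
            result.append(pos)
            pos += len(word) + (1 if space else 0)
        return result


def merge_docs(docs: List[SimpleDoc], ensure_whitespace: bool = True) -> SimpleDoc:
    """Concatenate documents into one (illustrative sketch only)."""
    words: List[str] = []
    spaces: List[bool] = []
    for i, doc in enumerate(docs):
        doc_spaces = list(doc.spaces)
        is_last = i == len(docs) - 1
        if ensure_whitespace and not is_last and doc.words:
            # Leave the flag untouched if the doc already ends in a
            # whitespace token; otherwise force a trailing space.
            if not doc.words[-1].isspace():
                doc_spaces[-1] = True
        words.extend(doc.words)
        spaces.extend(doc_spaces)
    return SimpleDoc(words, spaces)


a = SimpleDoc(["Hello", "world", "."], [True, False, False])
b = SimpleDoc(["Second", "doc"], [True, False])
merged = merge_docs([a, b])
print(merged.text)       # Hello world. Second doc
print(merged.offsets())  # [0, 6, 11, 13, 20]
```

Because offsets are derived from the concatenated `words` and `spaces` rather than copied from the inputs, they stay consistent with the merged text whether or not a separating space was added.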
Matthias Hertel
2fb9bd795d
Fixed vocabulary in the entity linker training example (#5676)
...
* entity linker training example: model loading changed according to issue 5668 (https://github.com/explosion/spaCy/issues/5668) + vocab_path is a required argument
* contributor agreement
2020-07-03 10:24:02 +02:00
Sofie Van Landeghem
41b65fd0f8
fix to pretrain script (#5699)
...
* fix to pretrain script
* remove unnecessary import
2020-07-02 21:48:01 +02:00
Adriane Boyd
a723fa02a1
DocBin: add version number, missing attributes and strings (#5685)
...
* Add version number to DocBin
Add a version number to DocBin for future use.
* Add POS to all attributes in DocBin
* Add morph string to strings in DocBin
* Update DocBin API
* Add string for ENT_KB_ID in DocBin
2020-07-02 17:41:50 +02:00
Adriane Boyd
a77c4c3465
Add strings and ENT_KB_ID to Doc serialization (#5691)
...
* Add strings for all writeable Token attributes to `Doc.to/from_bytes()`.
* Add ENT_KB_ID to default attributes.
2020-07-02 17:11:57 +02:00
Adriane Boyd
971826a96d
Include git commit in package and model meta (#5694)
...
* Include git commit in package and model meta
* Rewrite to read file in setup
* Fix file handle
2020-07-02 17:10:27 +02:00
Ines Montani
b5268955d7
Update matcher usage examples [ci skip]
2020-07-02 15:39:45 +02:00
Ines Montani
d36632553a
Merge pull request #5688 from explosion/remove-deprecated
...
Remove deprecated methods: Doc.print_tree, Doc.merge, Span.merge
2020-07-02 15:10:30 +02:00
Ines Montani
8a5b9a6d5f
Merge pull request #5693 from svlandeg/bugfix/nel-v3
2020-07-02 14:45:46 +02:00
Ines Montani
ee8a830248
Merge pull request #5687 from svlandeg/bugfix/init-model
...
Fixing init_model
2020-07-02 14:10:28 +02:00
svlandeg
04ed4d60a8
raise error when links are not aligned to tokens
2020-07-02 13:57:35 +02:00
svlandeg
f503817623
fix parsing entity links in new gold format
2020-07-02 13:48:11 +02:00
Adriane Boyd
2bd78c39e3
Fix multiple context managers in examples (#5690)
2020-07-02 10:36:07 +02:00
Ines Montani
aa62cdee50
Merge branch 'develop' into nightly.spacy.io
2020-07-01 22:38:23 +02:00
Ines Montani
60c2695131
Remove deprecated methods
2020-07-01 22:33:39 +02:00
Ines Montani
a4cfe9fc33
Remove inline notes on v2 changes [ci skip]
2020-07-01 22:29:22 +02:00
Ines Montani
79540e1eea
Remove bin/spacy from MANIFEST
2020-07-01 22:15:18 +02:00
Ines Montani
97342f3f99
Merge pull request #5686 from tiangolo/refactor/cli-completion
2020-07-01 22:14:48 +02:00
Ines Montani
295279f74b
Update netlify.toml [ci skip]
2020-07-01 22:06:43 +02:00
Sebastián Ramírez
b0f425971e
➖ Remove shellingham from dependencies
2020-07-01 21:47:50 +02:00
Ines Montani
6bc643d2e2
Update netlify.toml [ci skip]
2020-07-01 21:34:17 +02:00
Ines Montani
3dff412f58
Merge branch 'nightly.spacy.io' into develop [ci skip]
2020-07-01 21:33:47 +02:00
Ines Montani
2f07144f80
Update netlify.toml [ci skip]
2020-07-01 21:33:20 +02:00
Ines Montani
58a289b309
Update branch name
2020-07-01 21:28:51 +02:00
Ines Montani
fe4cfd0632
Start updating website for v3 [ci skip]
2020-07-01 21:26:39 +02:00
svlandeg
a30bc77415
bugfixing prune_vectors and vectors_loc
2020-07-01 21:00:47 +02:00
Sebastián Ramírez
b985cc4025
📄 Add spaCy Contributor Agreement
2020-07-01 20:57:21 +02:00
Sebastián Ramírez
764499246e
🔧 Update spacy CLI script entrypoint to support completion
2020-07-01 20:21:05 +02:00
Sebastián Ramírez
b02db67247
➕ Add shellingham for automatic shell detection
...
and update Typer pinning
2020-07-01 20:20:04 +02:00
Matthw Honnibal
94a0cf46fd
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-01 18:45:45 +02:00
Matthw Honnibal
6a0a27e5c2
Fix max_steps
2020-07-01 18:08:14 +02:00
Ines Montani
a4650761a8
Fix package name
...
We specify it twice because GitHub wouldn't recognise the spaCy repo as a package (e.g. for its "used by" stats) if it didn't specify the name inline
2020-07-01 16:47:26 +02:00
Ines Montani
49105034cb
Auto-format
2020-07-01 16:46:56 +02:00
Ines Montani
8d90e44d74
Fix title
2020-07-01 15:38:01 +02:00
Ines Montani
85e816738f
Merge branch 'develop' into spacy.io-develop
2020-07-01 15:37:03 +02:00
Ines Montani
8fb574900a
Update parent package and version
2020-07-01 15:35:23 +02:00
Ines Montani
4f42bcdd13
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-01 15:33:57 +02:00
Ines Montani
38f226bda8
Update images [ci skip]
2020-07-01 15:33:54 +02:00
Matthew Honnibal
0ada186dda
Set version to v3.0.0.dev14
2020-07-01 15:31:04 +02:00
Matthw Honnibal
cb51bb637b
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-01 15:17:27 +02:00
Matthw Honnibal
7734cbc34d
Set batch size in begin_training
2020-07-01 15:16:59 +02:00
Matthw Honnibal
1f7709e9a6
Improve max length check in corpus
2020-07-01 15:16:43 +02:00
Matthw Honnibal
2fa56484b2
Fix eval batch size
2020-07-01 15:16:25 +02:00
Matthw Honnibal
c5d12d1a22
Allow batch size to be set for evaluation in spacy train
2020-07-01 15:04:36 +02:00
Ines Montani
6e28760316
Fix 404 [ci skip]
2020-07-01 15:02:55 +02:00
Matthw Honnibal
f5532757a3
Filter out 0-length examples in Corpus
2020-07-01 15:02:37 +02:00
Ines Montani
7037512e55
Handle robots.txt for nightly/special deploys [ci skip]
2020-07-01 14:50:58 +02:00
Ines Montani
1220fd3e6c
Handle robots.txt for nightly/special deploys [ci skip]
2020-07-01 14:50:38 +02:00
Ines Montani
997f6eeca7
Adjust nightly site url [ci skip]
2020-07-01 14:42:59 +02:00
Ines Montani
e1eb48e932
Add nightly social image [ci skip]
2020-07-01 14:41:13 +02:00