Ines Montani
e5c319a051
Merge branch 'master' into spacy.io
2019-11-05 18:30:46 +01:00
Ines Montani
828ef27a32
Add warnings about 3.8 ( resolves #4593 ) [ci skip]
2019-11-05 18:30:11 +01:00
Ines Montani
fed53b1552
Update README.md
2019-11-05 18:26:47 +01:00
Ines Montani
83381018d3
Add load_from_docbin example [ci skip]
TODO: upload the file somewhere
2019-11-05 11:52:43 +01:00
Sofie Van Landeghem
4ec7623288
Fix conllu script ( #4579 )
* force extensions to avoid clash between example scripts
* fix arg order and default file encoding
* add example config for conllu script
* newline
* move extension definitions to main function
* few more encodings fixes
2019-11-04 20:31:26 +01:00
Matthew Honnibal
4e43c0ba93
Fix multiprocessing for as_tuples=True ( #4582 )
2019-11-04 20:29:03 +01:00
Ines Montani
d7a94edba6
Merge branch 'master' into spacy.io
2019-11-04 13:56:11 +01:00
Ines Montani
4b95587ad4
Update universe.json [ci skip]
2019-11-04 13:55:55 +01:00
Yash Patadia
0c396aeed4
add dframcy to universe.json ( #4580 )
2019-11-04 13:53:23 +01:00
Ines Montani
3ec231f7e1
Reorganise install_requires
2019-11-04 02:39:28 +01:00
Ines Montani
cf4ec88b38
Use latest wasabi
2019-11-04 02:38:45 +01:00
Ines Montani
d82630d7c1
Revert "Update azure-pipelines.yml"
This reverts commit ed1060cf59.
2019-11-03 17:48:54 +01:00
Ines Montani
ed1060cf59
Update azure-pipelines.yml
2019-11-03 17:48:26 +01:00
Ines Montani
6ec119d976
Add error in debug-data if no dev docs are available (see #4575 )
2019-11-02 16:08:11 +01:00
adrianeboyd
56ad3a3988
Add LAS per dependency to Scorer ( #4560 )
2019-10-31 21:18:16 +01:00
Ines Montani
07ba9b4aa2
Merge branch 'master' into spacy.io
2019-10-31 17:30:42 +01:00
Matthew Honnibal
de98d66f87
Set version to v2.2.2
2019-10-31 15:53:31 +01:00
Matthw Honnibal
55f2241d72
Merge branch 'master' of https://github.com/explosion/spaCy
2019-10-31 15:37:52 +01:00
Ines Montani
df4c9ae3dc
Fix formatting [ci skip]
2019-10-31 15:10:25 +01:00
Ines Montani
59358d9b71
Remove box-decoration-break from entities in displacy ( #4564 )
2019-10-31 15:09:43 +01:00
Matthw Honnibal
8b9954d1b7
Set version to v2.2.2.dev5
2019-10-31 15:06:19 +01:00
Ines Montani
2c107f02a4
Auto-format [ci skip]
2019-10-31 15:01:56 +01:00
Matthew Honnibal
e82306937e
Put Tok2Vec refactor behind feature flag ( #4563 )
* Add back pre-2.2.2 tok2vec
* Add simple tok2vec tests
* Add simple tok2vec tests
* Reformat
* Fix CharacterEmbed in new tok2vec
* Fix legacy tok2vec
* Resolve circular imports
* Fix test for Python 2
2019-10-31 15:01:15 +01:00
Ines Montani
828108a57f
Update README.md [ci skip]
2019-10-31 13:23:25 +01:00
Ines Montani
5e9849b60f
Auto-format [ci skip]
2019-10-30 19:27:18 +01:00
Ines Montani
afe4a428f7
Fix pipeline analysis on remove pipe ( #4557 )
Validate *after* component is removed, not before
2019-10-30 19:04:17 +01:00
Matthew Honnibal
6b874ef096
Set version to v2.2.2.dev4
2019-10-30 17:36:20 +01:00
Ines Montani
85f2b04c45
Support span._. in component decorator attrs ( #4555 )
* Support span._. in component decorator attrs
* Adjust error [ci skip]
2019-10-30 17:19:36 +01:00
Ines Montani
86c3185f34
Update syntax iterators [ci skip]
2019-10-30 14:32:50 +01:00
Ines Montani
4e1de85e43
Update syntax iterators [ci skip]
2019-10-30 14:31:40 +01:00
Ines Montani
d8c2365b04
Update universe.json [ci skip]
2019-10-30 13:29:15 +01:00
Ines Montani
726c5dd306
Update universe.json [ci skip]
2019-10-30 13:29:00 +01:00
Neel Kamath
4cbc172cc6
Add "spaCy Server" to spaCy Universe ( #4553 )
* Add "spaCy Server" to spaCy Universe
* Accept the spaCy Contributor Agreement
2019-10-30 13:21:25 +01:00
Neel Kamath
6c036ab57d
Add "spaCy Server" to spaCy Universe ( #4553 )
* Add "spaCy Server" to spaCy Universe
* Accept the spaCy Contributor Agreement
2019-10-30 13:20:46 +01:00
Nipun Sadvilkar
6316243941
✨ project: pySBD - Python Sentence Boundary Disambiguation ( #4455 )
* ✨ project: pySBD - Python Sentence Boundary Disambiguation
* 📝 Update links and description
* 🐛 Fix missing comma
* Update universe.json
  pysbd as a spacy component through entrypoints
* 🚨 Fix universe.json
* 📝 Update code_example
2019-10-30 12:14:49 +01:00
Nipun Sadvilkar
2a5e71232b
✨ project: pySBD - Python Sentence Boundary Disambiguation ( #4455 )
* ✨ project: pySBD - Python Sentence Boundary Disambiguation
* 📝 Update links and description
* 🐛 Fix missing comma
* Update universe.json
  pysbd as a spacy component through entrypoints
* 🚨 Fix universe.json
* 📝 Update code_example
2019-10-30 12:13:29 +01:00
Matthew Honnibal
c2f5f9f572
Set version to v2.2.2.dev3
2019-10-29 16:37:58 +01:00
Sofie Van Landeghem
33ba9ff464
set encodings explicitly to utf8 ( #4551 )
2019-10-29 13:16:55 +01:00
Matthew Honnibal
9e210fa7fd
Fix tok2vec structure after model registry refactor ( #4549 )
The model registry refactor of the Tok2Vec function broke loading models
trained with the previous function, because the model tree was slightly
different. Specifically, the new function built the embedding layer with:
    concatenate(norm, prefix, suffix, shape)
In the previous implementation, I had used the operator overloading
shortcut:
    ( norm | prefix | suffix | shape )
This actually gets mapped to a binary association, giving something
like:
    concatenate(norm, concatenate(prefix, concatenate(suffix, shape)))
This is a different tree, so the layers iterate differently and we
loaded the weights wrongly.
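
The difference between the two trees can be illustrated with a small toy. This is only a sketch: `Layer`, `walk` and this `concatenate` are hypothetical stand-ins written for the example, not Thinc's actual classes or functions. It shows that a variadic call yields one flat node, while chaining a binary operator yields nested nodes, so a traversal visits a different sequence of layers. In this toy, Python evaluates `|` left to right, so the nesting leans left rather than matching the exact shape quoted above, but the mismatch with the flat tree is the same kind of problem.

```python
class Layer:
    """Toy stand-in for a model layer (hypothetical, not Thinc's Model)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def __or__(self, other):
        # A binary operator can only ever combine two layers at a time.
        return concatenate(self, other)

    def walk(self):
        # Pre-order traversal: one plausible order for visiting layers
        # when assigning saved weights to them.
        yield self.name
        for child in self.children:
            yield from child.walk()


def concatenate(*layers):
    # Variadic version: any number of layers under a single node.
    return Layer("concatenate", layers)


def make_features():
    return [Layer(name) for name in ("norm", "prefix", "suffix", "shape")]


# Flat tree: a single concatenate node with four children.
flat = concatenate(*make_features())

# Nested tree: each "|" adds another two-child concatenate node.
norm, prefix, suffix, shape = make_features()
nested = norm | prefix | suffix | shape

flat_order = list(flat.walk())
nested_order = list(nested.walk())
print(flat_order)
print(nested_order)
```

The flat tree has one `concatenate` node and the nested one has three, so any loader that matches weights to layers by traversal position would line them up differently for the two structures.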
2019-10-28 23:59:03 +01:00
Matthew Honnibal
bade60fe64
Set version to v2.2.2.dev1
2019-10-28 19:09:34 +01:00
Matthew Honnibal
b1505380ff
Fix training with vectors
2019-10-28 18:06:38 +01:00
Matthew Honnibal
a927b3a21e
Put new alignment behind flag for v2.2.2 release ( #4541 )
* Xfail new tokenization test
* Put new alignment behind feature flag
* Move USE_ALIGN to top of the file [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2019-10-28 16:12:32 +01:00
Ines Montani
a90025b277
Fix serialization of extension attr values in DocBin ( #4540 )
2019-10-28 16:02:13 +01:00
tamuhey
df293f3894
modified gold.align to handle space tokens ( #4537 )
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2019-10-28 15:44:28 +01:00
adrianeboyd
f2bfaa1b38
Filter subtoken matches in merge_subtokens() ( #4539 )
The `Matcher` in `merge_subtokens()` returns all possible subsequences
of `subtok`, so for sequences of two or more subtoks it's necessary to
filter the matches so that the retokenizer is only merging the longest
matches with no overlapping spans.
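
A minimal sketch of this kind of filtering, written in plain Python over `(start, end)` token offsets. The function name `filter_longest` and the sample offsets are assumptions made for illustration; this is not the code from the commit, and spaCy's own `spacy.util.filter_spans` helper may differ in detail.

```python
def filter_longest(spans):
    """Keep the longest spans, dropping any that overlap one already kept.

    spans: iterable of (start, end) token offsets, with end exclusive.
    """
    # Longest spans first; ties are broken in favour of the earlier start.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept = []
    seen_tokens = set()
    for start, end in ordered:
        # Skip the span if any of its tokens is already covered.
        if not any(i in seen_tokens for i in range(start, end)):
            kept.append((start, end))
            seen_tokens.update(range(start, end))
    return sorted(kept)


# Hypothetical matches for three consecutive subtoks at positions 0-2:
# every contiguous subsequence is reported.
matches = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
result = filter_longest(matches)
print(result)
```

Only the filtered spans would then be passed to the retokenizer, so each run of subtokens is merged once rather than once per overlapping match.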
2019-10-28 15:40:28 +01:00
Matthew Honnibal
d5509e0989
Support Mish activation (requires Thinc 7.3) ( #4536 )
* Add arch for MishWindowEncoder
* Support mish in tok2vec and conv window >=2
* Pass new tok2vec settings from parser
* Syntax error
* Fix tok2vec setting
* Fix registration of MishWindowEncoder
* Fix receptive field setting
* Fix mish arch
* Pass more options from parser
* Support more tok2vec options in pretrain
* Require thinc 7.3
* Add docs [ci skip]
* Require thinc 7.3.0.dev0 to run CI
* Run black
* Fix typo
* Update Thinc version
Co-authored-by: Ines Montani <ines@ines.io>
2019-10-28 15:16:33 +01:00
Ines Montani
96bb8f2187
Add regression test for #4528 [ci skip]
2019-10-28 14:36:03 +01:00
Matthew Honnibal
02e8adf2c2
Add the spacy_lookups_data to pex file
2019-10-28 14:03:35 +01:00
Ines Montani
c5e41247e8
Tidy up and auto-format
2019-10-28 12:43:55 +01:00
Ines Montani
92018b9cd4
Tidy up and auto-format
2019-10-28 12:36:23 +01:00