Ines Montani
ee2ec52f48
Merge pull request #6409 from svlandeg/feature/trf-docs
2020-12-08 06:32:10 +01:00
Ines Montani
c2b196c2c1
Merge pull request #6419 from svlandeg/feature/rel-docs
2020-12-08 06:30:41 +01:00
Ines Montani
82e88f0e3b
Merge pull request #6379 from svlandeg/fix/labels-constructor
2020-12-08 06:29:56 +01:00
Sofie Van Landeghem
52fa46dd58
tested EL scripts with 2.3.4 ( #6517 )
2020-12-07 20:46:38 +01:00
Adriane Boyd
d70950605c
Warn on empty POS for the rule-based lemmatizer
...
Add a warning to the rule-based lemmatizer for any tokens without POS
annotation.
2020-12-04 11:46:15 +01:00
Adriane Boyd
78085fab1f
Check for spacy-nightly package in download ( #6502 )
...
Also check for spacy-nightly in download so that `--no-deps` isn't set
for normal nightly installs.
2020-12-04 09:40:03 +01:00
Ines Montani
63f83e7034
Merge pull request #6470 from adrianeboyd/feature/license-in-package
2020-12-04 03:55:54 +01:00
Sofie Van Landeghem
d6c616a125
Fixes in test suite ( #6457 )
...
* fix slow test for textcat readers
* cleanup test_issue5551
* add explicit score weight
* cleanup
2020-12-02 12:57:08 +01:00
Adriane Boyd
31ec9a906e
Clean up 3rd party license info ( #6478 )
...
Move scikit-learn license from `Scorer` to
`licenses/3rd_party_licenses.txt`.
2020-12-02 10:15:23 +01:00
Adriane Boyd
591cd48aa8
Remove config.cfg from MANIFEST
2020-12-01 12:58:02 +01:00
Adriane Boyd
b0dd13e0ba
Support LICENSE in spacy package
...
If present, include the file `input_dir/LICENSE` at the top level of the
packaged model.
2020-11-30 13:43:58 +01:00
Adriane Boyd
1442d2f213
Improve simple training example in v3 migration ( #6438 )
...
* Create the examples once
* Use the examples in the initialization
* Provide the batch size
* Fix `begin_training` migration example
2020-11-30 09:39:45 +08:00
Adriane Boyd
53c0fb7431
Only set NORM on Token in retokenizer ( #6464 )
...
* Only set NORM on Token in retokenizer
Instead of setting `NORM` on both the token and lexeme, set `NORM` only
on the token.
The retokenizer tries to set all possible attributes with
`Token/Lexeme.set_struct_attr` so that it doesn't have to enumerate
which attributes are available for each. `NORM` is the only attribute
that's stored on both and for most cases it doesn't make sense to set
the global norms based on an individual retokenization. For lexeme-only
attributes like `IS_STOP` there's no way to avoid the global side
effects, but I think that `NORM` would be better only on the token.
* Fix test
2020-11-30 09:35:42 +08:00
Adriane Boyd
03ae77e603
Add SPACY as a Matcher attribute ( #6463 )
2020-11-30 09:34:50 +08:00
Sofie Van Landeghem
079f6ea474
avoid resolving the full config ( #6465 )
2020-11-30 09:34:29 +08:00
Ines Montani
9beba7164f
Make jinja2 top-level import
...
No problem anymore since it's now an official dependency
2020-11-27 15:17:14 +08:00
Ines Montani
d21d2c2e59
Don't multiply accuracy by 100
2020-11-27 15:15:51 +08:00
Adriane Boyd
26296ab223
Add error message if DocBin zlib decompress fails ( #6394 )
...
Add a better error message if DocBin zlib decompress fails, indicating
that the data is not in `DocBin` format.
2020-11-27 14:39:49 +08:00
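Note on 26296ab223: the pattern described there (catching a failed zlib decompress and re-raising with a message about the expected format) can be sketched roughly as follows. This is only an illustration using the standard library; the helper name `decompress_docbin_bytes` and the error wording are assumptions and are not taken from the spaCy source.

```python
import zlib


def decompress_docbin_bytes(data: bytes) -> bytes:
    """Hypothetical helper: decompress serialized bytes with a clearer error."""
    try:
        return zlib.decompress(data)
    except zlib.error as err:
        # Re-raise with a message that points at the likely cause
        # instead of surfacing the low-level zlib error on its own.
        raise ValueError(
            "Could not decompress the data; it does not appear to be "
            "in DocBin format."
        ) from err


# Round trip with validly compressed bytes.
payload = zlib.compress(b"serialized docs")
restored = decompress_docbin_bytes(payload)
```

Chaining with `from err` keeps the original zlib error available in the traceback while the user sees the format hint first.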
Adriane Boyd
3a5cc5f8b4
Set version to v2.3.4
2020-11-26 08:48:52 +01:00
Adriane Boyd
e0f5646a4a
Restore cleanup_beam method ( #6446 )
2020-11-25 13:21:48 +01:00
Adriane Boyd
40c583a41b
Remove --prefer-binary and --only-binary from CI
2020-11-25 12:24:11 +01:00
Adriane Boyd
cf693f0eae
Fix token_match in tokenizer
2020-11-25 11:49:34 +01:00
Adriane Boyd
724831b066
Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master
...
* Update Macedonian for v3
* Update Turkish for v3
2020-11-25 11:49:34 +01:00
Jacob Bortell
fe9009911a
Update rule-based-matching.md ( #6421 )
...
* Update rule-based-matching.md
Clarified case sensitivity of dictionary-referencing attributes (POS/TAG/DEP/etc).
Clarified "Type" column header to "Value Type"
* Update rule-based-matching.md
Improved clarity of wording
2020-11-24 16:20:19 +01:00
Jacob Bortell
992723dfac
Add jabortell to the contributors ( #6422 )
...
* Add jabortell to the contributors
* Update jabortell.md
Added tick to applicable statement
2020-11-24 16:15:31 +01:00
Adriane Boyd
6f133877aa
Update source install instructions
...
* Don't recommend an editable install in the default source
instructions.
* Use `pip install --no-build-isolation` for editable installs.
* Remove reference to `virtualenv`.
2020-11-24 14:44:13 +01:00
Adriane Boyd
afd744bc05
Update Travis CI pip install steps ( #6440 )
2020-11-24 14:10:16 +01:00
Adriane Boyd
573f5c863f
Fix tag map clobbering in spacy train ( #6437 )
...
Fix bug from #5768 where the tag map is clobbered if a custom tag map
isn't provided.
2020-11-24 13:13:16 +01:00
Adriane Boyd
ce18fc6588
Set version to v2.3.3
2020-11-24 10:03:45 +01:00
Adriane Boyd
cd61d264ef
Set version to v2.3.3.dev0
2020-11-23 13:51:59 +01:00
Sofie Van Landeghem
2af31a8c8d
Bugfix textcat reproducibility on GPU ( #6411 )
...
* add seed argument to ParametricAttention layer
* bump thinc to 7.4.3
* set thinc version range
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-11-23 12:29:35 +01:00
Adriane Boyd
cdca44ac11
Dynamically include numpy headers ( #6418 )
...
* Dynamically include numpy headers
* Add `build-constraints.txt` with numpy version pins for building wheels with `pip` and `wheelwright`
* Update `setup.py` to add current numpy include directory
* Assume `cython` and `numpy` are installed for `setup.py`
* Remove included numpy headers
* Fix typo in requirements.txt
* Use script in CI
2020-11-23 11:15:11 +01:00
Adriane Boyd
3f61f5eb54
Use int8_t instead of char in Matcher ( #6413 )
...
* Use signed char instead of char in Matcher
Remove unused char* utf8_t typedef
* Use int8_t instead of signed char
2020-11-23 10:26:47 +01:00
Adriane Boyd
4284605683
Remove Beam cleanup ( #6414 )
...
Beam cleanup is handled through the Beam finalization method.
2020-11-23 10:01:46 +01:00
Adriane Boyd
a8c2dad466
Add all vectors to vocab before pruning ( #6408 )
...
Add all vectors to the vocab before pruning to correct the selection of
vectors to prioritize.
2020-11-23 10:00:59 +01:00
Adriane Boyd
13f0676f04
Updates for python 3.9 ( #6338 )
...
* Update blis and thinc version ranges
* Update thinc version range
* Update setup.cfg for python 3.9
* Adjust blis and thinc ranges
* Add python 3.9 classifier
* Update CI for python 3.9
* Add --prefer-binary to CI sdist install
* Update CI python 3.7 mac image
* Add --prefer-binary to Travis CI
* Update install instructions in README
* Specify blis versions separately for < / >= 3.6
* Update --prefer-binary in README
* Test cleaner sdist install
* Also upgrade pip
(This is kind of unnecessary given --prefer-binary but may avoid other
issues related to sdist installs in the future.)
* Compile with -j 2
* Remove wheel from setup_requires
* Update to have separate CI uninstall step
* Remove wheel from pyproject.toml
* Recommend upgrading setuptools in addition to pip
2020-11-23 09:45:18 +01:00
Yusuke Mori
e3ac90b035
Avoid a SyntaxError in self-attentive-parser ( #6428 )
...
* Avoid a SyntaxError in self-attentive-parser
Fix a usage of quotation marks in the example of spaCy Universe self-attentive-parser
* Create forest1988.md
Fill in the spaCy contributor agreement
2020-11-22 21:59:37 +01:00
svlandeg
218abaa69a
typo
2020-11-20 22:36:49 +01:00
svlandeg
e861e928df
more small corrections
2020-11-20 22:29:58 +01:00
svlandeg
5ac0867427
final fixes
2020-11-20 22:18:53 +01:00
svlandeg
331ec83493
edits and updates to implementing REL component docs
2020-11-20 21:41:52 +01:00
svlandeg
4a3e611abc
small fixes and formatting
2020-11-20 15:55:05 +01:00
svlandeg
124f49feb6
update REL model code
2020-11-20 15:25:20 +01:00
svlandeg
636be3c791
Merge remote-tracking branch 'upstream/develop' into feature/trf-docs
2020-11-19 14:15:35 +01:00
Sofie Van Landeghem
165993d8e5
fix typo in transformer docs ( #6404 )
2020-11-19 14:11:38 +01:00
M. Revuelta Espinosa
51232ffb9e
Update universe.json (include PatternOmatic) ( #6399 )
...
Request to include PatternOmatic in spaCy Universe
Adds @revuel to contributors
2020-11-19 13:15:50 +01:00
Adriane Boyd
3cf6479467
Fix JSON in #6395
2020-11-17 15:25:41 +01:00
Sam Edwardes
78913a4f95
Added spaCyTextBlob to universe.json ( #6395 )
2020-11-17 14:38:34 +01:00
Adriane Boyd
96726ec1f6
Fix DocBin init in training example ( #6396 )
2020-11-17 14:36:44 +01:00
Adriane Boyd
6f014efb97
Install dev requirements before running tests
2020-11-16 10:59:50 +01:00