Matthew Honnibal
2a8137aba9
Merge pull request #5518 from svlandeg/fix/pretrain-docs
...
Pretrain fixes
2020-05-29 19:20:20 +02:00
svlandeg
291483157d
prevent loading a pretrained Tok2Vec layer AND pretrained components
2020-05-29 17:38:33 +02:00
Adriane Boyd
e1b7cbd197
Remove MorphAnalysis __str__ and __repr__
2020-05-29 14:33:47 +02:00
svlandeg
04ba37b667
fix description
2020-05-29 13:52:39 +02:00
svlandeg
5f0a91cf37
fix conv-depth parameter
2020-05-29 09:56:29 +02:00
Ines Montani
4fd087572a
WIP: improve model version deps
2020-05-28 12:51:37 +02:00
Matthw Honnibal
58750b06f8
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-05-27 22:18:36 +02:00
Matthew Honnibal
aecd1437cc
Merge pull request #5508 from adrianeboyd/bugfix/tag-map-sp-tag
...
Prefer _SP over SP for default tag map space attrs
2020-05-27 20:39:40 +02:00
Matthew Honnibal
e7ac12b598
Merge pull request #5514 from adrianeboyd/bugfix/load-vector-name
...
Improve vector name loading from model meta
2020-05-27 20:39:23 +02:00
Adriane Boyd
25de2a2191
Improve vector name loading from model meta
2020-05-27 14:48:54 +02:00
adrianeboyd
aad0610a85
Map NR to PROPN (#5512)
2020-05-26 22:30:53 +02:00
Sofie Van Landeghem
f00488ab30
Update train_intent_parser.py
2020-05-26 16:41:39 +02:00
Adriane Boyd
b6b5908f5e
Prefer _SP over SP for default tag map space attrs
...
If `_SP` is already in the tag map, use the mapping from `_SP` instead
of `SP` so that `SP` can be a valid non-space tag. (Chinese has a
non-space tag `SP` which was overriding the mapping of `_SP` to
`SPACE`.)
2020-05-26 14:57:13 +02:00
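The preference described in the commit above can be sketched in plain Python. This is a minimal illustration, not spaCy's actual implementation: `resolve_space_attrs` is a hypothetical helper, the tag maps are plain dicts with string keys, and the `PART` value for the non-space `SP` tag is only an assumed example.

```python
def resolve_space_attrs(tag_map):
    """Pick the attrs used for whitespace tokens, preferring `_SP` over `SP`.

    Hypothetical helper for illustration; returns a copy of the tag map
    that always contains `_SP`, plus the attrs chosen for whitespace.
    """
    if "_SP" in tag_map:
        # `_SP` is already defined, so `SP` is free to be a regular tag.
        space_attrs = tag_map["_SP"]
    else:
        # Fall back to `SP`, then to a default space mapping.
        space_attrs = tag_map.get("SP", {"POS": "SPACE"})
    resolved = dict(tag_map)
    resolved.setdefault("_SP", space_attrs)
    return resolved, space_attrs


# A tag map where `SP` is a real, non-space tag (values are assumed).
with_real_sp = {"SP": {"POS": "PART"}, "_SP": {"POS": "SPACE"}}
# An older-style tag map that only defines `SP` as the space tag.
legacy = {"SP": {"POS": "SPACE"}}

resolved_a, space_a = resolve_space_attrs(with_real_sp)
resolved_b, space_b = resolve_space_attrs(legacy)
```

In the first case `SP` keeps its own mapping and whitespace uses the `_SP` entry; in the second, `_SP` is filled in from `SP`.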
Matthew Honnibal
b0c0271a48
Merge pull request #5506 from adrianeboyd/bugfix/pl-lemmatizer-lookup-loading
...
Fix Polish lemmatizer for deserialized models
2020-05-26 12:31:25 +02:00
Matthew Honnibal
a44d51a3d8
Merge pull request #5496 from explosion/docs/unicode-str
...
unicode -> str consistency
2020-05-26 10:30:37 +02:00
Adriane Boyd
1eed101be9
Fix Polish lemmatizer for deserialized models
...
Restructure the Polish lemmatizer so that it does not depend on lookups
data in `__init__`, since the lemmatizer is initialized before the lookups
data is loaded from a saved model. The lookups tables are instead first
accessed in `__call__`, once the data is available.
2020-05-26 09:56:12 +02:00
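The deferred-access pattern from the commit above might look roughly like this. It is a sketch under stated assumptions, not the real Polish lemmatizer: `LazyLookupLemmatizer` is a hypothetical class, the lookups are a plain dict, and the table naming scheme (`lemma_lookup_<pos>`) is assumed for illustration.

```python
class LazyLookupLemmatizer:
    """Hypothetical lemmatizer that reads its tables only when called."""

    def __init__(self, lookups=None):
        # Keep a reference only. No table is read here, because the
        # lookups may still be empty when the object is constructed.
        self.lookups = lookups if lookups is not None else {}

    def __call__(self, string, univ_pos):
        # Tables are resolved at call time, after deserialization has
        # had a chance to fill the lookups. Table name is an assumption.
        table = self.lookups.get("lemma_lookup_" + univ_pos.lower(), {})
        return [table.get(string, string)]


lookups = {}                                  # empty at construction time
lemmatizer = LazyLookupLemmatizer(lookups)
lookups["lemma_lookup_noun"] = {"koty": "kot"}  # data arrives later
lemmas = lemmatizer("koty", "NOUN")
```

Because nothing is cached in `__init__`, tables added after construction are still picked up on the first call.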
adrianeboyd
69897b45d8
Handle spacy.pex renaming in Makefile (#5503)
2020-05-25 16:39:22 +02:00
adrianeboyd
c9c7b135c0
Update Makefile for v2.3.0 (#5502)
2020-05-25 15:24:24 +02:00
Ines Montani
24ef6680fa
Merge pull request #5499 from adrianeboyd/chore/bump-version-deps-v2.3.0
2020-05-25 13:25:45 +02:00
Ines Montani
ade4767e06
Merge pull request #5498 from adrianeboyd/bugfix/phrasematcher-unpickle-new-api
2020-05-25 13:25:07 +02:00
Adriane Boyd
3f727bc539
Switch to v2.3.0.dev0
2020-05-25 12:57:20 +02:00
Adriane Boyd
736f3cb5af
Bump version and deps for v2.3.0
...
* spacy to v2.3.0
* thinc to v7.4.1
* spacy-lookups-data to v0.3.2
2020-05-25 12:03:49 +02:00
Rajat
8b8efa1b42
update spacy universe with my project (#5497)
...
* added contextualSpellCheck in spacy universe meta
* removed extra formatting by code
* updated with permanent links
* run json linter used by spacy
* filled SCA
* updated the description
2020-05-25 11:30:23 +02:00
Adriane Boyd
e06ca7ea24
Switch to new add API in PhraseMatcher unpickle
2020-05-25 11:22:47 +02:00
Ines Montani
1a15896ba9
unicode -> str consistency [ci skip]
2020-05-24 18:51:10 +02:00
Ines Montani
262d306eaa
unicode -> str consistency
2020-05-24 17:23:00 +02:00
Ines Montani
5d3806e059
unicode -> str consistency
2020-05-24 17:20:58 +02:00
Ines Montani
cf156ed2f4
Merge pull request #5495 from explosion/fix/simplify-is-package
2020-05-24 15:42:55 +02:00
Ines Montani
387c7aba15
Update test
2020-05-24 14:55:16 +02:00
Ines Montani
f9786d765e
Simplify is_package check
2020-05-24 14:48:56 +02:00
Sofie Van Landeghem
ae1c179f3a
Remove the nested quote
2020-05-23 17:58:19 +02:00
Ines Montani
15d3a0ac3a
Merge pull request #5491 from explosion/chore/rename-pipe-analysis
2020-05-23 12:41:54 +02:00
Matthw Honnibal
2d9de8684d
Support use_pytorch_for_gpu_memory config
2020-05-22 23:10:40 +02:00
Jannis
aa53ce6996
Documentation Typo Fix (#5492)
...
* Fix typo
Change 'realize' to 'realise'
* Add contributor agreement
2020-05-22 19:50:26 +02:00
Ines Montani
4465cad6c5
Rename spacy.analysis to spacy.pipe_analysis
2020-05-22 17:42:06 +02:00
Ines Montani
25d6ed3fb8
Merge pull request #5489 from explosion/feature/connected-components
2020-05-22 17:40:11 +02:00
Ines Montani
841c05b47b
Merge pull request #5490 from explosion/fix/remove-jsonschema
2020-05-22 17:39:54 +02:00
Ines Montani
569a65b60e
Auto-format
2020-05-22 16:55:42 +02:00
Ines Montani
d844528c5f
Add test for is_compatible_model
2020-05-22 16:55:15 +02:00
Ines Montani
12b7be1d98
Remove jsonschema from dependencies
2020-05-22 16:49:26 +02:00
Matthew Honnibal
7a73a9dcf6
Merge pull request #5488 from explosion/feature/better-model-compat
...
Better model compatibility and validation
2020-05-22 16:44:29 +02:00
Matthew Honnibal
f7f6df7275
Move to spacy.analysis
2020-05-22 16:43:18 +02:00
Matthew Honnibal
78d79d94ce
Guess set_annotations=True in nlp.update
...
During `nlp.update`, components can be passed a boolean `set_annotations`
to indicate whether they should assign annotations to the `Doc`. This
flag needs to be set if downstream components expect to use the
annotations during training, e.g. if we wanted to use tagger features in
the parser.
Components can specify their assignments and requirements, so we can
figure out which components have these inter-dependencies. After
figuring this out, we can guess whether to pass set_annotations=True.
We could also always pass set_annotations=True, or even just have this
as the only behaviour. The downside of this is that it would require the
`Doc` objects to be created afresh to avoid problematic modifications.
One approach would be to make a fresh copy of the `Doc` objects within
`nlp.update()`, so that we can write to the objects without any
problems. If we do that, we can drop this logic and also drop the
`set_annotations` mechanism. I would be fine with that approach,
although it runs the risk of introducing some performance overhead, and
we'll have to take care to copy all extension attributes etc.
2020-05-22 15:55:45 +02:00
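The dependency check described in the commit above can be sketched as follows. This is an illustrative reconstruction, not the code from the commit: `guess_set_annotations` is a hypothetical function, and the pipeline is modelled as plain tuples of `(name, assigns, requires)` with assumed attribute names.

```python
def guess_set_annotations(pipeline):
    """Guess, per component, whether it should set annotations in training.

    `pipeline` is a list of (name, assigns, requires) tuples in pipeline
    order. A component gets True if any later component requires an
    attribute that it assigns. Hypothetical sketch for illustration.
    """
    flags = {}
    for i, (name, assigns, _requires) in enumerate(pipeline):
        # Collect everything that downstream components depend on.
        downstream_needs = set()
        for _name, _assigns, requires in pipeline[i + 1:]:
            downstream_needs.update(requires)
        flags[name] = bool(set(assigns) & downstream_needs)
    return flags


# Example: the parser uses tagger features, so the tagger must write
# its predictions to the Doc during training.
pipeline = [
    ("tagger", ["token.tag"], []),
    ("parser", ["token.dep", "token.head"], ["token.tag"]),
    ("ner", ["doc.ents"], []),
]
flags = guess_set_annotations(pipeline)
```

Here only the tagger is flagged, since no later component requires what the parser or the entity recognizer assign.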
Ines Montani
6728747f71
Merge pull request #5486 from explosion/fix/compat-py2
2020-05-22 15:47:21 +02:00
Ines Montani
6e6db6afb6
Better model compatibility and validation
2020-05-22 15:42:46 +02:00
Matthew Honnibal
f6078d866a
Merge pull request #5121 from adrianeboyd/bugfix/revert-token-match
...
Revert token_match priority changes from #4374 and extend token match options
2020-05-22 14:42:51 +02:00
Ines Montani
c685ee734a
Fix compat for v2.x branch
2020-05-22 14:22:36 +02:00
Ines Montani
65c7e82de2
Auto-format and remove 2.3 feature [ci skip]
2020-05-22 13:50:30 +02:00
Matthew Honnibal
8cb16c7120
Merge pull request #5485 from adrianeboyd/bugfix/retokenizer-merge-0-length-5450
...
Disallow merging 0-length spans
2020-05-22 13:28:35 +02:00
Adriane Boyd
e4a1b5dab1
Rename to url_match
...
Rename to `url_match` and update docs.
2020-05-22 12:41:03 +02:00