Leo
925e938570
Spanish tokenizer exception and examples improvement ( #5531 )
...
* Spanish tokenizer exception additions. Added Spanish question examples
* erased slang tokenization examples
2020-06-01 18:18:34 +02:00
Matthew Honnibal
67af3a32b0
Merge pull request #5527 from adrianeboyd/bugfix/tagger-sp-tag-map
...
Preserve _SP when filtering tag map in Tagger
2020-06-01 12:00:21 +02:00
Leo
c21c308ecb
Corrected issue #5524: changed <U+009C> 'STRING TERMINATOR' to <U+0153> 'LATIN SMALL LIGATURE OE' ( #5526 )
2020-05-31 22:08:12 +02:00
Leo
7d5a89661e
contributor agreement signed ( #5525 )
2020-05-31 20:13:39 +02:00
Adriane Boyd
a005ccd6d7
Preserve _SP when filtering tag map in Tagger
...
To allow "SP" as a tag (for Chinese OntoNotes), preserve "_SP" if
present as the reference `SPACE` POS in the tag map in
`Tagger.begin_training()`.
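
The filtering step described above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual `Tagger.begin_training()` code: the helper name `filter_tag_map`, the plain-dict tag map, and the `"pos"` key are assumptions made for the example.

```python
def filter_tag_map(orig_tag_map, seen_tags, default=None):
    """Keep only tags seen in the training data, but always keep `_SP`.

    Hypothetical helper: `orig_tag_map` maps tag names to attribute
    dicts, `seen_tags` is an iterable of tags found in the training data.
    """
    default = default or {"pos": "X"}
    new_tag_map = {}
    for tag in seen_tags:
        # Unknown tags fall back to a generic mapping.
        new_tag_map[tag] = orig_tag_map.get(tag, default)
    # `_SP` never occurs in training data, so carry it over explicitly.
    # That leaves `SP` free to be an ordinary tag.
    if "_SP" in orig_tag_map:
        new_tag_map["_SP"] = orig_tag_map["_SP"]
    return new_tag_map
```

With a tag map that contains both `SP` and `_SP`, the filtered map keeps both entries, each with its own mapping.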
2020-05-31 19:57:54 +02:00
Matthew Honnibal
758a4b154d
Merge pull request #5521 from svlandeg/bugfix/vectors-from-disk
...
fix deserialization order
2020-05-30 18:38:23 +02:00
svlandeg
15134ef611
fix deserialization order
2020-05-30 12:53:32 +02:00
Matthew Honnibal
64adda3202
Revert "Remove peeking from Parser.begin_training ( #5456 )"
...
This reverts commit 9393253b66.
The model shouldn't need to see all examples, and actually in v3 there's
no equivalent step. All examples are provided to the component, for the
component to do stuff like figuring out the labels. The model just needs
to do stuff like shape inference.
2020-05-29 23:21:55 +02:00
Matthew Honnibal
85f1acfaa0
Merge pull request #5517 from adrianeboyd/bugfix/morph-repr
...
Remove MorphAnalysis __str__ and __repr__
2020-05-29 19:20:56 +02:00
Matthew Honnibal
2a8137aba9
Merge pull request #5518 from svlandeg/fix/pretrain-docs
...
Pretrain fixes
2020-05-29 19:20:20 +02:00
svlandeg
291483157d
prevent loading a pretrained Tok2Vec layer AND pretrained components
2020-05-29 17:38:33 +02:00
Adriane Boyd
e1b7cbd197
Remove MorphAnalysis __str__ and __repr__
2020-05-29 14:33:47 +02:00
svlandeg
04ba37b667
fix description
2020-05-29 13:52:39 +02:00
svlandeg
5f0a91cf37
fix conv-depth parameter
2020-05-29 09:56:29 +02:00
Matthew Honnibal
aecd1437cc
Merge pull request #5508 from adrianeboyd/bugfix/tag-map-sp-tag
...
Prefer _SP over SP for default tag map space attrs
2020-05-27 20:39:40 +02:00
Matthew Honnibal
e7ac12b598
Merge pull request #5514 from adrianeboyd/bugfix/load-vector-name
...
Improve vector name loading from model meta
2020-05-27 20:39:23 +02:00
Adriane Boyd
25de2a2191
Improve vector name loading from model meta
2020-05-27 14:48:54 +02:00
adrianeboyd
aad0610a85
Map NR to PROPN ( #5512 )
2020-05-26 22:30:53 +02:00
Sofie Van Landeghem
f00488ab30
Update train_intent_parser.py
2020-05-26 16:41:39 +02:00
Adriane Boyd
b6b5908f5e
Prefer _SP over SP for default tag map space attrs
...
If `_SP` is already in the tag map, use the mapping from `_SP` instead
of `SP` so that `SP` can be a valid non-space tag. (Chinese has a
non-space tag `SP` which was overriding the mapping of `_SP` to
`SPACE`.)
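
A rough sketch of the preference described above, assuming a plain-dict tag map. The function name `resolve_space_attrs` and the `"pos"` key are hypothetical and only serve the illustration; the real logic lives in spaCy's tag map handling.

```python
def resolve_space_attrs(tag_map):
    """Pick the attributes used for whitespace tokens.

    Hypothetical helper: prefer an existing `_SP` entry, fall back to
    `SP`, and otherwise default to a SPACE mapping.
    """
    if "_SP" in tag_map:
        # `_SP` is the reserved space tag, so `SP` may mean something else.
        return tag_map["_SP"]
    return tag_map.get("SP", {"pos": "SPACE"})
```

For a Chinese-style tag map where `SP` is a particle tag, this returns the `_SP` mapping instead of the particle mapping.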
2020-05-26 14:57:13 +02:00
Matthew Honnibal
b0c0271a48
Merge pull request #5506 from adrianeboyd/bugfix/pl-lemmatizer-lookup-loading
...
Fix Polish lemmatizer for deserialized models
2020-05-26 12:31:25 +02:00
Adriane Boyd
1eed101be9
Fix Polish lemmatizer for deserialized models
...
Restructure Polish lemmatizer not to depend on lookups data in
`__init__` since the lemmatizer is initialized before the lookups data
is loaded from a saved model. The lookups tables are accessed first in
`__call__` instead once the data is available.
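
The restructuring can be illustrated with a small standalone sketch. The class `LazyLookupLemmatizer`, the dict-based lookups container, and the table name pattern `lemma_lookup_<pos>` are assumptions for this example and do not reproduce the Polish lemmatizer itself.

```python
class LazyLookupLemmatizer:
    """Lemmatizer that reads its tables at call time, not in __init__."""

    def __init__(self, lookups):
        # Only store a reference; the tables may still be empty here,
        # because a saved model fills them in after construction.
        self.lookups = lookups

    def __call__(self, word, pos):
        # Tables are looked up on each call, once the data is available.
        table = self.lookups.get("lemma_lookup_" + pos.lower(), {})
        word = word.lower()
        return table.get(word, word)
```

Because nothing is read in `__init__`, the lemmatizer can be created first and still see tables that are loaded afterwards.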
2020-05-26 09:56:12 +02:00
adrianeboyd
69897b45d8
Handle spacy.pex renaming in Makefile ( #5503 )
2020-05-25 16:39:22 +02:00
adrianeboyd
c9c7b135c0
Update Makefile for v2.3.0 ( #5502 )
2020-05-25 15:24:24 +02:00
Ines Montani
24ef6680fa
Merge pull request #5499 from adrianeboyd/chore/bump-version-deps-v2.3.0
2020-05-25 13:25:45 +02:00
Ines Montani
ade4767e06
Merge pull request #5498 from adrianeboyd/bugfix/phrasematcher-unpickle-new-api
2020-05-25 13:25:07 +02:00
Adriane Boyd
3f727bc539
Switch to v2.3.0.dev0
2020-05-25 12:57:20 +02:00
Adriane Boyd
736f3cb5af
Bump version and deps for v2.3.0
...
* spacy to v2.3.0
* thinc to v7.4.1
* spacy-lookups-data to v0.3.2
2020-05-25 12:03:49 +02:00
Rajat
8b8efa1b42
update spacy universe with my project ( #5497 )
...
* added contextualSpellCheck in spacy universe meta
* removed extra formatting by code
* updated with permanent links
* run json linter used by spacy
* filled SCA
* updated the description
2020-05-25 11:30:23 +02:00
Adriane Boyd
e06ca7ea24
Switch to new add API in PhraseMatcher unpickle
2020-05-25 11:22:47 +02:00
Sofie Van Landeghem
ae1c179f3a
Remove the nested quote
2020-05-23 17:58:19 +02:00
Jannis
aa53ce6996
Documentation Typo Fix ( #5492 )
...
* Fix typo
Change 'realize' to 'realise'
* Add contributor agreement
2020-05-22 19:50:26 +02:00
Ines Montani
6728747f71
Merge pull request #5486 from explosion/fix/compat-py2
2020-05-22 15:47:21 +02:00
Matthew Honnibal
f6078d866a
Merge pull request #5121 from adrianeboyd/bugfix/revert-token-match
...
Revert token_match priority changes from #4374 and extend token match options
2020-05-22 14:42:51 +02:00
Ines Montani
c685ee734a
Fix compat for v2.x branch
2020-05-22 14:22:36 +02:00
Ines Montani
65c7e82de2
Auto-format and remove 2.3 feature [ci skip]
2020-05-22 13:50:30 +02:00
Matthew Honnibal
8cb16c7120
Merge pull request #5485 from adrianeboyd/bugfix/retokenizer-merge-0-length-5450
...
Disallow merging 0-length spans
2020-05-22 13:28:35 +02:00
Adriane Boyd
e4a1b5dab1
Rename to url_match
...
Rename to `url_match` and update docs.
2020-05-22 12:41:03 +02:00
Adriane Boyd
730fa493a4
Merge remote-tracking branch 'upstream/master' into bugfix/revert-token-match
2020-05-22 12:18:00 +02:00
Adriane Boyd
71fe61fdcd
Disallow merging 0-length spans
2020-05-22 10:14:34 +02:00
Matthew Honnibal
93c4d13588
Merge pull request #5264 from lfiedler/issue-5230
...
Fix ResourceWarnings during unittest
2020-05-22 00:31:07 +02:00
Matthew Honnibal
e1cb7e838b
Merge pull request #5481 from explosion/feature/blank-shortcut-v2
...
Add blank:{lang} shortcut support to util.load_model
2020-05-22 00:08:23 +02:00
Ines Montani
ee027de032
Update universe and display of videos [ci skip]
2020-05-21 21:54:23 +02:00
Ines Montani
2250380816
Merge pull request #5482 from explosion/fix/backwards-compat-super
2020-05-21 21:51:46 +02:00
Ines Montani
891fa59009
Use backwards-compatible super()
2020-05-21 20:52:48 +02:00
Matthew Honnibal
5ce02c1b17
Merge pull request #5470 from svlandeg/bugfix/noun-chunks
...
Bugfix in noun chunks
2020-05-21 20:51:31 +02:00
Ines Montani
53da6bd672
Add course to landing [ci skip]
2020-05-21 20:45:33 +02:00
Ines Montani
cb02bff0eb
Add blank:{lang} shortcut to util.load_model
2020-05-21 20:24:07 +02:00
Ines Montani
0f1beb5ff2
Tidy up and avoid absolute spacy imports in core
2020-05-21 20:05:03 +02:00
svlandeg
51715b9f72
span / noun chunk has +1 because end is exclusive
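
The off-by-one behind this fix can be shown with plain Python slicing, which uses the same exclusive-end convention. The helper `chunk_tokens` and the sample tokens are hypothetical and only illustrate the indexing.

```python
def chunk_tokens(tokens, first, last):
    """Return the tokens from index `first` through `last`, inclusive.

    The slice end is exclusive, so the index of the last token
    needs +1 to be included in the chunk.
    """
    return tokens[first:last + 1]


tokens = ["The", "quick", "fox", "jumps"]
noun_chunk = chunk_tokens(tokens, 0, 2)
```

Without the `+ 1`, the slice would stop one token short and drop the head noun.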
2020-05-21 19:56:56 +02:00