Matthew Honnibal
7a73a9dcf6
Merge pull request #5488 from explosion/feature/better-model-compat
Better model compatibility and validation
2020-05-22 16:44:29 +02:00
Matthew Honnibal
f7f6df7275
Move to spacy.analysis
2020-05-22 16:43:18 +02:00
Matthew Honnibal
78d79d94ce
Guess set_annotations=True in nlp.update
During `nlp.update`, components can be passed a boolean `set_annotations`
to indicate whether they should assign annotations to the `Doc`. This
needs to be set if downstream components expect to use the
annotations during training, e.g. if we wanted to use tagger features in
the parser.

Components can specify their assignments and requirements, so we can
figure out which components have these inter-dependencies. After
figuring this out, we can guess whether to pass `set_annotations=True`.

We could also always pass `set_annotations=True`, or even just have this
as the only behaviour. The downside is that it would require the
`Doc` objects to be created afresh to avoid problematic modifications.

One approach would be to make a fresh copy of the `Doc` objects within
`nlp.update()`, so that we can write to the objects without any
problems. If we do that, we can drop this logic and also drop the
`set_annotations` mechanism. I would be fine with that approach,
although it runs the risk of introducing some performance overhead, and
we'll have to take care to copy all extension attributes etc.
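
The guessing step described above can be illustrated with a minimal sketch. It is not the code from this commit: the `Component` class, the `guess_set_annotations` helper, and the attribute names such as `"token.tag"` are hypothetical stand-ins, assuming each component declares the attributes it assigns and requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Component:
    """Hypothetical stand-in for a pipeline component (not spaCy's own class)."""
    name: str
    assigns: Set[str] = field(default_factory=set)
    requires: Set[str] = field(default_factory=set)


def guess_set_annotations(pipeline: List[Component]) -> Dict[str, bool]:
    """Guess, per component, whether set_annotations should be True.

    A component needs to write its annotations during training only if some
    component later in the pipeline requires an attribute that it assigns.
    """
    needed: Set[str] = set()  # attributes required by components seen so far
    flags: Dict[str, bool] = {}
    # Walk backwards so `needed` always holds downstream requirements only.
    for comp in reversed(pipeline):
        flags[comp.name] = bool(comp.assigns & needed)
        needed |= comp.requires
    # Return the flags in pipeline order.
    return {comp.name: flags[comp.name] for comp in pipeline}


# Assumed example pipeline: the parser uses tagger features, the NER is independent.
pipeline = [
    Component("tagger", assigns={"token.tag"}),
    Component("parser", assigns={"token.dep", "token.head"}, requires={"token.tag"}),
    Component("ner", assigns={"doc.ents"}),
]
flags = guess_set_annotations(pipeline)
```

In this sketch only the tagger is flagged, because the parser is the only downstream component that declares a requirement on something an earlier component assigns.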
2020-05-22 15:55:45 +02:00
Ines Montani
6728747f71
Merge pull request #5486 from explosion/fix/compat-py2
2020-05-22 15:47:21 +02:00
Ines Montani
6e6db6afb6
Better model compatibility and validation
2020-05-22 15:42:46 +02:00
Matthew Honnibal
f6078d866a
Merge pull request #5121 from adrianeboyd/bugfix/revert-token-match
Revert token_match priority changes from #4374 and extend token match options
2020-05-22 14:42:51 +02:00
Ines Montani
c685ee734a
Fix compat for v2.x branch
2020-05-22 14:22:36 +02:00
Ines Montani
f30b9d3038
Merge branch 'master' into spacy.io
2020-05-22 13:50:37 +02:00
Ines Montani
65c7e82de2
Auto-format and remove 2.3 feature [ci skip]
2020-05-22 13:50:30 +02:00
Matthew Honnibal
8cb16c7120
Merge pull request #5485 from adrianeboyd/bugfix/retokenizer-merge-0-length-5450
Disallow merging 0-length spans
2020-05-22 13:28:35 +02:00
Adriane Boyd
e4a1b5dab1
Rename to url_match
Rename to `url_match` and update docs.
2020-05-22 12:41:03 +02:00
Adriane Boyd
730fa493a4
Merge remote-tracking branch 'upstream/master' into bugfix/revert-token-match
2020-05-22 12:18:00 +02:00
Adriane Boyd
71fe61fdcd
Disallow merging 0-length spans
2020-05-22 10:14:34 +02:00
Matthew Honnibal
93c4d13588
Merge pull request #5264 from lfiedler/issue-5230
Fix ResourceWarnings during unittest
2020-05-22 00:31:07 +02:00
Matthew Honnibal
e1cb7e838b
Merge pull request #5481 from explosion/feature/blank-shortcut-v2
Add blank:{lang} shortcut support to util.load_model
2020-05-22 00:08:23 +02:00
Ines Montani
85064b5c22
Merge branch 'master' into spacy.io
2020-05-21 21:55:04 +02:00
Ines Montani
ee027de032
Update universe and display of videos [ci skip]
2020-05-21 21:54:23 +02:00
Ines Montani
2250380816
Merge pull request #5482 from explosion/fix/backwards-compat-super
2020-05-21 21:51:46 +02:00
Ines Montani
dc94052d6e
Merge branch 'master' into spacy.io
2020-05-21 21:01:32 +02:00
Ines Montani
5753b43e60
Tidy up and fix alignment of landing cards (#5317)
2020-05-21 20:56:04 +02:00
Ines Montani
891fa59009
Use backwards-compatible super()
2020-05-21 20:52:48 +02:00
Matthew Honnibal
5ce02c1b17
Merge pull request #5470 from svlandeg/bugfix/noun-chunks
Bugfix in noun chunks
2020-05-21 20:51:31 +02:00
Ines Montani
32c2bb3d99
Add course to landing [ci skip]
2020-05-21 20:50:17 +02:00
Matthw Honnibal
25b51f4fc8
Set version to v3.0.0.dev9
2020-05-21 20:47:52 +02:00
Matthw Honnibal
bc94fdabd0
Fix begin_training
2020-05-21 20:46:21 +02:00
Matthw Honnibal
d507ac28d8
Fix shape inference
2020-05-21 20:46:10 +02:00
Ines Montani
53da6bd672
Add course to landing [ci skip]
2020-05-21 20:45:33 +02:00
Ines Montani
cb02bff0eb
Add blank:{lang} shortcut to util.load_model
2020-05-21 20:24:07 +02:00
Matthw Honnibal
df87c32a40
Pass smaller doc sample into model initialize
2020-05-21 20:17:24 +02:00
Ines Montani
581bda9f98
Update senter test and auto-format
2020-05-21 20:17:14 +02:00
Ines Montani
0f1beb5ff2
Tidy up and avoid absolute spacy imports in core
2020-05-21 20:05:03 +02:00
svlandeg
51715b9f72
span / noun chunk has +1 because end is exclusive
2020-05-21 19:56:56 +02:00
Adriane Boyd
132b2a6898
Merge remote-tracking branch 'upstream/master-tmp' into HEAD
2020-05-21 19:50:30 +02:00
Adriane Boyd
17ee9ab53a
Fix _SP/POS=SPACE in strings serialization tests
2020-05-21 19:49:08 +02:00
Ines Montani
245f91df78
Fix merge issues
2020-05-21 19:42:13 +02:00
Matthw Honnibal
3b5cfec1fc
Tweak memory management in train_from_config
2020-05-21 19:32:04 +02:00
Matthw Honnibal
f075655deb
Fix shape inference in begin_training
2020-05-21 19:26:29 +02:00
svlandeg
84d5b7ad0a
Merge remote-tracking branch 'upstream/master' into bugfix/noun-chunks
# Conflicts:
# spacy/lang/el/syntax_iterators.py
# spacy/lang/en/syntax_iterators.py
# spacy/lang/fa/syntax_iterators.py
# spacy/lang/fr/syntax_iterators.py
# spacy/lang/id/syntax_iterators.py
# spacy/lang/nb/syntax_iterators.py
# spacy/lang/sv/syntax_iterators.py
2020-05-21 19:19:50 +02:00
svlandeg
f7d10da555
avoid unnecessary loop to check overlapping noun chunks
2020-05-21 19:15:57 +02:00
Matthw Honnibal
1729165e90
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-05-21 19:11:08 +02:00
Ines Montani
631e20d0c6
Fix test and schemas
2020-05-21 19:01:02 +02:00
Ines Montani
d34fc0915e
Remove serialization getter
2020-05-21 18:48:21 +02:00
Ines Montani
f44897e4c6
Update warning IDs
2020-05-21 18:39:11 +02:00
Ines Montani
24f72c669c
Merge branch 'develop' into master-tmp
2020-05-21 18:39:06 +02:00
Ines Montani
c6ec19c844
Add missing declaration
2020-05-21 17:30:05 +02:00
Matthew Honnibal
884d9b060d
Merge pull request #5466 from adrianeboyd/feature/omit-extra-lexeme-info
Add option to omit extra lexeme tables in CLI
2020-05-21 16:40:02 +02:00
Matthew Honnibal
e6c4c1a507
Merge pull request #5468 from adrianeboyd/feature/cli-conllu-misc-ner
Improve handling of NER in CoNLL-U MISC
2020-05-21 16:39:46 +02:00
Matthew Honnibal
26cd6a0229
Merge pull request #5462 from adrianeboyd/feature/lemmatizer-all-upos
Extend lemmatizer rules for all UPOS tags
2020-05-21 16:05:31 +02:00
Matthew Honnibal
cad9b290a2
Merge branch 'master' into feature/omit-extra-lexeme-info
2020-05-21 16:04:24 +02:00
Matthew Honnibal
1f572ce89b
Merge pull request #5473 from explosion/fix/travis-tests
Fix Python 2.7 compat
2020-05-21 15:56:16 +02:00