Sofie Van Landeghem
7de3b129ab
Resolve edge case when calling textcat.predict with empty doc (#4035)
* resolve edge case where no doc has tokens when calling textcat.predict
* more explicit value test
2019-07-30 14:58:01 +02:00
Matthew Honnibal
89c92c65fb
Update version
2019-07-28 17:56:38 +02:00
Matthew Honnibal
06eb428ed1
Make pipe base class a bit less presumptuous
2019-07-28 17:56:11 +02:00
Matthew Honnibal
16b5144095
Don't raise NotImplemented in Pipe.update
2019-07-28 17:54:11 +02:00
Ines Montani
fc69da0acb
💫 Support simple training format in nlp.evaluate and add tests (#4033)
* Support simple training format in nlp.evaluate and add tests
* Update docs [ci skip]
2019-07-27 17:30:18 +02:00
Ines Montani
a3723f439c
Fix formatting [ci skip]
2019-07-27 16:35:42 +02:00
Ines Montani
d5bce35fb1
Fix bug in Span.similarity when called via hook
2019-07-27 15:33:27 +02:00
Ines Montani
109b5e1798
Fix bug in Token.similarity when called via hook
2019-07-27 15:26:01 +02:00
Ines Montani
e000b5ed82
Also support "requirements" in model.json
2019-07-27 13:34:57 +02:00
Ines Montani
307ffe472d
Support custom language factory setting in meta.json (#4031)
2019-07-27 13:17:43 +02:00
Bae Yong-Ju
05fbf5d976
Fix error when Korean text contains regexp special characters. (#4022)
2019-07-25 17:53:33 +02:00
Matthew Honnibal
73e095923f
💫 Improve error message when model.from_bytes() dies (#4014)
* Improve error message when model.from_bytes() dies
When Thinc's model.from_bytes() is called with a mismatched model, often
we get a particularly ungraceful error,
e.g. "AttributeError: FunctionLayer has no attribute G"
This is because we're trying to load the parameters for something like
a LayerNorm layer, and the model architecture has some other layer there
instead. This is obviously terrible, especially since the error *type*
is wrong.
I've changed it to raise a ValueError. The error message is still
probably a bit terse, but it's hard to be sure exactly what's gone
wrong.
* Update spacy/pipeline/pipes.pyx
* Update spacy/pipeline/pipes.pyx
* Update spacy/pipeline/pipes.pyx
* Update spacy/syntax/nn_parser.pyx
* Update spacy/syntax/nn_parser.pyx
* Update spacy/pipeline/pipes.pyx
Co-Authored-By: Matthew Honnibal <honnibal+gh@gmail.com>
* Update spacy/pipeline/pipes.pyx
Co-Authored-By: Matthew Honnibal <honnibal+gh@gmail.com>
Co-authored-by: Ines Montani <ines@ines.io>
2019-07-24 11:27:34 +02:00
Ines Montani
87fcf3141c
Merge pull request #4003 from svlandeg/feature/nel-fixes
API changes for Entity linking functionality
2019-07-23 23:17:07 +02:00
Paul O'Leary McCann
c8949ce88a
Remove old comment (#4012)
Norwegian used to borrow from French but that doesn't appear to have
been true for a while now, so the comment that was here is no longer
relevant.
2019-07-23 23:10:06 +02:00
Sofie Van Landeghem
ba02957c80
Fix dependency copy for as_doc (#3969)
* failing unit test for issue 3962
* attempt to fix Issue #3962
* create artificial unit test example
* using length instead of self.length
* sp
* reformat with black
* find better ancestor within span and use generic 'dep'
* attach to span.root if there is no appropriate ancestor
* comment span text
* clean up ancestor code
* reconstruct dep tree to keep same number of sentences
2019-07-23 18:28:54 +02:00
svlandeg
4e7ec1ed31
return fix
2019-07-23 14:23:58 +02:00
svlandeg
400ff342cf
replace asserts with custom error messages
2019-07-23 11:52:48 +02:00
svlandeg
20389e4553
format and bugfix
2019-07-22 15:08:17 +02:00
svlandeg
b1911f7105
Errors.E146 for IO error when FP is null
2019-07-22 14:56:13 +02:00
svlandeg
5d544f89ba
Errors.E145 for IO errors when reading KB
2019-07-22 14:36:07 +02:00
Ines Montani
a32b033b8c
Add regression test for #4002
Test that the PhraseMatcher can match on overwritten NORM attributes.
2019-07-22 14:18:24 +02:00
svlandeg
ad65171837
Merge remote-tracking branch 'upstream/master' into feature/nel-fixes
2019-07-22 13:41:28 +02:00
svlandeg
76184374e2
test corner cases
2019-07-22 13:39:32 +02:00
svlandeg
9f8c1e71a2
fix for Issue #4000
2019-07-22 13:34:12 +02:00
svlandeg
dae8a21282
rename entity frequency
2019-07-19 17:40:28 +02:00
svlandeg
41fb5204ba
output tensors as part of predict
2019-07-19 14:47:36 +02:00
svlandeg
21176517a7
have gold.links correspond exactly to doc.ents
2019-07-19 12:36:15 +02:00
BreakBB
3e370cf2ba
Add 'Prof.' to English tokenizer_exceptions
2019-07-19 10:00:45 +02:00
svlandeg
e1213eaf6a
use original gold object in get_loss function
2019-07-18 13:35:10 +02:00
svlandeg
ec55d2fccd
filter training data beforehand (+black formatting)
2019-07-18 10:22:24 +02:00
Falak Asad
ff1e73e35c
Bugfix/issue 3968 (#3982)
* Fix for issue-3968
* Added contributor agreement
* Made suggested changes
2019-07-18 00:20:32 +02:00
svlandeg
d833d4c358
fixes in kb and gold
2019-07-17 17:18:26 +02:00
Ines Montani
73565c6d9d
Rename function arguments
2019-07-17 14:29:52 +02:00
Matthew Honnibal
394e4d8058
Add docstring for spacy.gold.align
2019-07-17 13:59:17 +02:00
Ines Montani
073013f129
Auto-format [ci skip]
2019-07-17 12:34:13 +02:00
svlandeg
4086c6ff60
get vector functionality + unit test
2019-07-17 12:17:02 +02:00
Ines Montani
62ff128888
Add regression test for #3951
2019-07-16 14:00:00 +02:00
Ines Montani
7f551050b1
Add regression test for #3972
2019-07-16 13:07:35 +02:00
svlandeg
a63d15a142
code cleanup
2019-07-15 17:36:43 +02:00
svlandeg
cdc589d344
small fix
2019-07-15 12:04:45 +02:00
svlandeg
60f299374f
set default context width
2019-07-15 12:03:09 +02:00
svlandeg
6e809e9b8b
proper error for missing cfg arguments
2019-07-15 11:42:50 +02:00
svlandeg
6026958957
tokenizer doc fix
2019-07-15 11:19:34 +02:00
Ines Montani
c0e29f7029
Merge pull request #3957 from sorenlind/danish-tokenizer-slash
Make Danish tokenizer split on forward slash
2019-07-12 18:19:22 +02:00
Matthew Honnibal
ef666656b3
Fix attrs alignment
2019-07-12 17:59:47 +02:00
Matthew Honnibal
c345c042b0
Fix symbol alignment
2019-07-12 17:48:38 +02:00
Ines Montani
7281026879
Increment version [ci skip]
2019-07-12 17:40:00 +02:00
Søren Lind Kristiansen
26aee70d95
Make Danish tokenizer split on forward slash
2019-07-12 15:20:42 +02:00
Matthew Honnibal
3bc4d618f9
Set version to v2.1.5
2019-07-12 13:26:12 +02:00
Sofie Van Landeghem
ed774cb953
Fixing ngram bug (#3953)
* minimal failing example for Issue #3661
* referenced Issue #3661 instead of Issue #3611
* cleanup
2019-07-12 10:01:35 +02:00