Matthew Honnibal
fc06b0a333
Fix training when hist_size==0
2017-10-05 21:52:28 -05:00
Matthew Honnibal
0e1adacaff
Merge pull request #1390 from mdcclv/contributor-mdcclv
Contributor agreement for Orion Montoya @mdcclv
2017-10-06 02:39:08 +02:00
Matthew Honnibal
e25ffcb11f
Move history size under feature flags
2017-10-05 19:38:13 -05:00
Matthew Honnibal
563f46f026
Fix multi-label support for text classification
The TextCategorizer class is supposed to support multi-label
text classification, and allow training data to contain missing
values.
For this to work, the gradient of the loss should be 0 when labels
are missing. Instead, there was no way to actually denote "missing"
in the GoldParse class, and so the TextCategorizer class treated
the label set within gold.cats as complete.
To fix this, we change GoldParse.cats to be a dict instead of a list.
The GoldParse.cats dict should map labels to floats, with 1. denoting
'present' and 0. denoting 'absent'. Gradients are zeroed for categories
absent from the gold.cats dict. A nice bonus is that you can also set
values between 0 and 1 for partial membership. You can also set numeric
values, if you're using a text classification model that uses an
appropriate loss function.
Unfortunately this is a breaking change, although the functionality
was only recently introduced and hasn't been properly documented
yet. I've updated the example script accordingly.
2017-10-05 18:43:02 -05:00
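A minimal sketch of the behaviour described in commit 563f46f026, assuming a squared-error loss. It is not spaCy's implementation: the function name `textcat_gradient`, the label names, and the scores are hypothetical and chosen only for illustration. It shows how a dict-valued `cats` lets a label that is absent from the dict contribute a zero gradient, while labels that are present (including partial-membership values between 0 and 1) contribute as usual.

```python
def textcat_gradient(scores, cats, labels):
    """Return (gradient, loss) for one document under 0.5 * squared error.

    scores: predicted score per label, in the order given by `labels`
    cats:   dict mapping label -> float (1.0 present, 0.0 absent);
            a label missing from the dict means "no annotation"
    """
    gradient = []
    loss = 0.0
    for label, score in zip(labels, scores):
        if label in cats:
            diff = score - float(cats[label])
            gradient.append(diff)      # d/dscore of 0.5 * diff**2
            loss += 0.5 * diff ** 2
        else:
            # Missing annotation: contributes nothing to gradient or loss.
            gradient.append(0.0)
    return gradient, loss


# Hypothetical example: "SPAM" has no annotation for this document.
labels = ["POSITIVE", "NEGATIVE", "SPAM"]
scores = [0.9, 0.2, 0.6]
cats = {"POSITIVE": 1.0, "NEGATIVE": 0.0}
gradient, loss = textcat_gradient(scores, cats, labels)
```

With a list of labels, as before the change, an unlisted label could only be read as "absent", so the model would have been pushed toward 0 for `SPAM` here. With the dict, the missing key leaves that output untouched.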
Orion Montoya
e04e11070f
Contributor agreement for Orion Montoya @mdcclv
2017-10-05 17:45:45 -04:00
Ines Montani
e77d8886f7
Update CONTRIBUTORS.md
2017-10-05 22:22:04 +02:00
Matthew Honnibal
dea81f113d
Merge pull request #1389 from mdcclv/lemmatizer_obey_exceptions
Lemmatizer obey exceptions
2017-10-05 22:11:21 +02:00
Matthew Honnibal
c36d4596bf
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-05 18:27:56 +02:00
Matthew Honnibal
056b08c0df
Delete obsolete nn_text_class example
2017-10-05 18:27:10 +02:00
Orion Montoya
b0d271809d
Unit test for lemmatizer exceptions -- copied from regression test for #1387
2017-10-05 10:49:28 -04:00
Orion Montoya
ffb50d21a0
Lemmatizer honors exceptions: Fix #1387
2017-10-05 10:49:02 -04:00
Orion Montoya
e81a608173
Regression test for lemmatizer exceptions -- demonstrate issue #1387
2017-10-05 10:47:48 -04:00
Matthew Honnibal
c6cd81f192
Wrap try/except around model saving
2017-10-05 08:14:24 -05:00
Matthew Honnibal
5743b06e36
Wrap model saving in try/except
2017-10-05 08:12:50 -05:00
Matthew Honnibal
fd4baff475
Update tests
2017-10-05 08:12:27 -05:00
Matthew Honnibal
dcdfa071aa
Disable LayerNorm hack
2017-10-04 20:06:52 -05:00
Matthew Honnibal
943af4423a
Make depth setting in parser work again
2017-10-04 20:06:05 -05:00
Matthew Honnibal
bfabc333be
Merge remote-tracking branch 'origin/develop' into feature/parser-history-model
2017-10-04 20:00:36 -05:00
Matthew Honnibal
92066b04d6
Fix Embed and HistoryFeatures
2017-10-04 19:55:34 -05:00
ines
b621a2e964
Fix build emoji
2017-10-04 18:37:27 +02:00
Matthew Honnibal
5560c46a59
Update buildkite
2017-10-04 18:29:41 +02:00
Matthew Honnibal
e3c93f87a4
Update sdist
2017-10-04 18:18:07 +02:00
Matthew Honnibal
c4c7def9ce
Fix yml
2017-10-04 18:14:33 +02:00
Matthew Honnibal
71825f9737
Fix yml
2017-10-04 18:12:16 +02:00
Matthew Honnibal
6304c5e146
Fix yml
2017-10-04 18:08:34 +02:00
Matthew Honnibal
ff24b6d04a
Fix yml
2017-10-04 18:05:45 +02:00
Matthew Honnibal
cc29e8b497
Add buildkite.yml for making sdists
2017-10-04 18:00:37 +02:00
Matthew Honnibal
d903986439
Increment version
2017-10-04 17:14:26 +02:00
Matthew Honnibal
fb75eb52f1
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-04 16:37:00 +02:00
Matthew Honnibal
40edb65ee7
Make test work for Python 2.7
2017-10-04 16:36:50 +02:00
ines
bb13aa4bf3
Fix typos in PhraseMatcher docs
2017-10-04 16:12:09 +02:00
Matthew Honnibal
bd8e84998a
Add nO attribute to TextCategorizer model
2017-10-04 16:07:30 +02:00
Matthew Honnibal
f8a0614527
Improve textcat model slightly
2017-10-04 15:15:53 +02:00
Matthew Honnibal
f1b86dff8c
Update textcat example
2017-10-04 15:12:28 +02:00
Matthew Honnibal
39798b0172
Uncomment layernorm adjustment hack
2017-10-04 15:12:09 +02:00
Matthew Honnibal
b3a7082bf8
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-04 14:56:46 +02:00
Matthew Honnibal
db05d4d582
Add test for #1380. Passes without fix?
2017-10-04 14:56:31 +02:00
Matthew Honnibal
79a94bc166
Update textcat example
2017-10-04 14:55:30 +02:00
Matthew Honnibal
774f5732bd
Fix dimensionality of textcat when no vectors available
2017-10-04 14:55:15 +02:00
Ines Montani
28ba0b9b51
Merge pull request #1385 from explosion/feature/new-website
💫 New spaCy website
2017-10-04 14:35:52 +02:00
Ines Montani
678651ca98
Merge pull request #1386 from kokes/patch-1
Fixing links to SyntaxNet
2017-10-04 13:35:01 +02:00
ines
33cf9cecdd
Port over changes from #1386
2017-10-04 13:34:03 +02:00
Ondrej Kokes
a9362f1c73
Fixing links to SyntaxNet
2017-10-04 12:55:07 +02:00
Matthew Honnibal
af75b74208
Unset LayerNorm backwards compat hack
2017-10-03 20:47:10 -05:00
ines
36ff525ff5
Add NER P and NER R scores to model overview
2017-10-04 00:37:15 +02:00
ines
15ec7ddd09
Add docs for new spacy evaluate command
2017-10-04 00:19:03 +02:00
ines
464f14019d
Fix typos
2017-10-04 00:18:47 +02:00
ines
bfb512f45a
Add website package.json and fix gitignore
2017-10-04 00:18:41 +02:00
ines
73ac0aa0b5
Update spacy evaluate and add displaCy option
2017-10-04 00:03:15 +02:00
Matthew Honnibal
246612cb53
Merge remote-tracking branch 'origin/develop' into feature/parser-history-model
2017-10-03 16:56:42 -05:00