Matthew Honnibal
563f46f026
Fix multi-label support for text classification
The TextCategorizer class is supposed to support multi-label
text classification, and allow training data to contain missing
values.
For this to work, the gradient of the loss should be 0 when labels
are missing. Instead, there was no way to actually denote "missing"
in the GoldParse class, and so the TextCategorizer class treated
the label set within gold.cats as complete.
To fix this, we change GoldParse.cats to be a dict instead of a list.
The GoldParse.cats dict should map labels to floats, with 1. denoting
'present' and 0. denoting 'absent'. Gradients are zeroed for categories
missing from the gold.cats dict. A nice bonus is that you can also set
values between 0 and 1 for partial membership. You can also set other
numeric values, if you're using a text classification model that uses an
appropriate loss function.
Unfortunately this is a breaking change, although the functionality
was only recently introduced and hasn't been properly documented
yet. I've updated the example script accordingly.
2017-10-05 18:43:02 -05:00
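The masking behaviour described in commit 563f46f026 can be illustrated with a small sketch. This is not spaCy's implementation: the function name `textcat_gradient`, the label names, and the choice of a squared-error loss are all assumptions made for illustration. The only thing it takes from the commit message is the idea that `cats` is a dict mapping labels to floats, and that labels missing from the dict contribute no gradient.

```python
def textcat_gradient(scores, cats, labels):
    """Gradient of a squared-error loss w.r.t. each score (hypothetical helper).

    scores: predicted values, one per entry in `labels`
    cats:   dict mapping label -> float (1.0 present, 0.0 absent,
            values in between for partial membership)
    labels: the full label set of the model
    """
    gradient = []
    for label, score in zip(labels, scores):
        if label in cats:
            # Annotated label: d/dscore of 0.5 * (score - truth)**2
            gradient.append(score - float(cats[label]))
        else:
            # Label missing from the annotation: contribute no gradient
            gradient.append(0.0)
    return gradient


# Hypothetical label set; "TECH" is left unannotated on purpose.
labels = ["SPORTS", "POLITICS", "TECH"]
scores = [0.9, 0.4, 0.7]
cats = {"SPORTS": 1.0, "POLITICS": 0.0}

grad = textcat_gradient(scores, cats, labels)
```

Here "POLITICS" is explicitly marked absent (0.0) and so still receives a gradient, while "TECH" is missing from the dict and its gradient is zero. With the earlier list-based format, these two cases could not be told apart.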
Matthew Honnibal
fb75eb52f1
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-04 16:37:00 +02:00
Matthew Honnibal
40edb65ee7
Make test work for Python 2.7
2017-10-04 16:36:50 +02:00
ines
bb13aa4bf3
Fix typos in PhraseMatcher docs
2017-10-04 16:12:09 +02:00
Matthew Honnibal
bd8e84998a
Add nO attribute to TextCategorizer model
2017-10-04 16:07:30 +02:00
Matthew Honnibal
f8a0614527
Improve textcat model slightly
2017-10-04 15:15:53 +02:00
Matthew Honnibal
f1b86dff8c
Update textcat example
2017-10-04 15:12:28 +02:00
Matthew Honnibal
39798b0172
Uncomment layernorm adjustment hack
2017-10-04 15:12:09 +02:00
Matthew Honnibal
b3a7082bf8
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-10-04 14:56:46 +02:00
Matthew Honnibal
db05d4d582
Add test for #1380. Passes without fix?
2017-10-04 14:56:31 +02:00
Matthew Honnibal
79a94bc166
Update textcat example
2017-10-04 14:55:30 +02:00
Matthew Honnibal
774f5732bd
Fix dimensionality of textcat when no vectors available
2017-10-04 14:55:15 +02:00
Ines Montani
28ba0b9b51
Merge pull request #1385 from explosion/feature/new-website
💫 New spaCy website
2017-10-04 14:35:52 +02:00
ines
33cf9cecdd
Port over changes from #1386
2017-10-04 13:34:03 +02:00
Matthew Honnibal
af75b74208
Unset LayerNorm backwards compat hack
2017-10-03 20:47:10 -05:00
ines
36ff525ff5
Add NER P and NER R scores to model overview
2017-10-04 00:37:15 +02:00
ines
15ec7ddd09
Add docs for new spacy evaluate command
2017-10-04 00:19:03 +02:00
ines
464f14019d
Fix typos
2017-10-04 00:18:47 +02:00
ines
bfb512f45a
Add website package.json and fix gitignore
2017-10-04 00:18:41 +02:00
ines
73ac0aa0b5
Update spacy evaluate and add displaCy option
2017-10-04 00:03:15 +02:00
Matthew Honnibal
f24c2e3a8a
Fix evaluate for non-GPU
2017-10-03 22:47:31 +02:00
Matthew Honnibal
32b9f3d1a6
Require new thinc
2017-10-03 22:17:31 +02:00
Matthew Honnibal
2eb0fe4957
Fix setup.py
2017-10-03 21:40:04 +02:00
Matthew Honnibal
c69b0836a0
Fix fabfile
2017-10-03 21:31:41 +02:00
Matthew Honnibal
252299ca2a
Add sdist command
2017-10-03 21:29:43 +02:00
Matthew Honnibal
5cbefcba17
Set backwards compatibility flag
2017-10-03 20:29:58 +02:00
Matthew Honnibal
5454b20cd7
Update thinc imports for 6.9
2017-10-03 20:07:17 +02:00
ines
80a2fb6193
Update visualizers docs and add submenu
2017-10-03 19:40:39 +02:00
Matthew Honnibal
4a59f6358c
Fix thinc imports
2017-10-03 19:21:26 +02:00
Matthew Honnibal
cbb1fbef80
Update train_ner_standalone example
2017-10-03 18:49:38 +02:00
Matthew Honnibal
e514d6aa0a
Import thinc modules more explicitly, to avoid cycles
2017-10-03 18:49:25 +02:00
Matthew Honnibal
338e1fda0e
Unbreak merge artefact
2017-10-03 09:41:05 -05:00
Matthew Honnibal
1289187279
Fix circular import
2017-10-03 09:33:21 -05:00
Matthew Honnibal
a44c4c3a5b
Add timer to evaluate
2017-10-03 09:15:35 -05:00
Matthew Honnibal
96da86b3e5
Add support for verbose flag to Language
2017-10-03 09:14:57 -05:00
Matthew Honnibal
02586a5243
Add timing to spacy evaluate command
2017-10-03 09:14:34 -05:00
ines
5fb057b575
Fix secondary font stack
2017-10-03 15:45:07 +02:00
ines
e49cd7aeaf
Move import into load to avoid circular imports
2017-10-03 15:22:19 +02:00
ines
b0dfa059db
Update docs link in about.py
2017-10-03 15:19:55 +02:00
ines
b24fbd8aad
Fix titles for social cards
2017-10-03 14:54:33 +02:00
ines
23019d1daa
Add styleguide
2017-10-03 14:28:24 +02:00
ines
319fac14fe
Update global config and landing page
2017-10-03 14:28:18 +02:00
ines
22dd929b65
Add models documentation
2017-10-03 14:28:03 +02:00
ines
808f7ee417
Update API documentation
2017-10-03 14:27:22 +02:00
ines
3f4fd2c5d5
Update usage documentation
2017-10-03 14:26:20 +02:00
ines
9af604f0da
Update layout templates, partials and mixins
2017-10-03 14:20:13 +02:00
ines
49b58d35fd
Update JavaScript
2017-10-03 14:18:49 +02:00
ines
a8ff8423bb
Update image assets, icons and SVGs
Move SVG sprite to Jade file and include in template. Only use SVG
symbols for logos.
2017-10-03 14:17:41 +02:00
ines
7d01d7411b
Update web fonts
2017-10-03 14:15:36 +02:00
ines
3e1b971b16
Update CSS
2017-10-03 14:14:52 +02:00