Matthew Honnibal
1f7229f40f
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit c9ba3d3c2d, reversing changes made to 92c26a35d4.
2018-03-27 19:23:02 +02:00
Justin DuJardin
4eeb178856
Add example using TensorBoard standalone projector
...
- The TensorBoard standalone projector expects a different set of files than the TensorFlow plugin does.
2018-03-25 21:50:13 -07:00
ines
4ec2809eb5
Port over TensorBoard example
2018-03-24 17:15:48 +01:00
Matthew Honnibal
00557c5fdd
Add example of NER multitask objective
2018-01-21 19:46:37 +01:00
avinash
b379c9d7d3
typos corrected
2018-01-03 16:54:22 +05:30
mpuels
1e8147aec7
fix: Add missing period in train data
2017-12-13 10:51:05 +01:00
mpuels
ee4d6fdd40
Fix typo in comment
2017-12-09 13:14:57 +01:00
ines
726fb2d0b5
Use fewer iterations by default to avoid overfitting on blank model (resolves #1632)
2017-11-23 15:27:12 +01:00
ines
ec08996000
Add note on tags matching tokenization (see #1613)
2017-11-20 15:12:47 +01:00
ines
1a38575de3
Make example Python 2 compatible (see #1617)
2017-11-20 13:57:51 +01:00
ines
7d5afadf5e
Update vectors_loc description
2017-11-17 14:57:11 +01:00
ines
c57e05bec1
Make sure nr_dim is an int
...
In some languages (e.g. Dutch), the nr_dim is extracted as a byte string, causing an error down the line.
2017-11-17 14:56:27 +01:00
yogendrasoni
334ed433b2
rstrip line before rsplit
...
Loading the English fastText vectors raised an error because each line ends with a newline, which caused rsplit to split it incorrectly.
2017-11-15 13:55:08 +05:30
Matthew Honnibal
f0e28e8ae5
Make fasttext reader accommodate whitespace
2017-11-12 12:07:13 +01:00
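The three fastText reader fixes above (casting nr_dim to an int, stripping the line before rsplit, and accommodating whitespace) can be illustrated with a minimal sketch. This is not the example script itself: the function names `parse_header` and `parse_vector_line` are hypothetical, and the sketch assumes the plain-text `.vec` layout in which the first line holds the row and dimension counts and every later line holds a word followed by its values.

```python
def parse_header(line):
    """Parse the first line of a .vec file, assumed to be '<rows> <dims>'."""
    nr_row, nr_dim = line.rstrip().split()
    # Cast explicitly: the counts may arrive as (byte) strings, not ints.
    return int(nr_row), int(nr_dim)


def parse_vector_line(line, nr_dim):
    """Split one vector line into (word, values)."""
    # Strip first, so a trailing space or newline is not counted as a field.
    line = line.rstrip()
    # Split from the right at most nr_dim times, so that whitespace
    # inside the word itself stays part of the word.
    pieces = line.rsplit(" ", nr_dim)
    word = pieces[0]
    values = [float(value) for value in pieces[1:]]
    return word, values


if __name__ == "__main__":
    print(parse_header("2 3\n"))
    print(parse_vector_line("New York 0.1 0.2 0.3 \n", 3))
```

Splitting from the right with an explicit maximum is what lets a multi-word entry such as "New York" survive as a single key.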
ines
f36fab39b0
Don't rename component in intent parser example (resolves #1551)
...
Otherwise, the default saved model won't know that it's supposed to create spaCy's 'parser'.
2017-11-10 23:35:38 +01:00
Ines Montani
1a23a0f87e
Remove broken link (resolves #1541)
2017-11-10 12:28:39 +01:00
ines
3597a29c24
Update fastText vectors example (see #1525)
...
Add option to specify language, and add note on "lang" being required to save out model
2017-11-09 14:54:39 +01:00
ines
33b84f4c39
Change clear_vectors to reset_vectors (resolves #1516)
2017-11-08 18:11:23 +01:00
ines
89bd40b821
Fix print statement in textcat training example (resolves #1515)
2017-11-08 17:17:40 +01:00
ines
a09c096d3c
Get docs ready for v2.0.0
2017-11-07 12:00:43 +01:00
ines
173b1551af
Update examples
2017-11-07 01:22:30 +01:00
ines
1b1c9105b4
Update example compatibility statements
2017-11-07 01:11:45 +01:00
ines
8fb48b9b91
Update and document new util functions
2017-11-07 00:22:43 +01:00
Matthew Honnibal
d7016d4050
Update intent parser example
2017-11-06 23:31:11 +01:00
ines
fe498b3d5e
Update training examples to use "simple style"
2017-11-06 23:14:04 +01:00
ines
c646365e2f
Port over changes and add note on compat (see #1445)
2017-11-06 13:58:34 +01:00
ines
2dca9e71a1
Add notes on catastrophic forgetting (see #1496)
2017-11-06 13:17:02 +01:00
Matthew Honnibal
717e8124fb
Update Keras sentiment analysis example
2017-11-05 17:11:00 +01:00
Matthew Honnibal
cfb83c231c
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2017-11-04 23:08:19 +01:00
Matthew Honnibal
ba0201de07
Update multiprocessing example
2017-11-04 23:07:57 +01:00
ines
70a9504560
Add inbetween print statement
2017-11-04 23:06:55 +01:00
Matthew Honnibal
e033162a1d
Update tagger training example
2017-11-01 21:49:08 +01:00
ines
8f1d3fc3ee
Update textcat example
2017-11-01 17:09:22 +01:00
Matthew Honnibal
dad8f09fba
Fix print statements in text classifier example
2017-11-01 16:34:31 +01:00
ines
bfe17b7df1
Fix begin_training if get_gold_tuples is None
2017-11-01 13:14:31 +01:00
ines
0ca152a015
Fix syntax error
2017-11-01 00:43:28 +01:00
ines
4b196fdf7f
Fix formatting
2017-11-01 00:43:22 +01:00
ines
33af6ac69a
Use even smaller example size
...
100 was still too much, so try 20 instead
2017-10-30 19:46:45 +01:00
ines
f02b0af821
Fix path and use smaller example size
...
500 was too large and caused laggy rendering
2017-10-30 19:44:35 +01:00
ines
18dde7869a
Update training data docs and add vocab JSONL
2017-10-30 19:40:05 +01:00
ines
b5643d8575
Update intent parser docs and add to usage docs
2017-10-27 04:49:05 +02:00
ines
9dfca0f2f8
Add example for custom intent parser
2017-10-27 03:55:11 +02:00
ines
4d272e25ee
Fix examples
2017-10-27 03:55:04 +02:00
ines
44f83b35bc
Update pipeline component examples to use plac
2017-10-27 02:58:14 +02:00
ines
af28ca1ba0
Move example to pipeline directory
2017-10-27 02:00:01 +02:00
ines
1d69a46cd4
Update multi-processing example and add to docs
2017-10-27 01:58:55 +02:00
ines
4eabaafd66
Update docstring and example
2017-10-27 01:50:44 +02:00
ines
ed69bd69f4
Update parallel tagging example
2017-10-27 01:48:52 +02:00
ines
096a80170d
Remove old example files
2017-10-27 01:48:39 +02:00
ines
a7b9074b4c
Update textcat training example and docs
2017-10-27 00:48:45 +02:00
ines
b61866a2e4
Update textcat example
2017-10-27 00:32:19 +02:00
ines
f81cc0bd1c
Fix usage of disable_pipes
2017-10-27 00:31:30 +02:00
ines
b7b285971f
Update examples README
2017-10-26 18:47:11 +02:00
ines
cc2917c9e8
Update fastText example and add to examples in docs
2017-10-26 18:47:02 +02:00
ines
db843735d3
Remove outdated examples
2017-10-26 18:46:25 +02:00
ines
daed7ff8fe
Update information extraction examples
2017-10-26 18:46:11 +02:00
ines
bca5372fb1
Clean up examples
2017-10-26 17:32:59 +02:00
ines
f57043e6fe
Update docstring
2017-10-26 16:29:08 +02:00
ines
b90e958975
Update tagger and parser examples and add to docs
2017-10-26 16:27:42 +02:00
ines
f1529463a8
Update tagger training example
2017-10-26 16:19:02 +02:00
ines
e44bbb5361
Remove old example
2017-10-26 16:12:41 +02:00
ines
421c3837e8
Fix formatting
2017-10-26 16:11:25 +02:00
ines
4d896171ae
Use plac annotations for arguments
2017-10-26 16:11:20 +02:00
ines
c3b681e5fb
Use plac annotations for arguments and add n_iter
2017-10-26 16:11:05 +02:00
ines
bc2c92f22d
Use plac annotations for arguments
2017-10-26 16:10:56 +02:00
ines
b5c74dbb34
Update parser training example
2017-10-26 15:15:37 +02:00
ines
586b9047fd
Use create_pipe instead of importing the entity recognizer
2017-10-26 15:15:26 +02:00
ines
d425ede7e9
Fix example
2017-10-26 15:15:08 +02:00
ines
9d58673aaf
Update train_ner example for spaCy v2.0
2017-10-26 14:24:12 +02:00
ines
e904075f35
Remove stray print statements
2017-10-26 14:24:00 +02:00
ines
c30258c3a2
Remove old example
2017-10-26 14:23:52 +02:00
ines
615c315d70
Update train_new_entity_type example to use disable_pipes
2017-10-25 14:56:53 +02:00
ines
2b8e7c45e0
Use better training data JSON example
2017-10-24 16:00:56 +02:00
ines
9bf5751064
Pretty-print JSON
2017-10-24 12:22:17 +02:00
ines
6675755005
Add training data JSON example
2017-10-24 12:05:10 +02:00
Jeroen Bobbeldijk
84c6c20d1c
Fix #1444: fix pipeline logic and wrong parameter in update call
2017-10-22 15:18:36 +02:00
Jeffrey Gerard
5ba970b495
minor cleanup
2017-10-12 12:34:46 -07:00
Jeffrey Gerard
39d3cbfdba
Bugfix example script train_ner_standalone.py, fails after training
2017-10-12 11:39:12 -07:00
ines
f4ae6763b9
Fix consistency of imports from spacy.tokens in examples
2017-10-11 02:30:40 +02:00
Matthew Honnibal
e0a9b02b67
Merge Span._ and Span.as_doc methods
2017-10-09 22:00:15 -05:00
ines
6679117000
Add pipeline component examples
2017-10-10 04:26:06 +02:00
Matthew Honnibal
e79fc41ff8
Merge pull request #1391 from explosion/feature/multilabel-textcat
...
💫 Fix multi-label support for text classification
2017-10-09 04:22:31 +02:00
Matthew Honnibal
563f46f026
Fix multi-label support for text classification
...
The TextCategorizer class is supposed to support multi-label
text classification, and allow training data to contain missing
values.
For this to work, the gradient of the loss should be 0 when labels
are missing. Instead, there was no way to actually denote "missing"
in the GoldParse class, and so the TextCategorizer class treated
the label set within gold.cats as complete.
To fix this, we change GoldParse.cats to be a dict instead of a list.
The GoldParse.cats dict should map to floats, with 1. denoting
'present' and 0. denoting 'absent'. Gradients are zeroed for categories
absent from the gold.cats dict. A nice bonus is that you can also set
values between 0 and 1 for partial membership. You can also set numeric
values, if you're using a text classification model that uses an
appropriate loss function.
Unfortunately this is a breaking change; although the functionality
was only recently introduced and hasn't been properly documented
yet. I've updated the example script accordingly.
2017-10-05 18:43:02 -05:00
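The behaviour described in the commit above, where categories absent from the gold.cats dict contribute no gradient, can be sketched in plain Python. This is an illustration only, not spaCy's TextCategorizer code: the function name `textcat_gradient` is hypothetical, and a squared-error loss is assumed so that the gradient for each label is simply the score minus the target.

```python
def textcat_gradient(scores, cats, labels):
    """Per-label gradient under an assumed squared-error loss.

    scores: model outputs, one float per label, in the order of `labels`
    cats:   dict mapping some labels to floats (1.0 present, 0.0 absent,
            values in between for partial membership)
    labels: every label the model predicts
    """
    gradient = []
    for label, score in zip(labels, scores):
        if label in cats:
            # Annotated label: push the score towards the target value.
            gradient.append(score - float(cats[label]))
        else:
            # Missing annotation: zero gradient, so nothing is learned.
            gradient.append(0.0)
    return gradient


if __name__ == "__main__":
    labels = ["POSITIVE", "NEGATIVE", "SPAM"]
    cats = {"POSITIVE": 1.0, "NEGATIVE": 0.0}  # "SPAM" is unannotated
    print(textcat_gradient([0.75, 0.25, 0.5], cats, labels))
    # -> [-0.25, 0.25, 0.0]
```

With a list of labels there is no way to tell "absent" from "not annotated"; the dict makes that distinction explicit, which is why the change was needed despite being breaking.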
Matthew Honnibal
056b08c0df
Delete obsolete nn_text_class example
2017-10-05 18:27:10 +02:00
Matthew Honnibal
f1b86dff8c
Update textcat example
2017-10-04 15:12:28 +02:00
Matthew Honnibal
79a94bc166
Update textcat example
2017-10-04 14:55:30 +02:00
Matthew Honnibal
cbb1fbef80
Update train_ner_standalone example
2017-10-03 18:49:38 +02:00
Matthew Honnibal
38286b6f07
Add example loading Fast Text vectors
2017-10-01 23:40:02 +02:00
Matthew Honnibal
f92ab03dc8
Rename phrase matcher example
2017-09-20 22:51:58 +02:00
Matthew Honnibal
01858e9b59
Fix PhraseMatcher example
2017-09-20 22:51:41 +02:00
Matthew Honnibal
027a5d8b75
Update train_ner_standalone example
2017-09-15 10:36:46 +02:00
Matthew Honnibal
683d81bb49
Update example for adding entity type
2017-09-14 16:15:59 +02:00
Matthew Honnibal
c16ef0a85c
Clarify train textcat example
2017-07-29 21:59:27 +02:00
Matthew Honnibal
54a539a113
Finish text classifier example
2017-07-23 00:34:12 +02:00
Matthew Honnibal
2bc7d87c70
Add example for training text classifier
2017-07-22 20:15:32 +02:00
ines
992559bf9a
Fix formatting and remove unused imports
2017-06-01 12:47:18 +02:00
Matthew Honnibal
5c30466c95
Update NER training example
2017-05-31 13:42:12 +02:00
akYoung
c158cdb1da
Corrections for model test example
...
The sentences of test data in sentence entailment example should be generated with integers limited to vocab_size.
2017-05-03 22:41:23 +08:00
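The constraint in the commit above, that generated test sentences may only contain integers below vocab_size, might look like the following sketch. The function name `random_sentence_ids` and its parameters are hypothetical and are not taken from the entailment example.

```python
import random


def random_sentence_ids(n_sentences, length, vocab_size, seed=0):
    """Generate dummy sentences as lists of word IDs in [0, vocab_size)."""
    rng = random.Random(seed)
    return [
        # randrange excludes the upper bound, so every ID is a valid
        # row index into an embedding table with vocab_size rows.
        [rng.randrange(vocab_size) for _ in range(length)]
        for _ in range(n_sentences)
    ]


if __name__ == "__main__":
    for sentence in random_sentence_ids(n_sentences=2, length=5, vocab_size=100):
        print(sentence)
```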
Matthew Honnibal
2da16adcc2
Add dropout option for parser and NER
...
Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.
nlp.entity.update(doc, gold, drop=0.4)
This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.
This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity).
2017-04-27 13:18:39 +02:00
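The feature dropout described in the commit above can be sketched as follows. This is a plain-Python illustration, not the parser's implementation: the function name `drop_features` and its `scale` and `rng` parameters are hypothetical. The default scale follows the commit message literally (1. / drop); the more common "inverted dropout" convention rescales the kept values by 1. / (1. - drop) instead, and that value can be passed explicitly.

```python
import random


def drop_features(values, drop, scale=None, rng=random):
    """Zero each value with probability `drop` and rescale the rest."""
    if scale is None:
        # Default taken from the commit message: multiply by 1. / drop.
        scale = 1.0 / drop if drop > 0 else 1.0
    return [0.0 if rng.random() < drop else value * scale for value in values]


if __name__ == "__main__":
    features = [1.0] * 10
    drop = 0.4
    # Scaling as stated in the commit message.
    print(drop_features(features, drop))
    # Conventional inverted-dropout scaling, passed explicitly.
    print(drop_features(features, drop, scale=1.0 / (1.0 - drop)))
```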
Matthew Honnibal
0605b95f2e
Merge branch 'master' of https://github.com/explosion/spaCy
2017-04-18 13:48:00 +02:00