Commit Graph

5318 Commits

Author SHA1 Message Date
Matthew Honnibal
2da16adcc2 Add dropout option for parser and NER
Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.

    nlp.entity.update(doc, gold, drop=0.4)

This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.
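
As a rough illustration of the behaviour described above, here is a minimal pure-Python sketch. It is not spaCy's implementation: the function name `dropout_features` and its `rng` parameter are hypothetical, and the scaling factor `1. / drop` is taken literally from this commit message. The more common "inverted dropout" convention scales surviving values by `1. / (1. - drop)` instead.

```python
import random


def dropout_features(values, drop, rng=None):
    """Zero each value with probability `drop` and rescale the survivors.

    Hypothetical helper for illustration only. The scale factor 1 / drop
    follows the wording of the commit message (1. / 0.4 for drop=0.4).
    """
    if drop <= 0.0 or drop >= 1.0:
        # Nothing sensible to do outside (0, 1): return an unchanged copy.
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / drop
    # Keep a value when the random draw is at or above `drop`, else zero it.
    return [v * scale if rng.random() >= drop else 0.0 for v in values]


if __name__ == "__main__":
    features = [1.0] * 10
    print(dropout_features(features, drop=0.4, rng=random.Random(0)))
```

With `drop=0.4`, roughly 40% of the entries come back as `0.0` and the rest are multiplied by `2.5`.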

This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity).
2017-04-27 13:18:39 +02:00
M. Z. Ferdous (Imran)
c9f9203d5f fix typo, CONLL format
tried to google about connlu format. Saw there is conll format, not connlu.
2017-04-27 16:48:54 +06:00
ines
5aa49971f9 Add French example to models docs 2017-04-27 12:08:47 +02:00
Gregory Howard
92f368f83b Removing extra spaces 2017-04-27 12:02:14 +02:00
Gregory Howard
13b6957c8e Adding unit test for tokenization in French (with title) 2017-04-27 11:53:44 +02:00
Gregory Howard
8ff4682255 correcting tokenizer exception.
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani
7a894c9ef0 Update README.rst 2017-04-27 11:25:30 +02:00
ines
034ec5710b Fix typo and add Norwegian to alpha languages 2017-04-27 11:24:21 +02:00
Ines Montani
2f918e3004 Update README.rst 2017-04-27 11:18:41 +02:00
Ines Montani
bc88f9865e Remove file (already covered in PR) 2017-04-27 11:17:30 +02:00
Ines Montani
6930ed719d Update CONTRIBUTORS.md 2017-04-27 11:17:06 +02:00
Ines Montani
7da9cefd25 Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Ines Montani
c9e592ae6c Add newline 2017-04-27 11:15:41 +02:00
Ines Montani
5942adccc2 Add newline 2017-04-27 11:15:19 +02:00
Ines Montani
4cd9269aef Add newline 2017-04-27 11:15:04 +02:00
Ines Montani
ccf13ecc21 Add newline 2017-04-27 11:14:42 +02:00
Ines Montani
03d2b0cc05 Add newline 2017-04-27 11:14:26 +02:00
Gregory Howard
44cb486849 Adding unit test for tokenization in French (with title) 2017-04-27 10:59:38 +02:00
Leif Uwe Vogelsang
13ce4c96b1 Update luvogels.md 2017-04-27 10:42:07 +02:00
Gregory Howard
ad8129cb45 Improve rules: now title-insensitive and sharing the same declaration format 2017-04-27 10:23:56 +02:00
Leif Uwe Vogelsang
e136c51393 Update Alpha_support_Norwegian bokmål.md 2017-04-26 23:24:11 +02:00
luvogels
d12a0b6431 Hooked up tokenizer tests 2017-04-26 23:21:41 +02:00
ines
100846bed3 Fix typo in model list 2017-04-26 21:40:17 +02:00
ines
05bcd61fcf Update README.rst 2017-04-26 20:51:38 +02:00
ines
375edf0bb5 Add list of models and include French 2017-04-26 20:50:27 +02:00
ines
4eacd72bc3 Move list of models to own file 2017-04-26 20:50:27 +02:00
Matthew Honnibal
f0e1606d27 Increment version 2017-04-26 20:25:41 +02:00
luvogels
b331929a7e Merge branch 'master' of https://github.com/luvogels/spaCy 2017-04-26 19:15:48 +02:00
luvogels
8de59ce3b9 Added tokenizer tests 2017-04-26 19:10:18 +02:00
Matthew Honnibal
4d98511db7 Make Span hashable. Closes #1019 2017-04-26 19:01:05 +02:00
Matthew Honnibal
24c4c51f13 Try to make test999 less flakey 2017-04-26 18:42:06 +02:00
Leif Uwe Vogelsang
460094bf09 Update __init__.py 2017-04-26 18:27:55 +02:00
luvogels
cbfe4920bb Added contributor agreement and pull request doc 2017-04-26 18:02:34 +02:00
ines
527d51ac9a Fetch shortcuts from GitHub and improve error handling 2017-04-26 18:00:28 +02:00
ines
c2006166d3 Update list of available models and info 2017-04-26 16:03:41 +02:00
ines
5a470367df Add mixin for model row in model docs 2017-04-26 16:03:17 +02:00
ines
5d598b6747 Add star icon 2017-04-26 16:03:05 +02:00
ines
6c4f3c6fc2 Allow styles arguments on row mixin 2017-04-26 16:02:59 +02:00
ines
99558023fd Add divider table row style 2017-04-26 16:02:44 +02:00
ines
e6bdf5bc5c Update adding language / training docs (see #966)
Add data examples and more info on training and CLI commands
2017-04-26 14:01:19 +02:00
ines
ae2b77db1b Fix info on naming conventions 2017-04-26 14:01:19 +02:00
Gregory Howard
ed5f094451 Adding insensitive lemmatisation test 2017-04-25 18:07:02 +02:00
ghoward
26e31afc18 Renaming tests 2017-04-25 17:46:01 +02:00
ghoward
c085c2d391 Adding some unit tests 2017-04-25 17:44:16 +02:00
ghoward
55c6910f90 Lookup table for languages in spaCy.
Need to find another name for lemmatizerlookup. I was not inspired.
Trying to use the new files in the fr language.
2017-04-24 16:39:00 +02:00
Matthew Honnibal
37398e4ed3 Merge pull request #1014 from julien-c/confusion-deeplearning
Make object of the deep learning tutorial clearer
2017-04-24 12:06:43 +02:00
Julien Chaumond
f997bceb07 Make object of the deep learning tutorial clearer
This is a great tutorial, but I think it is weirdly explained in its current form. The largest part of the code is about implementing the actual sentiment analysis model, not about counting entities (which is not even present in the `deep_learning_keras.py` script in `examples`).
2017-04-24 11:55:41 +02:00
Matthew Honnibal
c4be9c36fe Fix unicode header in tests 2017-04-24 10:09:01 +02:00
Matthew Honnibal
65f10b53e5 Fix test 2017-04-24 00:25:55 +02:00
Matthew Honnibal
70a43858e1 Fix flakey test 2017-04-24 00:06:30 +02:00