Francisco Aranda
70a2180199
fix(spanish sentence segmentation): remove tokenizer exceptions that break sentence segmentation. Aligned with training corpus
2017-06-02 08:19:57 +02:00
Francisco Aranda
5b385e7d78
feat(spanish model): add the Spanish noun chunker
2017-06-02 08:14:06 +02:00
oeg
cdaefae60a
feature(populate_vocab): Enable pruning out rare words from clusters data
2017-05-12 16:15:19 +02:00
Ines Montani
6e1fad92a1
Update CONTRIBUTORS.md
2017-05-03 10:01:40 +02:00
ines
e2380d8789
Update README.rst
2017-05-03 10:00:04 +02:00
ines
f9384b0fbd
Update alpha languages and add aside for tokenizer dependencies
2017-05-03 09:58:31 +02:00
Ines Montani
f0d7a87e18
Merge pull request #1035 from uetchy/japanese-support
Japanese support
2017-05-03 09:44:54 +02:00
Ines Montani
3ea23a3f4d
Fix formatting
2017-05-03 09:44:38 +02:00
Ines Montani
d730eb0c0d
Raise custom ImportError if importing janome fails
2017-05-03 09:43:29 +02:00
Ines Montani
949ad6594b
Add newline
2017-05-03 09:38:43 +02:00
Ines Montani
d12ca587ea
Add newline
2017-05-03 09:38:29 +02:00
Ines Montani
8676cd0135
Add newline
2017-05-03 09:38:07 +02:00
Yasuaki Uechi
0e7a9b9fac
Add Japanese to 'Alpha support' section
2017-05-03 13:56:45 +09:00
Yasuaki Uechi
c8f83aeb87
Add basic Japanese support
2017-05-03 13:56:21 +09:00
Ines Montani
f26a3b5a50
Merge pull request #1025 from Ferdous-Al-Imran/master
2017-04-27 14:36:37 +02:00
Ines Montani
fb96f88b59
Update info on CoNLL format and include link
2017-04-27 14:36:08 +02:00
Matthew Honnibal
31ec9e1371
Merge branch 'master' of https://github.com/explosion/spaCy
2017-04-27 13:21:39 +02:00
Matthew Honnibal
2da16adcc2
Add dropout option for parser and NER
Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.
    nlp.entity.update(doc, gold, drop=0.4)
This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.
This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity).
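
As a rough illustration of the feature dropout described above, here is a minimal sketch in plain Python. It is not spaCy's implementation: the function name `apply_feature_dropout` and its `scale` and `rng` parameters are hypothetical, introduced only for this example. The scale factor is passed in explicitly, so the call below uses the `1. / 0.4` value stated in this message; many dropout implementations instead rescale survivors by `1 / (1 - drop)`.

```python
import random


def apply_feature_dropout(values, drop, scale, rng=None):
    """Zero each value with probability `drop`; multiply survivors by `scale`.

    Hypothetical helper for illustration only, not part of spaCy.
    """
    rng = rng or random.Random()
    out = []
    for value in values:
        if rng.random() < drop:
            # This feature is dropped for the current update.
            out.append(0.0)
        else:
            # Surviving features are rescaled by the caller-supplied factor.
            out.append(value * scale)
    return out


features = [1.0, 2.0, 3.0, 4.0, 5.0]
# drop=0.4 with the 1. / 0.4 scale factor quoted in the commit message.
dropped = apply_feature_dropout(
    features, drop=0.4, scale=1.0 / 0.4, rng=random.Random(0)
)
print(dropped)
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing training runs on small data sets.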
2017-04-27 13:18:39 +02:00
M. Z. Ferdous (Imran)
c9f9203d5f
fix typo, CONLL format
Tried to google the connlu format. Found that there is a conll format, not connlu.
2017-04-27 16:48:54 +06:00
ines
5aa49971f9
Add French example to models docs
2017-04-27 12:08:47 +02:00
Ines Montani
7a894c9ef0
Update README.rst
2017-04-27 11:25:30 +02:00
ines
034ec5710b
Fix typo and add Norwegian to alpha languages
2017-04-27 11:24:21 +02:00
Ines Montani
2f918e3004
Update README.rst
2017-04-27 11:18:41 +02:00
Ines Montani
bc88f9865e
Remove file (already covered in PR)
2017-04-27 11:17:30 +02:00
Ines Montani
6930ed719d
Update CONTRIBUTORS.md
2017-04-27 11:17:06 +02:00
Ines Montani
7da9cefd25
Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Ines Montani
c9e592ae6c
Add newline
2017-04-27 11:15:41 +02:00
Ines Montani
5942adccc2
Add newline
2017-04-27 11:15:19 +02:00
Ines Montani
4cd9269aef
Add newline
2017-04-27 11:15:04 +02:00
Ines Montani
ccf13ecc21
Add newline
2017-04-27 11:14:42 +02:00
Ines Montani
03d2b0cc05
Add newline
2017-04-27 11:14:26 +02:00
Leif Uwe Vogelsang
13ce4c96b1
Update luvogels.md
2017-04-27 10:42:07 +02:00
Leif Uwe Vogelsang
e136c51393
Update Alpha_support_Norwegian bokmål.md
2017-04-26 23:24:11 +02:00
luvogels
d12a0b6431
Hooked up tokenizer tests
2017-04-26 23:21:41 +02:00
ines
100846bed3
Fix typo in model list
2017-04-26 21:40:17 +02:00
ines
05bcd61fcf
Update README.rst
2017-04-26 20:51:38 +02:00
ines
375edf0bb5
Add list of models and include French
2017-04-26 20:50:27 +02:00
ines
4eacd72bc3
Move list of models to own file
2017-04-26 20:50:27 +02:00
Matthew Honnibal
f0e1606d27
Increment version
2017-04-26 20:25:41 +02:00
luvogels
b331929a7e
Merge branch 'master' of https://github.com/luvogels/spaCy
2017-04-26 19:15:48 +02:00
luvogels
8de59ce3b9
Added tokenizer tests
2017-04-26 19:10:18 +02:00
Matthew Honnibal
4d98511db7
Make Span hashable. Closes #1019
2017-04-26 19:01:05 +02:00
Matthew Honnibal
24c4c51f13
Try to make test999 less flaky
2017-04-26 18:42:06 +02:00
Leif Uwe Vogelsang
460094bf09
Update __init__.py
2017-04-26 18:27:55 +02:00
luvogels
cbfe4920bb
Added contributor agreement and pull request doc
2017-04-26 18:02:34 +02:00
ines
527d51ac9a
Fetch shortcuts from GitHub and improve error handling
2017-04-26 18:00:28 +02:00
ines
c2006166d3
Update list of available models and info
2017-04-26 16:03:41 +02:00
ines
5a470367df
Add mixin for model row in model docs
2017-04-26 16:03:17 +02:00
ines
5d598b6747
Add star icon
2017-04-26 16:03:05 +02:00
ines
6c4f3c6fc2
Allow styles arguments on row mixin
2017-04-26 16:02:59 +02:00