Commit Graph

5638 Commits

Author SHA1 Message Date
Gregory Howard
0e8c41ea4f Adding method lemmatizer for every class 2017-05-03 12:14:42 +02:00
Gregory Howard
32ca07989e adding export japanese 2017-05-03 11:07:29 +02:00
Grégory Howard
2e10bc6d8c Merge branch 'master' into master 2017-05-03 11:05:30 +02:00
Grégory Howard
f9d7144224 Merge branch 'master' into master 2017-05-03 11:04:51 +02:00
Gregory Howard
f2ab7d77b4 Lazy imports language 2017-05-03 11:01:42 +02:00
Ines Montani
6e1fad92a1 Update CONTRIBUTORS.md 2017-05-03 10:01:40 +02:00
ines
e2380d8789 Update README.rst 2017-05-03 10:00:04 +02:00
ines
f9384b0fbd Update alpha languages and add aside for tokenizer dependencies 2017-05-03 09:58:31 +02:00
Ines Montani
f0d7a87e18 Merge pull request #1035 from uetchy/japanese-support
Japanese support
2017-05-03 09:44:54 +02:00
Ines Montani
3ea23a3f4d Fix formatting 2017-05-03 09:44:38 +02:00
Ines Montani
d730eb0c0d Raise custom ImportError if importing janome fails 2017-05-03 09:43:29 +02:00
Ines Montani
949ad6594b Add newline 2017-05-03 09:38:43 +02:00
Ines Montani
d12ca587ea Add newline 2017-05-03 09:38:29 +02:00
Ines Montani
8676cd0135 Add newline 2017-05-03 09:38:07 +02:00
Yasuaki Uechi
0e7a9b9fac Add Japanese to 'Alpha support' section 2017-05-03 13:56:45 +09:00
Yasuaki Uechi
c8f83aeb87 Add basic japanese support 2017-05-03 13:56:21 +09:00
Gregory Howard
c0afcd22bb Merge remote-tracking branch 'remotes/upstream/master' 2017-04-27 14:42:54 +02:00
Ines Montani
f26a3b5a50 Merge pull request #1025 from Ferdous-Al-Imran/master 2017-04-27 14:36:37 +02:00
Ines Montani
fb96f88b59 Update info on CoNLL format and include link 2017-04-27 14:36:08 +02:00
Matthew Honnibal
31ec9e1371 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-27 13:21:39 +02:00
Matthew Honnibal
2da16adcc2 Add dropout option for parser and NER
Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.

    nlp.entity.update(doc, gold, drop=0.4)

This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.

This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity).
2017-04-27 13:18:39 +02:00
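
A minimal, library-free sketch of the feature dropout described in the commit message above. It is not the spaCy implementation: the function name `dropout_features` and its `rng` and `scale` parameters are hypothetical, introduced only for illustration. By default it rescales surviving values by 1 / drop, taken literally from the message; note that the more common "inverted dropout" convention rescales by 1 / (1 - drop), which can be passed in via `scale`.

```python
import random


def dropout_features(values, drop, rng=None, scale=None):
    """Zero each value with probability `drop` and rescale the survivors.

    Hypothetical helper for illustration only, not part of spaCy.
    """
    rng = rng or random.Random()
    # Treat out-of-range rates as "no dropout" and return a copy unchanged.
    if drop <= 0.0 or drop >= 1.0:
        return list(values)
    if scale is None:
        # Assumption: scaling factor as literally stated in the commit message.
        scale = 1.0 / drop
    out = []
    for value in values:
        if rng.random() < drop:
            out.append(0.0)            # feature dropped
        else:
            out.append(value * scale)  # feature kept and rescaled
    return out


# Usage: drop roughly 40% of the features, as in the drop=0.4 example above.
features = [1.0, 0.5, 2.0, 1.0, 3.0]
print(dropout_features(features, 0.4, rng=random.Random(0)))

# The conventional inverted-dropout scaling, for comparison.
print(dropout_features(features, 0.4, rng=random.Random(0), scale=1.0 / (1.0 - 0.4)))
```

Passing a seeded `random.Random` keeps the sketch reproducible; the choice of scaling factor determines whether the expected feature magnitude is preserved.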
M. Z. Ferdous (Imran)
c9f9203d5f fix typo, CONLL format
tried to google about connlu format. Saw there is conll format, not connlu.
2017-04-27 16:48:54 +06:00
ines
5aa49971f9 Add French example to models docs 2017-04-27 12:08:47 +02:00
Gregory Howard
92f368f83b Removing extra spaces 2017-04-27 12:02:14 +02:00
Gregory Howard
13b6957c8e Adding unit test for tokenization in French (with title) 2017-04-27 11:53:44 +02:00
Gregory Howard
8ff4682255 correcting tokenizer exception.
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani
7a894c9ef0 Update README.rst 2017-04-27 11:25:30 +02:00
ines
034ec5710b Fix typo and add Norwegian to alpha languages 2017-04-27 11:24:21 +02:00
Ines Montani
2f918e3004 Update README.rst 2017-04-27 11:18:41 +02:00
Ines Montani
bc88f9865e Remove file (already covered in PR) 2017-04-27 11:17:30 +02:00
Ines Montani
6930ed719d Update CONTRIBUTORS.md 2017-04-27 11:17:06 +02:00
Ines Montani
7da9cefd25 Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Ines Montani
c9e592ae6c Add newline 2017-04-27 11:15:41 +02:00
Ines Montani
5942adccc2 Add newline 2017-04-27 11:15:19 +02:00
Ines Montani
4cd9269aef Add newline 2017-04-27 11:15:04 +02:00
Ines Montani
ccf13ecc21 Add newline 2017-04-27 11:14:42 +02:00
Ines Montani
03d2b0cc05 Add newline 2017-04-27 11:14:26 +02:00
Gregory Howard
44cb486849 Adding unit test for tokenization in French (with title) 2017-04-27 10:59:38 +02:00
Leif Uwe Vogelsang
13ce4c96b1 Update luvogels.md 2017-04-27 10:42:07 +02:00
Gregory Howard
ad8129cb45 Improvement of rules: now title insensitive and with the same declaration format 2017-04-27 10:23:56 +02:00
Leif Uwe Vogelsang
e136c51393 Update Alpha_support_Norwegian bokmål.md 2017-04-26 23:24:11 +02:00
luvogels
d12a0b6431 Hooked up tokenizer tests 2017-04-26 23:21:41 +02:00
ines
100846bed3 Fix typo in model list 2017-04-26 21:40:17 +02:00
ines
05bcd61fcf Update README.rst 2017-04-26 20:51:38 +02:00
ines
375edf0bb5 Add list of models and include French 2017-04-26 20:50:27 +02:00
ines
4eacd72bc3 Move list of models to own file 2017-04-26 20:50:27 +02:00
Matthew Honnibal
f0e1606d27 Increment version 2017-04-26 20:25:41 +02:00
luvogels
b331929a7e Merge branch 'master' of https://github.com/luvogels/spaCy 2017-04-26 19:15:48 +02:00
luvogels
8de59ce3b9 Added tokenizer tests 2017-04-26 19:10:18 +02:00
Matthew Honnibal
4d98511db7 Make Span hashable. Closes #1019 2017-04-26 19:01:05 +02:00