Ines Montani
|
e97a30b99a
|
Merge pull request #885 from PySUST/master
[Bengali] Spell checked and add new stop words
|
2017-03-12 13:20:59 +01:00 |
|
ines
|
66c1f194f9
|
Use consistent unicode declarations
|
2017-03-12 13:07:28 +01:00 |
|
shuvanon
|
91cb4cdb2b
|
Sort stop_words
|
2017-03-12 17:55:51 +06:00 |
|
shuvanon
|
784f6cfa49
|
Update stop_words
|
2017-03-12 17:41:01 +06:00 |
|
shuvanon
|
73cc17078e
|
Merge branch 'master' of https://github.com/PySUST/spaCy
|
2017-03-12 14:52:17 +06:00 |
|
shuvanon
|
35ec7135bb
|
Spell checked and add new stop words
|
2017-03-12 14:51:34 +06:00 |
|
Matthew Honnibal
|
fa23278ee3
|
Add classes for beam parser and beam NER
|
2017-03-11 12:45:37 -06:00 |
|
Matthew Honnibal
|
6c4108c073
|
Add header for beam parser
|
2017-03-11 12:45:12 -06:00 |
|
Matthew Honnibal
|
4382f175b3
|
Squelch compiler warnings
|
2017-03-11 12:44:43 -06:00 |
|
Matthew Honnibal
|
ea2592879f
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2017-03-11 11:13:37 -06:00 |
|
Matthew Honnibal
|
1224c4d3c6
|
Improve output on trainer
|
2017-03-11 11:12:48 -06:00 |
|
Matthew Honnibal
|
b438dfd3f3
|
Add itn argument to tagger.update
|
2017-03-11 11:12:21 -06:00 |
|
Matthew Honnibal
|
931feb3360
|
Allow beam parsing for NER
|
2017-03-11 11:12:01 -06:00 |
|
Matthew Honnibal
|
f77a5bb60a
|
Switch back to greedy parser
|
2017-03-11 11:11:30 -06:00 |
|
Matthew Honnibal
|
ca9c8c57c0
|
Add iteration argument to parser.update
|
2017-03-11 07:00:47 -06:00 |
|
Matthew Honnibal
|
dcce9ca3f3
|
Use beam parser
|
2017-03-11 07:00:20 -06:00 |
|
Matthew Honnibal
|
e30ffdd003
|
Use ftrl optimizer in tagger
|
2017-03-11 06:59:13 -06:00 |
|
Matthew Honnibal
|
d59c6926c1
|
I think this fixes the segfault
|
2017-03-11 06:58:34 -06:00 |
|
Matthew Honnibal
|
318b9e32ff
|
WIP on beam parser. Currently segfaults.
|
2017-03-11 06:19:52 -06:00 |
|
Matthew Honnibal
|
b0d80dc9ae
|
Update name of 'train' function in BeamParser
|
2017-03-10 14:35:43 -06:00 |
|
Matthew Honnibal
|
d11f1a4ddf
|
Record negative costs in non-monotonic arc eager oracle
|
2017-03-10 11:22:04 -06:00 |
|
Matthew Honnibal
|
ecf91a2dbb
|
Support beam parser
|
2017-03-10 11:21:21 -06:00 |
|
Ines Montani
|
a16aff17aa
|
Merge pull request #876 from PySUST/master
[Bangla] Update "tokenizer_exceptions.py"
|
2017-03-10 14:46:00 +01:00 |
|
ines
|
10e29189ac
|
Adjust URL testcases and xfail problems (instead of comment)
|
2017-03-10 14:22:50 +01:00 |
|
ines
|
b04893a059
|
Make regex locale-independent for Python 2
|
2017-03-10 14:21:57 +01:00 |
|
Matthew Honnibal
|
ea53647362
|
Merge branch 'develop'
|
2017-03-10 02:49:39 -06:00 |
|
Ines Montani
|
1c40890321
|
Add missing comma
Should fix Travis build error
|
2017-03-10 09:34:54 +01:00 |
|
Shuvanon Razik
|
c251703428
|
Update abbreviations
|
2017-03-10 10:45:01 +06:00 |
|
Matthew Honnibal
|
b5247c49eb
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-03-09 18:45:43 -06:00 |
|
Matthew Honnibal
|
798450136d
|
Set L1 penalty to 0 in tagger.
|
2017-03-09 18:43:47 -06:00 |
|
Matthew Honnibal
|
c62da02344
|
Use ftrl training, to learn compressed model.
|
2017-03-09 18:43:21 -06:00 |
|
Matthew Honnibal
|
f71eeef9bb
|
Pass path argument to end_training
|
2017-03-09 18:42:40 -06:00 |
|
Dan Rapp
|
123d3f2d38
|
Fix error in test case parameterization
|
2017-03-09 12:18:21 -07:00 |
|
Dan Rapp
|
b9307dfcd7
|
Merge branch 'master' into rappdw/tokenizer_exceptions_url_fix
|
2017-03-09 11:42:14 -07:00 |
|
Dan Rapp
|
3b1df3808d
|
Issue #840 - URL pattern too broad
|
2017-03-09 11:39:39 -07:00 |
|
Matthew Honnibal
|
5b0b968d13
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-03-08 15:03:10 +01:00 |
|
Matthew Honnibal
|
0ac3d27689
|
Fix handling of trailing whitespace
Fix off-by-one error that meant trailing spaces were being dropped.
Closes #792
|
2017-03-08 15:01:40 +01:00 |
|
ines
|
c2e3e651b8
|
Re-add regression test for #859
|
2017-03-08 14:36:09 +01:00 |
|
Matthew Honnibal
|
0a6d7ca200
|
Fix spacing after token_match
The boolean flag indicating a space after the token was
being set incorrectly after the token_match regex was applied.
Fixes #859.
|
2017-03-08 14:33:32 +01:00 |
|
shuvanon
|
85438aee1b
|
Update tokenizer
|
2017-03-08 17:29:39 +06:00 |
|
shuvanon
|
45bc78461c
|
Update tokenizer
|
2017-03-08 17:27:12 +06:00 |
|
Matthew Honnibal
|
cd33b39a04
|
Fix 2/3 problem for json save/load
|
2017-03-08 01:39:13 +01:00 |
|
Matthew Honnibal
|
40703988bc
|
Use FTRL training in parser
|
2017-03-08 01:38:51 +01:00 |
|
Matthew Honnibal
|
d108534dc2
|
Fix 2/3 problems for training
|
2017-03-08 01:37:52 +01:00 |
|
Matthew Honnibal
|
d03d6a13f1
|
Merge branch 'rominf-ud20' into develop
|
2017-03-07 21:48:56 +01:00 |
|
Matthew Honnibal
|
f7374d0b86
|
Merge branch 'ud20' of https://github.com/rominf/spaCy into rominf-ud20
|
2017-03-07 21:48:37 +01:00 |
|
Matthew Honnibal
|
16670d3251
|
Xfail the vocab pickling for now
|
2017-03-07 21:43:28 +01:00 |
|
Matthew Honnibal
|
a89c3500f6
|
Fixes to hacky vocab pickling
|
2017-03-07 20:58:55 +01:00 |
|
Matthew Honnibal
|
d814892805
|
Hackish pickle support for Vocab.
|
2017-03-07 20:25:12 +01:00 |
|
Matthew Honnibal
|
26614e028f
|
Add hacky support for StringCFile, to make pickling easier.
|
2017-03-07 20:24:37 +01:00 |
|