ines
|
1da29a7146
|
Use new Lemmatizer data and remove file import
Since there's currently only an English lemmatizer, the global
Lemmatizer imports from spacy.en. This is unideal and still needs to be
fixed.
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
0957737ee8
|
Add Python-formatted lemmatizer data and rules
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
c89e30d1a3
|
Add test for English time exceptions ("1a.m." etc.)
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
ce9568af84
|
Move English time exceptions ("1a.m." etc.) and refactor
|
2017-03-12 13:58:22 +01:00 |
|
ines
|
6b30541774
|
Fix formatting
|
2017-03-12 13:58:22 +01:00 |
|
Ines Montani
|
e9524b7647
|
Update CONTRIBUTORS.md
|
2017-03-12 13:22:30 +01:00 |
|
Ines Montani
|
e97a30b99a
|
Merge pull request #885 from PySUST/master
[Bengali] Spell checked and add new stop words
|
2017-03-12 13:20:59 +01:00 |
|
ines
|
66c1f194f9
|
Use consistent unicode declarations
|
2017-03-12 13:07:28 +01:00 |
|
shuvanon
|
91cb4cdb2b
|
Sort stop_words
|
2017-03-12 17:55:51 +06:00 |
|
shuvanon
|
784f6cfa49
|
Update stop_words
|
2017-03-12 17:41:01 +06:00 |
|
shuvanon
|
8a2d22222d
|
filled up CONTRIBUTOR_AGREEMENT.md
|
2017-03-12 17:07:55 +06:00 |
|
shuvanon
|
73cc17078e
|
Merge branch 'master' of https://github.com/PySUST/spaCy
|
2017-03-12 14:52:17 +06:00 |
|
shuvanon
|
35ec7135bb
|
Spell checked and add new stop words
|
2017-03-12 14:51:34 +06:00 |
|
Em
|
9c809efc25
|
Removed mapStr
|
2017-03-11 16:23:26 -08:00 |
|
Matthew Honnibal
|
fa23278ee3
|
Add classes for beam parser and beam NER
|
2017-03-11 12:45:37 -06:00 |
|
Matthew Honnibal
|
cb39b6e337
|
Require recent thinc
|
2017-03-11 12:45:22 -06:00 |
|
Matthew Honnibal
|
6c4108c073
|
Add header for beam parser
|
2017-03-11 12:45:12 -06:00 |
|
Matthew Honnibal
|
4382f175b3
|
Squelch compiler warnings
|
2017-03-11 12:44:43 -06:00 |
|
Matthew Honnibal
|
93ab888d1d
|
Require recent preshed
|
2017-03-11 12:33:56 -06:00 |
|
Matthew Honnibal
|
ea2592879f
|
Merge branch 'master' of https://github.com/explosion/spaCy
|
2017-03-11 11:13:37 -06:00 |
|
Matthew Honnibal
|
1224c4d3c6
|
Improve output on trainer
|
2017-03-11 11:12:48 -06:00 |
|
Matthew Honnibal
|
b438dfd3f3
|
Add itn argument to tagger.update
|
2017-03-11 11:12:21 -06:00 |
|
Matthew Honnibal
|
931feb3360
|
Allow beam parsing for NER
|
2017-03-11 11:12:01 -06:00 |
|
Matthew Honnibal
|
f77a5bb60a
|
Switch back to greedy parser
|
2017-03-11 11:11:30 -06:00 |
|
Matthew Honnibal
|
a155482fda
|
Improve printing in train_ud script
|
2017-03-11 11:11:05 -06:00 |
|
Ines Montani
|
dae0701bbd
|
Fix typo
|
2017-03-11 16:43:51 +01:00 |
|
Matthew Honnibal
|
ca9c8c57c0
|
Add iteration argument to parser.update
|
2017-03-11 07:00:47 -06:00 |
|
Matthew Honnibal
|
dcce9ca3f3
|
Use beam parser
|
2017-03-11 07:00:20 -06:00 |
|
Matthew Honnibal
|
e30ffdd003
|
Use ftrl optimizer in tagger
|
2017-03-11 06:59:13 -06:00 |
|
Matthew Honnibal
|
d59c6926c1
|
I think this fixes the segfault
|
2017-03-11 06:58:34 -06:00 |
|
Matthew Honnibal
|
318b9e32ff
|
WIP on beam parser. Currently segfaults.
|
2017-03-11 06:19:52 -06:00 |
|
Em
|
1bb364a3b5
|
Adding venv to .gitignore
|
2017-03-10 16:52:04 -08:00 |
|
Em
|
426d17167f
|
Added string manipulation for spans
|
2017-03-10 16:50:02 -08:00 |
|
Matthew Honnibal
|
b0d80dc9ae
|
Update name of 'train' function in BeamParser
|
2017-03-10 14:35:43 -06:00 |
|
Matthew Honnibal
|
0ed2afde89
|
Compile beam parser
|
2017-03-10 11:22:22 -06:00 |
|
Matthew Honnibal
|
d11f1a4ddf
|
Record negative costs in non-monotonic arc eager oracle
|
2017-03-10 11:22:04 -06:00 |
|
Matthew Honnibal
|
ecf91a2dbb
|
Support beam parser
|
2017-03-10 11:21:21 -06:00 |
|
Ines Montani
|
a16aff17aa
|
Merge pull request #876 from PySUST/master
[Bangla] Update "tokenizer_exceptions.py"
|
2017-03-10 14:46:00 +01:00 |
|
ines
|
10e29189ac
|
Adjust URL testcases and xfail problems (instead of comment)
|
2017-03-10 14:22:50 +01:00 |
|
ines
|
b04893a059
|
Make regex locale-independent for Python 2
|
2017-03-10 14:21:57 +01:00 |
|
Ines Montani
|
9019658b40
|
Update CONTRIBUTORS.md
|
2017-03-10 13:37:41 +01:00 |
|
Matthew Honnibal
|
ea53647362
|
Merge branch 'develop'
|
2017-03-10 02:49:39 -06:00 |
|
Ines Montani
|
1c40890321
|
Add missing comma
Should fix Travis build error
|
2017-03-10 09:34:54 +01:00 |
|
Shuvanon Razik
|
c251703428
|
Update abbreviations
|
2017-03-10 10:45:01 +06:00 |
|
Matthew Honnibal
|
b5247c49eb
|
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
|
2017-03-09 18:45:43 -06:00 |
|
Matthew Honnibal
|
35124b144a
|
Add L1 penalty option to parser
|
2017-03-09 18:44:53 -06:00 |
|
Matthew Honnibal
|
798450136d
|
Set L1 penalty to 0 in tagger.
|
2017-03-09 18:43:47 -06:00 |
|
Matthew Honnibal
|
c62da02344
|
Use ftrl training, to learn compressed model.
|
2017-03-09 18:43:21 -06:00 |
|
Matthew Honnibal
|
f71eeef9bb
|
Pass path argument to end_training
|
2017-03-09 18:42:40 -06:00 |
|
Matthew Honnibal
|
dd13aacc09
|
Merge pull request #879 from rappdw/rappdw/tokenizer_exceptions_url_fix
Fix for Issue #840 - URL pattern too broad
|
2017-03-09 20:43:11 +01:00 |
|