Eric Zhao
aafdf6ffb8
Add option to use label kwarg to determine ent_type in doc.merge
2017-03-28 23:35:03 -07:00
Em
9c809efc25
Removed mapStr
2017-03-11 16:23:26 -08:00
Em
1bb364a3b5
Adding venv to .gitignore
2017-03-10 16:52:04 -08:00
Em
426d17167f
Added string manipulation for spans
2017-03-10 16:50:02 -08:00
Ines Montani
a16aff17aa
Merge pull request #876 from PySUST/master
...
[Bangla] Update "tokenizer_exceptions.py"
2017-03-10 14:46:00 +01:00
ines
10e29189ac
Adjust URL testcases and xfail problems (instead of comment)
2017-03-10 14:22:50 +01:00
ines
b04893a059
Make regex locale-independent for Python 2
2017-03-10 14:21:57 +01:00
Ines Montani
9019658b40
Update CONTRIBUTORS.md
2017-03-10 13:37:41 +01:00
Ines Montani
1c40890321
Add missing comma
...
Should fix Travis build error
2017-03-10 09:34:54 +01:00
Shuvanon Razik
c251703428
Update abbreviations
2017-03-10 10:45:01 +06:00
Matthew Honnibal
dd13aacc09
Merge pull request #879 from rappdw/rappdw/tokenizer_exceptions_url_fix
...
Fix for Issue #840 - URL pattern too broad
2017-03-09 20:43:11 +01:00
Dan Rapp
123d3f2d38
Fix error in test case parameterization
2017-03-09 12:18:21 -07:00
Dan Rapp
b9307dfcd7
Merge branch 'master' into rappdw/tokenizer_exceptions_url_fix
2017-03-09 11:42:14 -07:00
Dan Rapp
3b1df3808d
Issue #840 - URL pattern too broad
2017-03-09 11:39:39 -07:00
shuvanon
85438aee1b
update tokenizer
2017-03-08 17:29:39 +06:00
shuvanon
45bc78461c
update tokenizer
2017-03-08 17:27:12 +06:00
ines
dc32e3ecb3
Fix link
2017-03-08 11:37:04 +01:00
ines
758335452d
Update installation instructions and fix formatting
2017-03-08 11:36:00 +01:00
Ines Montani
34801a0725
Update README.rst
2017-03-08 11:08:09 +01:00
ines
004c4c9566
Update installation docs
...
Include conda and virtualenv info for pip, add instructions for
downloading models manually and add details and fab commands to
"Compile from source" section.
2017-03-07 18:52:22 +01:00
Ines Montani
57d70ea3e1
Update README.rst
2017-03-07 17:59:30 +01:00
Matthew Honnibal
3a5f726208
Merge pull request #874 from badbye/patch-1
...
**Documentation**: Edit example code
2017-03-07 15:31:28 +01:00
yalei
27c0e6226b
Edit example code
...
The original code forgot to import the `random` module and the `EntityRecognizer` module.
2017-03-07 18:07:40 +08:00
Ines Montani
f710fc3f2d
Merge pull request #873 from banglakit/bn-tests
...
Add tests for Bengali
2017-03-05 12:13:49 +01:00
Aniruddha Adhikary
696215a3fb
add tests for Bengali
2017-03-05 11:25:12 +06:00
Ines Montani
3c1411226d
Update CONTRIBUTORS.md
2017-03-04 12:31:51 +01:00
Ines Montani
bb959692f5
Merge pull request #872 from banglakit/bn-improvements
...
[Bengali] basic tag map, morph, lemma rules and exceptions
2017-03-04 11:36:24 +01:00
Aniruddha Adhikary
8f3bfe9bfc
[Bengali] basic tag map, morph, lemma rules and exceptions
2017-03-04 12:36:59 +06:00
Ines Montani
33efe77392
Update badges and add info about conda (see #778)
2017-03-03 19:15:56 +01:00
ines
8dff040032
Revert "Add regression test for #859"
...
This reverts commit c4f16c66d1.
2017-03-01 21:56:20 +01:00
ines
c4f16c66d1
Add regression test for #859
2017-03-01 16:07:27 +01:00
ines
d25f17f139
Add Bengali to list of languages (see #865)
2017-03-01 15:59:21 +01:00
Matthew Honnibal
0f74002a26
Merge pull request #865 from banglakit/bn
...
Add basic Bengali language support
2017-03-01 15:25:58 +01:00
Aniruddha Adhikary
d91be7aed4
add punctuations for Bengali
2017-02-28 21:07:14 +06:00
Aniruddha Adhikary
5a4fc09576
add basic Bengali support
2017-02-28 07:48:37 +06:00
Matthew Honnibal
cc9b2b74e3
Merge branch 'french-tokenizer-exceptions'
2017-02-27 11:44:39 +01:00
Matthew Honnibal
bd4375a2e6
Remove comment
2017-02-27 11:44:26 +01:00
Matthew Honnibal
e7e22d8be6
Move import within get_exceptions() function, to speed import
2017-02-27 11:34:48 +01:00
Matthew Honnibal
34bcc8706d
Merge branch 'french-tokenizer-exceptions'
2017-02-27 11:21:21 +01:00
Matthew Honnibal
0aaa546435
Fix test after updating the French tokenizer stuff
2017-02-27 11:20:47 +01:00
Matthew Honnibal
26446aa728
Avoid loading all French exceptions on import
...
Move exceptions loading behind a get_tokenizer_exceptions() function
for French, instead of loading into the top-level namespace. This
cuts import times from 0.6s to 0.2s, at the expense of making the
French data a little different from the others (there's no top-level
TOKENIZER_EXCEPTIONS variable.) The current solution feels somewhat
unsatisfying.
2017-02-25 11:55:00 +01:00
Ines Montani
f81b985f7f
Update CONTRIBUTING.md
2017-02-24 20:07:05 +01:00
ines
2b07ab7db4
Add feature scheme to API docs (see #857, #739)
2017-02-24 18:26:32 +01:00
ines
376c5813a7
Remove print statements from test
2017-02-24 18:26:32 +01:00
ines
8ddad178f6
Add book and tutorial
2017-02-24 18:26:32 +01:00
ines
00728a23f0
Fix path in gitignore
2017-02-24 18:26:32 +01:00
ines
7c1260e98c
Add regression test
2017-02-24 18:22:49 +01:00
ines
0e2e331b58
Convert exceptions to Python list
2017-02-24 18:22:40 +01:00
ines
51eb190ef4
Remove print statements from test
2017-02-24 17:41:12 +01:00
ines
056b2466e3
Add book and tutorial
2017-02-24 17:39:27 +01:00