shuvanon
45bc78461c
update tokenizer
2017-03-08 17:27:12 +06:00
Ines Montani
34801a0725
Update README.rst
2017-03-08 11:08:09 +01:00
ines
004c4c9566
Update installation docs
...
Include conda and virtualenv info for pip, add instructions for
downloading models manually and add details and fab commands to
"Compile from source" section.
2017-03-07 18:52:22 +01:00
Ines Montani
57d70ea3e1
Update README.rst
2017-03-07 17:59:30 +01:00
Matthew Honnibal
3a5f726208
Merge pull request #874 from badbye/patch-1
...
**Documentation**: Edit example code
2017-03-07 15:31:28 +01:00
yalei
27c0e6226b
Edit example code
...
The original code forgot to import the `random` module and the `EntityRecognizer` class.
2017-03-07 18:07:40 +08:00
Ines Montani
f710fc3f2d
Merge pull request #873 from banglakit/bn-tests
...
Add tests for Bengali
2017-03-05 12:13:49 +01:00
Aniruddha Adhikary
696215a3fb
add tests for Bengali
2017-03-05 11:25:12 +06:00
Ines Montani
3c1411226d
Update CONTRIBUTORS.md
2017-03-04 12:31:51 +01:00
Ines Montani
bb959692f5
Merge pull request #872 from banglakit/bn-improvements
...
[Bengali] basic tag map, morph, lemma rules and exceptions
2017-03-04 11:36:24 +01:00
Aniruddha Adhikary
8f3bfe9bfc
[Bengali] basic tag map, morph, lemma rules and exceptions
2017-03-04 12:36:59 +06:00
Ines Montani
33efe77392
Update badges and add info about conda (see #778)
2017-03-03 19:15:56 +01:00
ines
8dff040032
Revert "Add regression test for #859"
...
This reverts commit c4f16c66d1.
2017-03-01 21:56:20 +01:00
ines
c4f16c66d1
Add regression test for #859
2017-03-01 16:07:27 +01:00
ines
d25f17f139
Add Bengali to list of languages (see #865)
2017-03-01 15:59:21 +01:00
Matthew Honnibal
0f74002a26
Merge pull request #865 from banglakit/bn
...
Add basic Bengali language support
2017-03-01 15:25:58 +01:00
Aniruddha Adhikary
d91be7aed4
add punctuations for Bengali
2017-02-28 21:07:14 +06:00
Aniruddha Adhikary
5a4fc09576
add basic Bengali support
2017-02-28 07:48:37 +06:00
Matthew Honnibal
cc9b2b74e3
Merge branch 'french-tokenizer-exceptions'
2017-02-27 11:44:39 +01:00
Matthew Honnibal
bd4375a2e6
Remove comment
2017-02-27 11:44:26 +01:00
Matthew Honnibal
e7e22d8be6
Move import within get_exceptions() function, to speed import
2017-02-27 11:34:48 +01:00
Matthew Honnibal
34bcc8706d
Merge branch 'french-tokenizer-exceptions'
2017-02-27 11:21:21 +01:00
Matthew Honnibal
0aaa546435
Fix test after updating the French tokenizer stuff
2017-02-27 11:20:47 +01:00
Matthew Honnibal
26446aa728
Avoid loading all French exceptions on import
...
Move exceptions loading behind a get_tokenizer_exceptions() function
for French, instead of loading into the top-level namespace. This
cuts import times from 0.6s to 0.2s, at the expense of making the
French data a little different from the others (there's no top-level
TOKENIZER_EXCEPTIONS variable). The current solution feels somewhat
unsatisfying.
2017-02-25 11:55:00 +01:00
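The deferred loading described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the cache variable, the `_build_exceptions()` helper, the sample words and the use of a plain "ORTH" string key are all assumptions made for the example. Only the `get_tokenizer_exceptions()` name comes from the commit message.

```python
# Hypothetical module-level cache; nothing is built at import time.
_EXCEPTIONS_CACHE = None


def _build_exceptions():
    """Stand-in for the expensive step (hypothetical helper).

    The real data set is far larger; three sample words are used here
    only to keep the sketch self-contained.
    """
    base_words = ["aujourd'hui", "prud'homme", "quelqu'un"]
    exceptions = {}
    for word in base_words:
        # Register the lowercase form and a capitalized variant.
        for form in (word, word.capitalize()):
            exceptions[form] = [{"ORTH": form}]
    return exceptions


def get_tokenizer_exceptions():
    """Build the exceptions on first call, then reuse the cached dict."""
    global _EXCEPTIONS_CACHE
    if _EXCEPTIONS_CACHE is None:
        _EXCEPTIONS_CACHE = _build_exceptions()
    return _EXCEPTIONS_CACHE
```

Importing a module written this way costs almost nothing, because the dictionary is only built the first time `get_tokenizer_exceptions()` is called. The trade-off is the one the commit message notes: callers can no longer read a top-level `TOKENIZER_EXCEPTIONS` variable.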
Ines Montani
f81b985f7f
Update CONTRIBUTING.md
2017-02-24 20:07:05 +01:00
ines
2b07ab7db4
Add feature scheme to API docs (see #857, #739)
2017-02-24 18:26:32 +01:00
ines
376c5813a7
Remove print statements from test
2017-02-24 18:26:32 +01:00
ines
8ddad178f6
Add book and tutorial
2017-02-24 18:26:32 +01:00
ines
00728a23f0
Fix path in gitignore
2017-02-24 18:26:32 +01:00
ines
7c1260e98c
Add regression test
2017-02-24 18:22:49 +01:00
ines
0e2e331b58
Convert exceptions to Python list
2017-02-24 18:22:40 +01:00
ines
51eb190ef4
Remove print statements from test
2017-02-24 17:41:12 +01:00
ines
056b2466e3
Add book and tutorial
2017-02-24 17:39:27 +01:00
ines
52aebfa06f
Fix path in gitignore
2017-02-24 17:39:02 +01:00
Matthew Honnibal
db5ada3995
Merge branch 'master' of https://github.com/explosion/spaCy
2017-02-24 14:28:12 +01:00
Matthew Honnibal
8f94897d07
Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766
2017-02-24 14:27:02 +01:00
Ines Montani
29adbef095
Update CONTRIBUTING.md
2017-02-18 14:34:03 +01:00
ines
67991b6e5f
Add more test cases to #775 regression test to cover #847
2017-02-18 14:10:44 +01:00
ines
30ce2a6793
Exclude "shed" and "Shed" from tokenizer exceptions (see #847)
2017-02-18 14:10:44 +01:00
Ines Montani
9c04d97e22
Update CONTRIBUTING.md
2017-02-18 12:47:41 +01:00
Ines Montani
a3a3796ecd
Update CONTRIBUTING.md
2017-02-18 12:42:35 +01:00
Ines Montani
936de72ffc
Update CONTRIBUTING.md
2017-02-18 12:42:11 +01:00
Matthew Honnibal
f028f8ad28
Remove unfinished examples
2017-02-18 11:04:41 +01:00
Matthew Honnibal
c031c677cc
Remove unused model_dir option
...
As noted in #845, the `model_dir` argument was not being used. I've removed it for now, although it would be good to restore this option and get it working.
2017-02-18 10:38:22 +01:00
Ines Montani
724e51ed47
Update CONTRIBUTING.md
2017-02-17 14:07:48 +01:00
Ines Montani
de997c1a33
Merge pull request #842 from magnusburton/master
...
Added regular verb rules for Swedish
2017-02-17 11:18:20 +01:00
Magnus Burton
41fcfd06b8
Added regular verb rules for Swedish
2017-02-17 10:04:04 +01:00
ines
aa92d4e9b5
Fix unicode regex for Python 2 (see #834)
2017-02-16 23:49:54 +01:00
ines
44de3c7642
Reformat test and use text_file fixture
2017-02-16 23:49:19 +01:00
Ines Montani
49a102aff3
Merge pull request #841 from jondoughty/patch-1
...
Updated Token class documentation
2017-02-16 23:47:51 +01:00