Gregory Howard
ed5f094451
Adding insensitive lemmatisation test
2017-04-25 18:07:02 +02:00
ghoward
26e31afc18
Renaming tests
2017-04-25 17:46:01 +02:00
ghoward
c085c2d391
Adding some unit tests
2017-04-25 17:44:16 +02:00
ghoward
55c6910f90
Look_up table for languages in spacy.
Need to find another name for lemmatizerlookup. I was not inspired.
Trying to use new files in fr language.
2017-04-24 16:39:00 +02:00
Matthew Honnibal
1b12f342e4
Merge branch 'master' of https://github.com/explosion/spaCy
2017-04-20 17:03:11 +02:00
Matthew Honnibal
4eef200bab
Persist the actions within spacy.parser.cfg
2017-04-20 17:02:44 +02:00
ines
25c70b4cc5
Move fix_text to spacy.compat (see #1002)
2017-04-20 15:47:17 +02:00
Ines Montani
60b5243bee
Merge pull request #1002 from oroszgy/model_cli_fix
Fixes for the `model` CLI
2017-04-20 15:41:03 +02:00
Ines Montani
417f430d23
Relax version constraint
2017-04-20 15:39:24 +02:00
Ines Montani
40a8f22ca7
Relax version constraint
2017-04-20 15:38:52 +02:00
Gyorgy Orosz
4a06a2572c
Using ftfy for handling broken encoded strings.
2017-04-20 13:34:51 +02:00
Ines Montani
1f785d25c6
Update CONTRIBUTORS.md
2017-04-20 12:28:05 +02:00
Ines Montani
3800b29046
Merge pull request #1001 from recognai/master
Add SPACE to es tag map
2017-04-20 12:16:34 +02:00
Ines Montani
df64e8dbb3
Merge pull request #996 from beneyal/master (closes #995)
Fix for issue 995
2017-04-20 12:16:02 +02:00
oeg
f0bcd0babb
fix(model): Add SPACE to es tag_map. Fixing error in morphology.pyx when SP tag is missing
2017-04-20 11:36:24 +02:00
Ben Eyal
e90e8a3f10
Enable test
2017-04-20 02:25:24 +03:00
Ben Eyal
33af52599e
Redefine alphabetic characters
For caseless languages (Hebrew, Bengali) all characters are both lowercase and uppercase.
2017-04-20 02:25:02 +03:00
Ben Eyal
d8098a8be2
Use regex
instead of re
2017-04-20 02:22:52 +03:00
oeg
daaa42dd25
Merge remote-tracking branch 'upstream/master'
2017-04-19 23:30:36 +02:00
oeg
936a297241
fix(model): Fix tag map for fixing issues with tag SPACE
2017-04-19 23:30:21 +02:00
ines
2bd89e7ade
Tidy up Hebrew tests and test for punctuation (see #995)
2017-04-19 19:28:03 +02:00
Ines Montani
275fc9f78a
Update CONTRIBUTING.md
2017-04-19 12:09:10 +02:00
Matthew Honnibal
b763e9b66d
Add note about variable naming
2017-04-19 12:00:12 +02:00
ines
48da244058
Use spacy.compat.json_dumps for Python 2/3 compatibility (resolves #991)
2017-04-19 11:50:36 +02:00
Matthew Honnibal
0605b95f2e
Merge branch 'master' of https://github.com/explosion/spaCy
2017-04-18 13:48:00 +02:00
Matthew Honnibal
2f84626417
Fix train_new_entity_type example
2017-04-18 13:47:36 +02:00
ines
ddd5194088
Update Language docs and docstrings
2017-04-17 01:52:13 +02:00
ines
f62b740961
Use compat.json_dumps
2017-04-17 01:46:14 +02:00
ines
2ab394d655
Fix whitespace
2017-04-17 01:45:00 +02:00
ines
7f776258f0
Add link to API docs
2017-04-17 01:41:46 +02:00
ines
aad80a291f
Add save_to_directory method to API docs
2017-04-17 01:40:34 +02:00
ines
8e83f8e2fa
Update docstrings
2017-04-17 01:40:26 +02:00
ines
e2299dc389
Ensure path in save_to_directory
2017-04-17 01:40:14 +02:00
ines
01067e99d4
Merge branch 'develop'
2017-04-17 01:30:10 +02:00
ines
82f5f1f98f
Replace str with compat.unicode_
2017-04-17 01:29:54 +02:00
ines
ad74245be9
Merge branch 'master' into develop
2017-04-17 01:08:11 +02:00
ines
c6c3162c50
Fix lightning tour example (closes #889)
2017-04-17 00:00:30 +02:00
Ines Montani
e7ae3b7cc2
Fix formatting and typo (closes #967)
2017-04-16 23:56:12 +02:00
Ines Montani
734b0a4e4a
Update train_new_entity_type.py
2017-04-16 23:42:16 +02:00
ines
cffaf52152
Update README.rst
2017-04-16 23:34:14 +02:00
ines
db7e046faa
Update version
2017-04-16 23:23:59 +02:00
ines
02e7512b91
Increment version
2017-04-16 22:39:58 +02:00
ines
16a8521efa
Increment version
2017-04-16 22:38:38 +02:00
ines
de5062711b
Update adding languages workflow to reflect changes in __init__.py
2017-04-16 22:26:46 +02:00
Matthew Honnibal
4efd6fb9d6
Fix training
2017-04-16 15:28:27 -05:00
Matthew Honnibal
17c9fffb9e
Fix naked except
2017-04-16 15:28:16 -05:00
ines
5610fdcc06
Get language name first if no model path exists
Makes sure spaCy fails early if no tokenizer exists, and allows
printing better error message.
2017-04-16 22:16:47 +02:00
ines
ad168ba88c
Set model name to empty string if path override exists
Required for parse_package_meta, which composes path of data_path and
model_name (needs to be fixed in the future)
2017-04-16 22:15:51 +02:00
ines
97647c46cd
Add docstring and todo note
2017-04-16 22:14:45 +02:00
ines
5c5f8c0a72
Check if full string is found in lang classes first
This allows users to set arbitrary strings. (Otherwise, custom lang
class "my_custom_class" would always load Burmese "my" tokenizer if one
was available.)
2017-04-16 22:14:38 +02:00