Commit Graph

4939 Commits

Author SHA1 Message Date
Gregory Howard
ed5f094451 Adding insensitive lemmatisation test 2017-04-25 18:07:02 +02:00
ghoward
26e31afc18 renaming tests 2017-04-25 17:46:01 +02:00
ghoward
c085c2d391 Adding some unit tests 2017-04-25 17:44:16 +02:00
ghoward
55c6910f90 Look_up table for languages in spacy.
Need to find another name for lemmatizerlookup. I was not inspired.
Trying to use new files in fr language.
2017-04-24 16:39:00 +02:00
Matthew Honnibal
1b12f342e4 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-20 17:03:11 +02:00
Matthew Honnibal
4eef200bab Persist the actions within spacy.parser.cfg 2017-04-20 17:02:44 +02:00
ines
25c70b4cc5 Move fix_text to spacy.compat (see #1002) 2017-04-20 15:47:17 +02:00
Ines Montani
60b5243bee Merge pull request #1002 from oroszgy/model_cli_fix
Fixes for the `model` CLI
2017-04-20 15:41:03 +02:00
Ines Montani
417f430d23 Relax version constraint 2017-04-20 15:39:24 +02:00
Ines Montani
40a8f22ca7 Relax version constraint 2017-04-20 15:38:52 +02:00
Gyorgy Orosz
4a06a2572c Using ftfy for handling broken encoded strings. 2017-04-20 13:34:51 +02:00
Ines Montani
1f785d25c6 Update CONTRIBUTORS.md 2017-04-20 12:28:05 +02:00
Ines Montani
3800b29046 Merge pull request #1001 from recognai/master
Add SPACE to es tag map
2017-04-20 12:16:34 +02:00
Ines Montani
df64e8dbb3 Merge pull request #996 from beneyal/master (closes #995)
Fix for issue 995
2017-04-20 12:16:02 +02:00
oeg
f0bcd0babb fix(model): Add SPACE to es tag_map. Fixing error in morphology.pyx when SP tag is missing 2017-04-20 11:36:24 +02:00
Ben Eyal
e90e8a3f10 Enable test 2017-04-20 02:25:24 +03:00
Ben Eyal
33af52599e Redefine alphabetic characters
For caseless languages (Hebrew, Bengali) all characters are both lowercase and uppercase.
2017-04-20 02:25:02 +03:00
Ben Eyal
d8098a8be2 Use regex instead of re 2017-04-20 02:22:52 +03:00
oeg
daaa42dd25 Merge remote-tracking branch 'upstream/master' 2017-04-19 23:30:36 +02:00
oeg
936a297241 fix(model): Fix tag map for fixing issues with tag SPACE 2017-04-19 23:30:21 +02:00
ines
2bd89e7ade Tidy up Hebrew tests and test for punctuation (see #995) 2017-04-19 19:28:03 +02:00
Ines Montani
275fc9f78a Update CONTRIBUTING.md 2017-04-19 12:09:10 +02:00
Matthew Honnibal
b763e9b66d Add note about variable naming 2017-04-19 12:00:12 +02:00
ines
48da244058 Use spacy.compat.json_dumps for Python 2/3 compatibility (resolves #991) 2017-04-19 11:50:36 +02:00
Matthew Honnibal
0605b95f2e Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-18 13:48:00 +02:00
Matthew Honnibal
2f84626417 Fix train_new_entity_type example 2017-04-18 13:47:36 +02:00
ines
ddd5194088 Update Language docs and docstrings 2017-04-17 01:52:13 +02:00
ines
f62b740961 Use compat.json_dumps 2017-04-17 01:46:14 +02:00
ines
2ab394d655 Fix whitespace 2017-04-17 01:45:00 +02:00
ines
7f776258f0 Add link to API docs 2017-04-17 01:41:46 +02:00
ines
aad80a291f Add save_to_directory method to API docs 2017-04-17 01:40:34 +02:00
ines
8e83f8e2fa Update docstrings 2017-04-17 01:40:26 +02:00
ines
e2299dc389 Ensure path in save_to_directory 2017-04-17 01:40:14 +02:00
ines
01067e99d4 Merge branch 'develop' 2017-04-17 01:30:10 +02:00
ines
82f5f1f98f Replace str with compat.unicode_ 2017-04-17 01:29:54 +02:00
ines
ad74245be9 Merge branch 'master' into develop 2017-04-17 01:08:11 +02:00
ines
c6c3162c50 Fix lightning tour example (closes #889) 2017-04-17 00:00:30 +02:00
Ines Montani
e7ae3b7cc2 Fix formatting and typo (closes #967) 2017-04-16 23:56:12 +02:00
Ines Montani
734b0a4e4a Update train_new_entity_type.py 2017-04-16 23:42:16 +02:00
ines
cffaf52152 Update README.rst 2017-04-16 23:34:14 +02:00
ines
db7e046faa Update version 2017-04-16 23:23:59 +02:00
ines
02e7512b91 Increment version 2017-04-16 22:39:58 +02:00
ines
16a8521efa Increment version 2017-04-16 22:38:38 +02:00
ines
de5062711b Update adding languages workflow to reflect changes in __init__.py 2017-04-16 22:26:46 +02:00
Matthew Honnibal
4efd6fb9d6 Fix training 2017-04-16 15:28:27 -05:00
Matthew Honnibal
17c9fffb9e Fix naked except 2017-04-16 15:28:16 -05:00
ines
5610fdcc06 Get language name first if no model path exists
Makes sure spaCy fails early if no tokenizer exists, and allows
printing better error message.
2017-04-16 22:16:47 +02:00
ines
ad168ba88c Set model name to empty string if path override exists
Required for parse_package_meta, which composes path of data_path and
model_name (needs to be fixed in the future)
2017-04-16 22:15:51 +02:00
ines
97647c46cd Add docstring and todo note 2017-04-16 22:14:45 +02:00
ines
5c5f8c0a72 Check if full string is found in lang classes first
This allows users to set arbitrary strings. (Otherwise, custom lang
class "my_custom_class" would always load Burmese "my" tokenizer if one
was available.)
2017-04-16 22:14:38 +02:00