2698 Commits

Author SHA1 Message Date
ines
956dc36785 Move functions to deprecated 2017-04-15 12:12:31 +02:00
ines
c05ec4b89a Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines
26445ee304 Add compat module for Python2/3 and platform compatibility 2017-04-15 12:07:02 +02:00
ines
d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines
561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
Matthew Honnibal
d13f0a7017 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-04-14 23:54:57 +02:00
Matthew Honnibal
354458484c WIP on add_label bug during NER training
Currently when a new label is introduced to NER during training,
it causes the labels to be read in in an unexpected order. This
invalidates the model.
2017-04-14 23:52:17 +02:00
Matthew Honnibal
33ba5066eb Refactor Language.end_training, making new save_to_directory method 2017-04-14 23:51:24 +02:00
ines
84341c2975 Only compile list of models if data_path exists 2017-04-14 16:48:02 +02:00
Gyorgy Orosz
dd3244c08a Made json dump to produce unicode strings in py2 2017-04-13 23:30:47 +02:00
Gyorgy Orosz
a9469c8173 Fixed typo 2017-04-13 15:24:14 +02:00
ines
41037f0f07 Remove unused imports 2017-04-13 13:52:11 +02:00
ines
1b92c8d5d5 Use unicode paths on Windows/Python 2 and catch other errors (resolves #970)
try/except here is quite dirty, but it'll at least make sure users see
an error message that explains what's going on
2017-04-10 17:49:51 +02:00
Matthew Honnibal
49e2de900e Add costs property to StepwiseState, to show which moves are gold. 2017-04-10 11:37:04 +02:00
Matthew Honnibal
e26577b202 Increment version 2017-04-07 18:45:06 +02:00
Matthew Honnibal
40bf7ecf27 Increment version 2017-04-07 18:44:20 +02:00
Matthew Honnibal
1dca7eeb03 Add unicode declaration on new regression test 2017-04-07 18:09:23 +02:00
ines
887827fc6a Merge branch 'develop' 2017-04-07 17:36:23 +02:00
ines
444dd511c5 Fix xpassing URL test case 2017-04-07 17:36:05 +02:00
ines
bf0f15e762 Add / to tokenizer infixes (resolves #891) 2017-04-07 17:30:44 +02:00
ines
00b9011a49 Fix whitespace 2017-04-07 17:29:59 +02:00
ines
f9869e4dc5 Merge branch 'master' into develop 2017-04-07 17:23:40 +02:00
Matthew Honnibal
4a6204dbad Merge remote-tracking branch 'origin/develop' 2017-04-07 17:20:09 +02:00
Matthew Honnibal
0513c43bf0 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-07 17:07:10 +02:00
Matthew Honnibal
cc36c308f4 Fix noun_chunk rules around coordination
Closes #693.
2017-04-07 17:06:40 +02:00
Matthew Honnibal
ab846256cf Merge pull request #966 from recognai/master
Prepare Spanish language for training models, including configuration, rich-UD tag map and tests
2017-04-07 16:12:29 +02:00
Matthew Honnibal
83dca920d4 Rename test #913 -> #957, comment
Make test for #957 reference correct bug. Add comment.

Previous commit closes #957.
2017-04-07 15:54:25 +02:00
Matthew Honnibal
be204ed714 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-07 15:50:14 +02:00
Matthew Honnibal
e7b1ee9efd Switch to regex module for URL identification
The URL detection regex was failing on input such as 0.1.2.3, as this
input triggered excessive back-tracking in the builtin re module.
The solution was to switch to the regex module, which behaves better.

Closes #913.
2017-04-07 15:47:36 +02:00
Matthew Honnibal
5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
ines
7ea1673072 Fix whitespace 2017-04-07 13:28:48 +02:00
ines
255650dbc2 Add connlu2json converter from explosion/spacy-dev-resources/#11 2017-04-07 13:05:12 +02:00
ines
789ce8a45e Add convert command 2017-04-07 13:04:17 +02:00
ines
9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
ines
47ddce6eb7 Remove unused variable 2017-04-07 13:01:48 +02:00
ines
dcf8ab0c47 Merge branch 'develop' 2017-04-07 12:00:09 +02:00
ines
75f9b4c6e2 Fix whitespace 2017-04-07 10:22:18 +02:00
oeg
c693d40791 feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basic tests 2017-04-06 18:48:45 +02:00
oeg
010293fb2f fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli 2017-04-06 17:33:15 +02:00
ines
808cd6cf7f Add missing tags to verbs (resolves #948) 2017-04-03 18:12:52 +02:00
ines
ad8bf1829f Import and combine Portuguese tokenizer exceptions (see #943) 2017-04-01 10:37:42 +02:00
Ines Montani
f8b2d9c3b7 Merge pull request #943 from mamoit/master
Portuguese improvements
2017-04-01 10:32:00 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
ines
42382d5692 Fix download commands in error messages (see #946) 2017-04-01 10:19:32 +02:00
ines
d4a59c254b Remove whitespace 2017-04-01 10:19:01 +02:00
Matthew Honnibal
51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Miguel Almeida
4fde64c4ea Portuguese contractions and some abbreviations 2017-03-31 15:52:55 +01:00
Miguel Almeida
465b240bcb Review Portuguese stop words
Mainly to review typos and add missing masculines/feminines
2017-03-31 13:00:47 +01:00
Matthew Honnibal
fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00