Commit Graph

8852 Commits

Author SHA1 Message Date
Matthew Honnibal
be204ed714 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-07 15:50:14 +02:00
Matthew Honnibal
e7b1ee9efd Switch to regex module for URL identification
The URL detection regex was failing on input such as 0.1.2.3, as this
input triggered excessive back-tracking in the builtin re module.
The solution was to switch to the regex module, which behaves better.

Closes #913.
2017-04-07 15:47:36 +02:00
Matthew Honnibal
5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
Matthew Honnibal
a001365c42 Require regex library 2017-04-07 15:43:34 +02:00
Matthew Honnibal
a5538d93d0 Merge pull request #955 from kumaranvpl/fix_keras_parikh_entailment_bugs
Fix keras_parikh_entailment example bugs
2017-04-07 14:59:57 +02:00
Ines Montani
2a60597089 Update CONTRIBUTORS.md 2017-04-07 13:34:05 +02:00
ines
7ea1673072 Fix whitespace 2017-04-07 13:28:48 +02:00
ines
2f38c1d77f Add documentation for new convert and model commands 2017-04-07 13:27:55 +02:00
ines
255650dbc2 Add connlu2json converter from explosion/spacy-dev-resources/#11 2017-04-07 13:05:12 +02:00
ines
789ce8a45e Add convert command 2017-04-07 13:04:17 +02:00
ines
9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
ines
47ddce6eb7 Remove unused variable 2017-04-07 13:01:48 +02:00
ines
7dd134718a Merge branch 'master' into develop 2017-04-07 12:00:26 +02:00
ines
dcf8ab0c47 Merge branch 'develop' 2017-04-07 12:00:09 +02:00
oeg
b10bc1a177 Adds contributor agreement dvsrepo 2017-04-07 11:58:28 +02:00
ines
f33c4cbae1 Add --no-cache-dir error to troubleshooting docs (see #958) 2017-04-07 10:22:18 +02:00
ines
d6bbc3ffcd Fix formatting 2017-04-07 10:22:18 +02:00
ines
75f9b4c6e2 Fix whitespace 2017-04-07 10:22:18 +02:00
oeg
c693d40791 feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basich tests 2017-04-06 18:48:45 +02:00
Matthew Honnibal
5e621b9862 Merge pull request #960 from recognai/master
Fixes typo in method calling Pseudoprojectivity method in create_pipeline method of BaseDefaults class
2017-04-06 17:57:27 +02:00
oeg
010293fb2f fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli 2017-04-06 17:33:15 +02:00
Kumaran Rajendhiran
3f55d6afae Update README 2017-04-05 16:59:52 +05:30
Kumaran Rajendhiran
47d7137c83 Set max_length to 100 for demo and evaluate 2017-04-05 16:48:35 +05:30
Kumaran Rajendhiran
10e8dcdfdb Remove not needed parameters from function 2017-04-05 16:20:47 +05:30
ines
808cd6cf7f Add missing tags to verbs (resolves #948) 2017-04-03 18:12:52 +02:00
ines
2c36a61ec5 Add spacyr to libraries 2017-04-03 18:12:38 +02:00
Ines Montani
2de2195be8 Update CONTRIBUTORS.md 2017-04-01 10:39:42 +02:00
ines
ad8bf1829f Import and combine Portuguese tokenizer exceptions (see #943) 2017-04-01 10:37:42 +02:00
Ines Montani
f8b2d9c3b7 Merge pull request #943 from mamoit/master
Portuguese improvements
2017-04-01 10:32:00 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
ines
42382d5692 Fix download commands in error messages (see #946) 2017-04-01 10:19:32 +02:00
ines
d4a59c254b Remove whitespace 2017-04-01 10:19:01 +02:00
Matthew Honnibal
51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Miguel Almeida
4fde64c4ea Portuguese contractions and some abreviations 2017-03-31 15:52:55 +01:00
Miguel Almeida
465b240bcb Review Portuguese stop words
Mainly to review typos and add missing masculines/feminines
2017-03-31 13:00:47 +01:00
Matthew Honnibal
fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00
Matthew Honnibal
9720103428 Improve attribute handlign in doc.merge(). Still unsatisfying 2017-03-31 13:59:58 +02:00
Matthew Honnibal
cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal
1bb7b4ca71 Add comment 2017-03-31 13:59:19 +02:00
Matthew Honnibal
725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Matthew Honnibal
e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Miguel Almeida
c1d020b0a6 Remove "ista" from portuguese stop words 2017-03-31 12:26:13 +01:00
Miguel Almeida
17a1e7a119 Add Portuguese numbers and ordinals 2017-03-31 12:21:01 +01:00
Matthew Honnibal
47a3ef06a6 Unhack deprojetivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
Ines Montani
8eafe80450 Update CONTRIBUTORS.md 2017-03-31 09:12:31 +02:00
Ines Montani
045a8e994d Merge pull request #942 from jreeter/master (resolves #934)
Issue #934 symlink should not convert paths as_posix under windows.
2017-03-31 09:04:55 +02:00
Joshua Reeter
564daf6dec Issue #934 symlink should not convert paths as_posix under windows. 2017-03-30 23:47:45 -05:00
Matthew Honnibal
294718244f Set tb=native in pytest, to try to fix travis flakiness 2017-03-30 05:21:13 -05:00
Ines Montani
de44c22ded Merge pull request #941 from kinow/fix-stop-words-typos
Fix typos in Portuguese stop words
2017-03-30 11:41:31 +02:00