8144 Commits

Author SHA1 Message Date
Matthew Honnibal
e7b1ee9efd Switch to regex module for URL identification
The URL detection regex was failing on input such as 0.1.2.3, as this
input triggered excessive back-tracking in the builtin re module.
The solution was to switch to the regex module, which behaves better.

Closes #913.
2017-04-07 15:47:36 +02:00
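
The back-tracking problem described in the commit above can be sketched with a small timing script. This is an illustration under assumptions, not the code from the commit: the pattern `(a+)+$` is a textbook example of nested quantifiers, not spaCy's URL regex, and `time_failed_match` is a hypothetical helper. It uses only the builtin `re` module and shows how the time for a failed match grows as the input gets longer.

```python
import re
import time

# Hypothetical pattern, NOT spaCy's URL regex: the nested quantifiers make the
# backtracking engine in `re` try many ways of splitting the run of "a"s
# before it can report that the match fails.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match, seconds) for n copies of 'a' followed by a single 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    match = PATTERN.match(text)  # the trailing 'b' means this cannot match
    return match, time.perf_counter() - start


for n in (10, 14, 18):
    match, seconds = time_failed_match(n)
    print(n, match, f"{seconds:.4f}s")
```

Each extra character roughly doubles the work for this pattern, which is why a short input can appear to hang. Keep `n` small when experimenting.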
Matthew Honnibal
5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
Matthew Honnibal
a001365c42 Require regex library 2017-04-07 15:43:34 +02:00
Matthew Honnibal
a5538d93d0 Merge pull request #955 from kumaranvpl/fix_keras_parikh_entailment_bugs
Fix keras_parikh_entailment example bugs
2017-04-07 14:59:57 +02:00
Ines Montani
2a60597089 Update CONTRIBUTORS.md 2017-04-07 13:34:05 +02:00
ines
7ea1673072 Fix whitespace 2017-04-07 13:28:48 +02:00
ines
2f38c1d77f Add documentation for new convert and model commands 2017-04-07 13:27:55 +02:00
ines
255650dbc2 Add connlu2json converter from explosion/spacy-dev-resources/#11 2017-04-07 13:05:12 +02:00
ines
789ce8a45e Add convert command 2017-04-07 13:04:17 +02:00
ines
9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
ines
47ddce6eb7 Remove unused variable 2017-04-07 13:01:48 +02:00
ines
7dd134718a Merge branch 'master' into develop 2017-04-07 12:00:26 +02:00
ines
dcf8ab0c47 Merge branch 'develop' 2017-04-07 12:00:09 +02:00
oeg
b10bc1a177 Adds contributor agreement dvsrepo 2017-04-07 11:58:28 +02:00
ines
f33c4cbae1 Add --no-cache-dir error to troubleshooting docs (see #958) 2017-04-07 10:22:18 +02:00
ines
d6bbc3ffcd Fix formatting 2017-04-07 10:22:18 +02:00
ines
75f9b4c6e2 Fix whitespace 2017-04-07 10:22:18 +02:00
oeg
c693d40791 feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basic tests 2017-04-06 18:48:45 +02:00
Matthew Honnibal
5e621b9862 Merge pull request #960 from recognai/master
Fixes typo in method calling Pseudoprojectivity method in create_pipeline method of BaseDefaults class
2017-04-06 17:57:27 +02:00
oeg
010293fb2f fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli 2017-04-06 17:33:15 +02:00
Kumaran Rajendhiran
3f55d6afae Update README 2017-04-05 16:59:52 +05:30
Kumaran Rajendhiran
47d7137c83 Set max_length to 100 for demo and evaluate 2017-04-05 16:48:35 +05:30
Kumaran Rajendhiran
10e8dcdfdb Remove not needed parameters from function 2017-04-05 16:20:47 +05:30
ines
808cd6cf7f Add missing tags to verbs (resolves #948) 2017-04-03 18:12:52 +02:00
ines
2c36a61ec5 Add spacyr to libraries 2017-04-03 18:12:38 +02:00
Ines Montani
2de2195be8 Update CONTRIBUTORS.md 2017-04-01 10:39:42 +02:00
ines
ad8bf1829f Import and combine Portuguese tokenizer exceptions (see #943) 2017-04-01 10:37:42 +02:00
Ines Montani
f8b2d9c3b7 Merge pull request #943 from mamoit/master
Portuguese improvements
2017-04-01 10:32:00 +02:00
ines
3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines
e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
ines
42382d5692 Fix download commands in error messages (see #946) 2017-04-01 10:19:32 +02:00
ines
d4a59c254b Remove whitespace 2017-04-01 10:19:01 +02:00
Matthew Honnibal
51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Miguel Almeida
4fde64c4ea Portuguese contractions and some abbreviations 2017-03-31 15:52:55 +01:00
Miguel Almeida
465b240bcb Review Portuguese stop words
Mainly to review typos and add missing masculines/feminines
2017-03-31 13:00:47 +01:00
Matthew Honnibal
fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00
Matthew Honnibal
9720103428 Improve attribute handling in doc.merge(). Still unsatisfying 2017-03-31 13:59:58 +02:00
Matthew Honnibal
cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal
1bb7b4ca71 Add comment 2017-03-31 13:59:19 +02:00
Matthew Honnibal
725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Matthew Honnibal
e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Miguel Almeida
c1d020b0a6 Remove "ista" from Portuguese stop words 2017-03-31 12:26:13 +01:00
Miguel Almeida
17a1e7a119 Add Portuguese numbers and ordinals 2017-03-31 12:21:01 +01:00
Matthew Honnibal
47a3ef06a6 Unhack deprojectivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
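
The design change described in the commit above, running deprojectivization as its own step after the parser rather than inside the transition system, can be sketched roughly as follows. Everything here is a toy stand-in and an assumption: `parse`, `deprojectivize`, `run_pipeline` and the dict-based document are hypothetical names for illustration, not spaCy's actual classes or functions.

```python
from typing import Callable, Dict, List

Doc = Dict[str, list]  # toy stand-in for a parsed document


def parse(doc: Doc) -> Doc:
    # Hypothetical parser step: gives every token a placeholder label.
    doc["deps"] = ["dep" for _ in doc["words"]]
    return doc


def deprojectivize(doc: Doc) -> Doc:
    # Hypothetical post-processing step. Here it only records that it ran;
    # the real step would rewrite the parse.
    doc["steps"] = doc.get("steps", []) + ["deprojectivize"]
    return doc


def run_pipeline(doc: Doc, steps: List[Callable[[Doc], Doc]]) -> Doc:
    # Each step receives the output of the previous one, so the
    # post-processing step does not depend on the language or the parser.
    for step in steps:
        doc = step(doc)
    return doc


pipeline = [parse, deprojectivize]
result = run_pipeline({"words": ["Das", "ist", "gut"]}, pipeline)
print(result["steps"])
```

Because the step sits in the list of callables rather than inside the parser, any language's pipeline can include it.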
Ines Montani
8eafe80450 Update CONTRIBUTORS.md 2017-03-31 09:12:31 +02:00
Ines Montani
045a8e994d Merge pull request #942 from jreeter/master (resolves #934)
Issue #934 symlink should not convert paths as_posix under Windows.
2017-03-31 09:04:55 +02:00
Joshua Reeter
564daf6dec Issue #934 symlink should not convert paths as_posix under Windows. 2017-03-30 23:47:45 -05:00
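
For context on the symlink fix above, this short sketch shows what `as_posix()` does to a Windows path. The log does not show the changed code, so this only illustrates the standard-library behaviour. The path is made up, and `PureWindowsPath` is used so the snippet runs on any operating system.

```python
from pathlib import PureWindowsPath

# Hypothetical path, chosen only for illustration.
target = PureWindowsPath(r"C:\Users\me\spacy\data\en")

native = str(target)       # native Windows form, with backslashes
posix = target.as_posix()  # same path rewritten with forward slashes

print(native)
print(posix)
```

The first line printed keeps the backslashes and the second uses forward slashes. The commit message indicates that the native form is the one to use when creating symlinks on Windows.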
Matthew Honnibal
294718244f Set tb=native in pytest, to try to fix travis flakiness 2017-03-30 05:21:13 -05:00
Ines Montani
de44c22ded Merge pull request #941 from kinow/fix-stop-words-typos
Fix typos in Portuguese stop words
2017-03-30 11:41:31 +02:00
Bruno P. Kinoshita
c2d48974bc Fix typos in Portuguese stop words 2017-03-30 21:59:18 +13:00