Matthew Honnibal
581d318971
Fix conftest
2018-05-15 00:54:45 +02:00
Tahar Zanouda
00417794d3
Add Arabic language ( #2314 )
...
* added support for Arabic lang
* added Arabic language support
* updated conftest
2018-05-15 00:27:19 +02:00
Jani Monoses
0e08e49e87
Lemmatizer ro ( #2319 )
...
* Add Romanian lemmatizer lookup table.
Adapted from http://www.lexiconista.com/datasets/lemmatization/
by replacing cedillas with commas (ș and ț).
The original dataset is licensed under the Open Database License.
* Fix one blatant issue in the Romanian lemmatizer
* Romanian examples file
* Add ro_tokenizer in conftest
* Add Romanian lemmatizer test
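The cedilla-to-comma replacement described in this commit can be sketched in a few lines of Python. This is only an illustration of the character mapping, not the script used to build the lookup table; the names `CEDILLA_TO_COMMA`, `normalize_romanian`, and `normalize_lookup` are hypothetical.

```python
# Sketch only: map the legacy cedilla code points to the comma-below
# letters used in standard Romanian orthography.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015f": "\u0219",  # s with cedilla -> s with comma below (lowercase)
    "\u0163": "\u021b",  # t with cedilla -> t with comma below (lowercase)
    "\u015e": "\u0218",  # S with cedilla -> S with comma below (uppercase)
    "\u0162": "\u021a",  # T with cedilla -> T with comma below (uppercase)
})


def normalize_romanian(text):
    """Return text with cedilla s/t replaced by comma-below s/t."""
    return text.translate(CEDILLA_TO_COMMA)


def normalize_lookup(table):
    """Apply the replacement to both keys and values of a lookup dict."""
    return {
        normalize_romanian(form): normalize_romanian(lemma)
        for form, lemma in table.items()
    }
```

Normalizing both the inflected forms and the lemmas keeps the table consistent, so a lookup on correctly spelled input returns a correctly spelled lemma.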
2018-05-12 15:20:04 +02:00
vishnumenon
ae3719ece5
Fix the code for FACILITY entities ( #2324 )
...
* Fix the code for FACILITY entities
As far as I can tell, the default models all use "FAC" rather than "FACILITY"
* Added my Contributor Agreement
* Rename vishnumenon to vishnumenon.md
2018-05-12 15:19:17 +02:00
Matthew Honnibal
625ee6c464
Unhack travis.sh
2018-05-10 18:16:11 +02:00
Matthew Honnibal
299621b747
Try running sudo=true for travis
2018-05-10 18:11:11 +02:00
Matthew Honnibal
603907926f
Point thinc to libblas on Travis
2018-05-10 18:06:37 +02:00
Matthew Honnibal
1b294f4798
Tell Thinc to link against system blas on Travis
2018-05-10 18:03:44 +02:00
Matthew Honnibal
c261b5b996
Add some diagnostics to travis.yml to try to figure out why build fails
2018-05-10 17:10:44 +02:00
Matthew Honnibal
887631ca25
Disable some tests to figure out why CI fails
2018-05-10 16:42:01 +02:00
Matthew Honnibal
902a172cb7
Disable some tests to figure out why CI fails
2018-05-10 16:30:07 +02:00
Matthew Honnibal
614d45ea58
Set a more aggressive threshold on the max violation update
2018-05-10 15:38:24 +02:00
Matthew Honnibal
8e8724b55b
Default to beam_update_prob 1
2018-05-10 15:38:02 +02:00
Jani Monoses
42b34832e4
Update Romanian stopword list ( #2316 )
...
* Contributor agreement for janimo
* Update Romanian stopword list
Include the correct spellings of all the words already in the repo
that are using cedillas (ş and ţ) instead of commas (ș and ț).
Add another unrelated spelling fix.
See https://github.com/stopwords-iso/stopwords-ro/pull/1 and
https://github.com/stopwords-iso/stopwords-ro/pull/2
2018-05-10 12:16:56 +02:00
Lucas Abbade
18af53014f
Adding my contributor agreement ( #2315 )
...
* Create LRAbbade.md
* Update LRAbbade.md
2018-05-09 21:25:05 +02:00
Lucas Abbade
be7fdc59d1
Update lex_attrs.py ( #2307 )
...
* Update lex_attrs.py
Fixed spelling mistakes of some numbers (according to Brazilian Portuguese).
* Update lex_attrs.py
As requested, I've included the correct spelling for both Brazilian Portuguese and European Portuguese.
I would advise, however, that the two be separated in the future. Brazilian Portuguese differs considerably from European Portuguese: although most of the written language is unified, the way people speak in the two countries is radically different. Keeping both as one language may lead to bigger issues in the future, especially when it comes to spell checking.
2018-05-09 20:49:31 +02:00
mauryaland
5368ba028a
Update stop_words.py for French language ( #2310 )
...
* Add contraction forms of some common stopwords
All the stopwords added contain an apostrophe, either ' or ’.
* Adds contributor agreement mauryaland
* Update mauryaland.md
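Covering both apostrophe characters for each contracted stopword could be done along the following lines. This is a rough sketch, not the code in stop_words.py; the helper name `with_apostrophe_variants` and the sample words are assumptions chosen for illustration.

```python
# Sketch only: produce both the straight (') and the typographic
# (U+2019) apostrophe spelling for each contracted stopword.
STRAIGHT = "'"
CURLY = "\u2019"


def with_apostrophe_variants(contractions):
    """Return a set holding both apostrophe spellings of every word."""
    variants = set()
    for word in contractions:
        variants.add(word.replace(CURLY, STRAIGHT))
        variants.add(word.replace(STRAIGHT, CURLY))
    return variants


# Hypothetical sample input: two French contractions, one per style.
sample = with_apostrophe_variants(["l'", "qu\u2019"])
```

Generating the variants rather than typing them by hand avoids the case where a stopword matches text written with one apostrophe style but not the other.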
2018-05-09 12:04:38 +02:00
Matthew Honnibal
a61fd60681
Fix error in beam gradient calculation
2018-05-09 02:44:09 +02:00
Matthew Honnibal
a6ae1ee6f7
Don't modify Token in global scope
2018-05-09 00:43:00 +02:00
Matthew Honnibal
f94f721f40
Avoid importing fused token symbol in ud-run-test, until that's added
2018-05-09 00:28:03 +02:00
Matthew Honnibal
659ec5b975
Avoid importing fused token symbol in ud-run-test, until that's added
2018-05-08 19:40:33 +02:00
Matthew Honnibal
4cb0494bef
Bug fixes to beam search after refactor
2018-05-08 13:48:50 +02:00
Matthew Honnibal
5ed71973b3
Add a keyword argument sink to GoldParse
2018-05-08 13:48:32 +02:00
Matthew Honnibal
8cfe326f87
Avoid relying on final gold check in beam search
2018-05-08 13:48:19 +02:00
Matthew Honnibal
fc4dd49b77
Support oracle segmentation in ud-train CLI command
2018-05-08 13:47:45 +02:00
Matthew Honnibal
c49e44349a
Fix beam parsing
2018-05-08 02:53:24 +02:00
Matthew Honnibal
99649d114d
Fix parser
2018-05-08 00:27:26 +02:00
Matthew Honnibal
8a82367a9d
Fix beam search after refactor
2018-05-08 00:20:33 +02:00
Matthew Honnibal
5a0f26be0c
Re-add beam search after refactor
2018-05-08 00:19:52 +02:00
ines
7a3599c21a
Fix formatting and consistency
2018-05-07 23:02:11 +02:00
ines
37facf9b4d
Add config for no-response [ci skip]
2018-05-07 22:04:54 +02:00
ines
ac25bc4016
Add docs section on sentence segmentation [ci skip]
2018-05-07 21:25:20 +02:00
ines
14148cd147
Fix formatting and wording
2018-05-07 21:24:35 +02:00
ines
f803da609f
Add scattertext [ci skip]
2018-05-07 19:10:23 +02:00
ines
a685fff875
Merge branch 'master' of https://github.com/explosion/spaCy
2018-05-07 18:58:57 +02:00
Matthew Honnibal
36b2c9bdd5
Fix refactored parser
2018-05-07 18:58:09 +02:00
ines
e2241c797c
Add lock-threads configuration [ci skip]
2018-05-07 18:54:22 +02:00
Matthew Honnibal
bde3be1ad1
Fix refactored parser
2018-05-07 18:31:04 +02:00
B!
414f5270b3
B Cavello's signed Contributor Agreement v2 ( #2302 )
...
This time hopefully created in the right spot. (Sorry about that!)
2018-05-07 17:48:54 +02:00
Matthew Honnibal
01c4e13b02
Update test
2018-05-07 16:59:52 +02:00
Matthew Honnibal
f6cdafc00e
Fix refactored parser
2018-05-07 16:59:38 +02:00
Matthew Honnibal
3e3771c010
Compile updated parser
2018-05-07 15:54:27 +02:00
Matthew Honnibal
f56bd4736b
Improve dynamic oracle when values are missing in parse
2018-05-07 15:53:18 +02:00
Matthew Honnibal
eddc0e0c74
Set gold.sent_starts in ud_train
2018-05-07 15:52:47 +02:00
Matthew Honnibal
bf19f22340
Allow gold.sent_starts to be set from Python
2018-05-07 15:51:34 +02:00
Matthew Honnibal
7f163442e6
Work on refactoring greedy parser
2018-05-07 15:45:52 +02:00
Matt Upson
9a1d3b63fb
Add missing default to .set_extension ( #2297 )
...
Failing to set a default, method, or getter results in a ValueError:
ValueError: [E083] Error setting extension: only one of `default`, `method`, or `getter` (plus optional `setter`) is allowed. Got: 0
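The rule behind this error message can be illustrated with a small stand-alone check. The function below is a hypothetical sketch of the "exactly one of `default`, `method`, or `getter`" rule, written for illustration; it is not spaCy's implementation, and `check_extension_args` and `_UNSET` are made-up names.

```python
# Sketch only: mimic the "exactly one of default/method/getter" rule.
# A sentinel distinguishes "no default passed" from "default=None".
_UNSET = object()


def check_extension_args(default=_UNSET, method=None, getter=None, setter=None):
    """Raise ValueError unless exactly one of default/method/getter is given."""
    nr_defined = sum([
        default is not _UNSET,
        method is not None,
        getter is not None,
    ])
    if nr_defined != 1:
        raise ValueError(
            "Error setting extension: only one of `default`, `method`, or "
            "`getter` (plus optional `setter`) is allowed. "
            "Got: {}".format(nr_defined)
        )
    return nr_defined
```

Under this rule, passing no arguments reports `Got: 0`, which matches the message quoted above, while passing an explicit default (even `None`) satisfies the check.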
2018-05-04 18:47:01 +02:00
ines
929a01139a
Order issue templates
2018-05-04 03:04:41 +02:00
Ines Montani
7f39c8896b
Update issue templates ( #2295 )
...
* Update issue templates
* Update templates
2018-05-04 03:02:26 +02:00
Douglas Knox
9b49a40f4e
Test and fix for Issue #2219 ( #2272 )
...
Test and fix for Issue #2219: Token.similarity() failed on a single letter
2018-05-03 18:40:46 +02:00