Matthew Honnibal
bde3be1ad1
Fix refactored parser
2018-05-07 18:31:04 +02:00
B!
414f5270b3
B Cavello's signed Contributor Agreement v2 ( #2302 )
...
This time hopefully created in the right spot. (Sorry about that!)
2018-05-07 17:48:54 +02:00
Matthew Honnibal
01c4e13b02
Update test
2018-05-07 16:59:52 +02:00
Matthew Honnibal
f6cdafc00e
Fix refactored parser
2018-05-07 16:59:38 +02:00
Matthew Honnibal
3e3771c010
Compile updated parser
2018-05-07 15:54:27 +02:00
Matthew Honnibal
f56bd4736b
Improve dynamic oracle when values are missing in parse
2018-05-07 15:53:18 +02:00
Matthew Honnibal
eddc0e0c74
Set gold.sent_starts in ud_train
2018-05-07 15:52:47 +02:00
Matthew Honnibal
bf19f22340
Allow gold.sent_starts to be set from Python
2018-05-07 15:51:34 +02:00
Matthew Honnibal
7f163442e6
Work on refactoring greedy parser
2018-05-07 15:45:52 +02:00
Matt Upson
9a1d3b63fb
Add missing default to .set_extension ( #2297 )
...
Failing to set a default, method, or getter results in a ValueError:
ValueError: [E083] Error setting extension: only one of `default`, `method`, or `getter` (plus optional `setter`) is allowed. Got: 0
2018-05-04 18:47:01 +02:00
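Note on the entry above: the rule quoted in the E083 message (exactly one of `default`, `method`, or `getter`, plus an optional `setter`) can be illustrated with a minimal sketch in plain Python. This is not spaCy's implementation; the helper name `check_extension_args` and the `_MISSING` sentinel are hypothetical and exist only for this illustration.

```python
# Hypothetical sketch of the "only one of default/method/getter" rule.
# Not spaCy's code: names and structure are assumptions for illustration.
_MISSING = object()  # sentinel, so that default=None still counts as "given"


def check_extension_args(default=_MISSING, method=_MISSING,
                         getter=_MISSING, setter=_MISSING):
    """Return the name of the single option that was supplied.

    Raises ValueError when zero or more than one of `default`,
    `method`, `getter` is given. `setter` is optional and not counted.
    """
    options = (("default", default), ("method", method), ("getter", getter))
    given = [name for name, value in options if value is not _MISSING]
    if len(given) != 1:
        raise ValueError(
            "only one of default, method, or getter "
            "(plus optional setter) is allowed. Got: %d" % len(given)
        )
    return given[0]
```

Under this sketch, calling the helper with no arguments raises the error with "Got: 0", which matches the situation the commit fixes by supplying a default.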
ines
929a01139a
Order issue templates
2018-05-04 03:04:41 +02:00
Ines Montani
7f39c8896b
Update issue templates ( #2295 )
...
* Update issue templates
* Update templates
2018-05-04 03:02:26 +02:00
Douglas Knox
9b49a40f4e
Test and fix for Issue #2219 ( #2272 )
...
Test and fix for Issue #2219 : Token.similarity() failed if single letter
2018-05-03 18:40:46 +02:00
Paul O'Leary McCann
bd72fbf09c
Port Japanese mecab tokenizer from v1 ( #2036 )
...
* Port Japanese mecab tokenizer from v1
This brings the Mecab-based Japanese tokenization introduced in #1246 to
spaCy v2. There isn't a JapaneseTagger implementation yet, but POS tag
information from Mecab is stored in a token extension. A tag map is also
included.
As a reminder, Mecab is required because Universal Dependencies are
based on Unidic tags, and Janome doesn't support Unidic.
Things to check:
1. Is this the right way to use a token extension?
2. What's the right way to implement a JapaneseTagger? The approach in
#1246 relied on `tag_from_strings` which is just gone now. I guess the
best thing is to just try training spaCy's default Tagger?
-POLM
* Add tagging/make_doc and tests
2018-05-03 18:38:26 +02:00
G.Pruvost
cc8e804648
#2211 - Support for ssl certs config on download command ( #2212 )
...
* Add support for SSL/Certs customization on download CLI
* Add a note on SSL options for the 'download' CLI in the README
* Add contributor agreement
2018-05-03 18:37:02 +02:00
Jens Dahl Møllerhøj
b9290397fb
rename SP to _SP ( #2289 )
2018-05-03 18:33:49 +02:00
ines
c9547b7b8b
Update Juniper (see #2293 )
2018-05-03 15:36:02 +02:00
Matthew Honnibal
a8e70a4187
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-03 14:02:10 +02:00
Matthew Honnibal
c0e596283b
Set version to 2.1.0a0
2018-05-03 14:00:11 +02:00
Alex Villarreal
647f2544c5
Fix code sample for span.set_extension ( #2286 )
2018-05-03 00:39:22 +02:00
Matthew Honnibal
8cd06cc763
Try to fix root-outside-sentence bug
2018-05-02 14:39:48 +00:00
Matthew Honnibal
acebd01033
Set children from heads in finalize doc
2018-05-02 14:19:22 +00:00
Alex Villarreal
13d562e1a4
Fix code sample for Doc.set_extension ( #2282 )
...
* Fix code sample for `set_extension`
The previous sample code for `set_extension` fails the assertion at the end, because `city_getter` checked whether the whole document text matched any of the city names. Now it checks whether any of the city names is contained in the document text.
* Contributor agreement
2018-05-02 10:16:05 +02:00
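Note on the entry above: the difference between the two getter behaviours described in the commit body can be sketched in plain Python. The city names, the sample text, and both function names are assumptions made for this illustration; the real sample operates on a `Doc` and its text, whereas this sketch uses plain strings.

```python
# Hypothetical illustration of whole-text matching vs. substring matching.
# CITIES, the sample text, and the function names are assumptions.
CITIES = ("New York", "Paris", "Berlin")


def whole_text_getter(text):
    # Old behaviour as described: True only if the entire text is a city name.
    return text in CITIES


def contains_city_getter(text):
    # Fixed behaviour as described: True if any city name occurs in the text.
    return any(city in text for city in CITIES)


sample = "I like New York"
old_result = whole_text_getter(sample)
new_result = contains_city_getter(sample)
```

With a sentence that merely mentions a city, the whole-text check returns False while the containment check returns True, which is why an assertion at the end of the earlier sample would fail.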
Matthew Honnibal
569440a6db
Don't normalize gradient by batch size
2018-05-02 08:42:10 +02:00
Matthew Honnibal
281e29cbcd
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-05-02 01:36:23 +00:00
Matthew Honnibal
2338e8c7fc
Update develop from master
2018-05-02 01:36:12 +00:00
Matthew Honnibal
9d147e12c4
Merge remote-tracking branch 'origin/master' into develop
2018-05-01 18:18:51 +02:00
Matthew Honnibal
8562faeb39
Fix conll2017 fab command
2018-05-01 18:04:58 +02:00
Matthew Honnibal
116ae46802
Improve experiment management
2018-05-01 17:51:22 +02:00
Matthew Honnibal
6d0fe67b72
Constrain subtok label to adjacent tokens
2018-05-01 17:34:27 +02:00
Matthew Honnibal
8f21953fc5
Constrain subtok to adjacent words
2018-05-01 17:29:00 +02:00
Matthew Honnibal
b43bfd3524
Fix arc-eager oracle tests
2018-05-01 16:16:14 +02:00
Matthew Honnibal
31ed64e9b0
Fix textcat test
2018-05-01 15:18:39 +02:00
Matthew Honnibal
548bdff943
Update default Adam settings
2018-05-01 15:18:20 +02:00
Matthew Honnibal
adbb1f7533
Add better arc-eager oracle tests
2018-05-01 15:14:55 +02:00
Matthew Honnibal
697bcaa34f
Add some methods to ArcEager that make testing easier
2018-05-01 15:13:14 +02:00
Matthew Honnibal
a5f6d69f8a
Require new dev build of Thinc
2018-05-01 15:05:00 +02:00
Mr Roboto
6f5ccda19c
Addresses Issue #2228 - Deserialization fails when using tensor=False or sentiment=False ( #2230 )
...
* Fixes issue #2228
* Adds a new contributor
2018-05-01 13:40:22 +02:00
Matthew Honnibal
d44bb45c72
Fix scoring if tokenization changes
2018-05-01 01:33:20 +02:00
Shirish Kadam
d98a90440f
Added Adam project to spaCy Universe ( #2275 )
...
* Added 5hirish to contributors
* Added Adam Qas Project to spaCy Universe
* Remove $ from code example
2018-04-30 22:25:01 +02:00
ines
56e7faf16b
Fix spacing
2018-04-30 22:24:40 +02:00
ines
6efb4cdf88
Use Juniper and tidy up
2018-04-30 18:48:35 +02:00
Matthew Honnibal
2b26c007cd
Revert "Disable batch size compounding in ud-train"
...
This reverts commit 8a120fb455.
2018-04-29 14:09:02 +00:00
Matthew Honnibal
723b328062
Add script to run UD test
2018-04-29 15:50:25 +02:00
Matthew Honnibal
17af6aa3a4
Update ud_train script
2018-04-29 15:49:32 +02:00
Matthew Honnibal
5de8a36537
Fix arc_eager is_nonproj_tree
2018-04-29 15:49:11 +02:00
Matthew Honnibal
5260268f70
Fix textcat after merge
2018-04-29 15:48:53 +02:00
Matthew Honnibal
ad3d56c3ba
Fix compile error in matcher
2018-04-29 15:48:34 +02:00
Matthew Honnibal
a8bc947fd4
Fix Token.set_extension
2018-04-29 15:48:19 +02:00
Matthew Honnibal
535833a38d
Fix syntax error in setup.py
2018-04-29 15:47:54 +02:00