Matthew Honnibal
82810b9846
Update morphologizer
2020-06-09 23:32:07 +02:00
Matthew Honnibal
af1b5f129b
Use new example class in GoldCorpus
2020-06-09 23:31:19 +02:00
Matthew Honnibal
0714f1fa5c
Remove the 'pass example into __call__' thing
2020-06-09 23:30:06 +02:00
Matthew Honnibal
b3868cd1f8
Update NewExample
2020-06-09 23:06:48 +02:00
Matthew Honnibal
ccd332a9fc
Update test stubs
2020-06-09 15:49:04 +02:00
adrianeboyd
0a70bd6281
Bump version to 2.3.0.dev1 (#5567)
2020-06-09 15:47:31 +02:00
Matthew Honnibal
a20ac36bb7
Compile new modules
2020-06-09 15:44:17 +02:00
Matthew Honnibal
04569c0b3e
Fix import
2020-06-09 15:44:08 +02:00
Matthew Honnibal
f4caaa8ad9
Update alignment
2020-06-09 15:43:57 +02:00
Matthew Honnibal
b5ef397639
Add header for align.pxd
2020-06-09 15:43:48 +02:00
Matthew Honnibal
793092d2d8
Fix renaming in GoldCorpus
2020-06-09 15:43:38 +02:00
Matthew Honnibal
36d49a0f13
Fix NewExample class
2020-06-09 15:43:19 +02:00
Matthew Honnibal
f1189dc205
Draft tests for new Example class
2020-06-09 15:43:08 +02:00
Matthew Honnibal
c833ebe1ad
Start tests for new example class
2020-06-09 15:29:05 +02:00
Matthew Honnibal
453cfa14d0
Start drafting new example class
2020-06-09 15:28:42 +02:00
Matthew Honnibal
449000c234
Fix gold_io
2020-06-09 12:43:53 +02:00
Matthew Honnibal
cb08ce3936
Move alignment into Cython
2020-06-09 12:40:41 +02:00
Matthew Honnibal
20a1bdb298
Fix train
2020-06-09 12:33:29 +02:00
Matthew Honnibal
549164c31c
Fix corpus when no raw text supplied
2020-06-09 12:33:14 +02:00
adrianeboyd
b7e6e1b9a7
Disable sentence segmentation in ja tokenizer (#5566)
2020-06-09 12:00:59 +02:00
Sofie Van Landeghem
86112d2168
update issue manager's version
2020-06-09 08:57:38 +02:00
Matthew Honnibal
d9289712ba
* Make GoldCorpus return dict, not Example
* Make Example require a Doc object (previously optional)
Clarify methods in GoldCorpus
WIP refactor Example
Refactor Example.split_sents
Fix test
Fix augment
Update test
Update test
Fix import
Update test_scorer
Update Example
2020-06-09 01:01:59 +02:00
Matthew Honnibal
084271c9e9
Remove GoldParse from public API
* Move get_parses_from_example to spacy.syntax
* Get GoldParse out of Example
* Avoid expecting GoldParse input in parser
* Add Alignment to spacy.gold.align
* Update Example object
* Add comment
* Update pipeline
* Fix imports
* Simplify gold_io
* WIP on GoldCorpus
* Update test
* Xfail some gold tests
* Remove ignore_misaligned option from GoldCorpus
* Fix Example constructor
* Update test
* Fix usage of Example
* Add deprecated_get_gold method on Example
* Patch scorer
* Fix test
* Fix test
* Update tests
* Xfail a test
* Fix passing of make_projective
* Pass make_projective by default
* Hack data format in Example.from_dict
* Update tests
* Fix example.from_dict
* Update morphologizer
* Fix entity linker
* Add get_field to TokenAnnotation
* Fix Example.get_aligned
* Update test
* Fix alignment
* Fix corpus
* Fix GoldCorpus
* Handle misaligned
* Format
* Fix missing import
2020-06-08 22:09:57 +02:00
adrianeboyd
f162815f45
Handle empty and whitespace-only docs for Japanese (#5564)
Handle empty and whitespace-only docs in the custom alignment method
used by the Japanese tokenizer.
2020-06-08 21:09:23 +02:00
Martino Mensio
de00f967ce
adding spacy-universal-sentence-encoder (#5534)
* adding spacy-universal-sentence-encoder
* update affiliation
* updated code example
2020-06-08 20:26:30 +02:00
Sofie Van Landeghem
d1799da200
bot for answered issues (#5563)
* add tiangolo's issue manager
* fix formatting
* spaces, tabs, who knows
* formatting
* I'll get this right at some point
* maybe one more space ?
2020-06-08 19:47:32 +02:00
adrianeboyd
3bf111585d
Update Japanese tokenizer config and add serialization (#5562)
* Use `config` dict for tokenizer settings
* Add serialization of split mode setting
* Add tests for tokenizer split modes and serialization of split mode setting
Based on #5561
2020-06-08 16:29:05 +02:00
Hiroshi Matsuda
456bf47f51
fix a bug causing mis-alignments (#5560)
2020-06-08 15:49:34 +02:00
Matthew Honnibal
b69fa77ccc
Add missing inits
2020-06-06 15:38:46 +02:00
Matthew Honnibal
6e87ca1f45
Fix imports
2020-06-06 15:36:58 +02:00
Matthew Honnibal
53b00991fd
Fix imports
2020-06-06 15:36:46 +02:00
Matthew Honnibal
74204116a3
Rename _gold -> gold
2020-06-06 15:29:32 +02:00
Matthew Honnibal
7f135736f4
Fix imports
2020-06-06 15:28:52 +02:00
Matthew Honnibal
17533a9286
Format
2020-06-06 15:13:07 +02:00
Matthew Honnibal
0f9b4bbfea
Fix imports
2020-06-06 15:12:52 +02:00
Matthew Honnibal
866179350b
Fix import
2020-06-06 15:11:13 +02:00
Matthew Honnibal
3baa1ada03
Refactor spacy.gold
2020-06-06 15:10:33 +02:00
Matthew Honnibal
1d2e39d974
Support to_dict in Doc
2020-06-06 15:10:10 +02:00
Matthew Honnibal
7b873ce2b1
Move GoldParse under spacy.syntax
2020-06-06 15:09:43 +02:00
Matthew Honnibal
32c8fb1372
Add gold_io.pyx
2020-06-06 14:41:49 +02:00
Matthew Honnibal
156466ca69
Add iob_utils
2020-06-06 14:39:14 +02:00
Matthew Honnibal
53e6473e24
Add to/from dict helpers
2020-06-06 14:29:06 +02:00
Matthew Honnibal
a663d44b1b
Add GoldCorpus
2020-06-06 14:28:37 +02:00
Matthew Honnibal
1fb8fc6ea9
Add Example class
2020-06-06 14:24:35 +02:00
Matthew Honnibal
cce6a51a9c
Add annotation classes
2020-06-06 14:22:27 +02:00
Matthew Honnibal
6005b94e74
Add data augmentation
2020-06-06 14:19:06 +02:00
Matthew Honnibal
fcb4f7a6db
Start breaking down gold.pyx
2020-06-06 14:15:12 +02:00
adrianeboyd
009119fa66
Requirements/setup for Japanese (#5553)
* Add sudachipy and sudachidict_core to Makefile
* Switch ja requirements from fugashi to sudachipy
2020-06-06 00:22:18 +02:00
Ines Montani
d93cbeb14f
Add warning for loose version constraints (#5536)
* Add warning for loose version constraints
* Update wording [ci skip]
* Tweak error message
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-06-05 12:42:15 +02:00
adrianeboyd
1ac43d78f9
Avoid libc.stdint for UINT64_MAX (#5545)
2020-06-04 20:02:05 +02:00