Matthew Honnibal
e634bae69e
Fix Corpus
2020-06-22 00:54:38 +02:00
Matthew Honnibal
3354758351
Remove Example.doc property
...
Remove Example.doc
2020-06-22 00:54:38 +02:00
Matthew Honnibal
7d329cd1ac
Add kwargs to Corpus.dev_dataset to match train_dataset
2020-06-22 00:54:38 +02:00
Matthew Honnibal
59098a5f62
Add get_aligned_parse method in Example
...
Fix Example.get_aligned_parse
2020-06-22 00:54:38 +02:00
Matthew Honnibal
75a5f2d499
Remove GoldCorpus
...
Update imports
Update after removing GoldCorpus
Fix module name of corpus
Fix import
2020-06-22 00:54:38 +02:00
Matthew Honnibal
17226a60ac
Draft Corpus class for DocBin
...
Update Corpus
Fix Corpus
2020-06-22 00:51:22 +02:00
Matthew Honnibal
a5ebfb20f5
Serialize all attrs by default
...
Move converters under spacy.gold
Move things around
Fix naming
Fix name
Update converter to produce DocBin
Update converters
Make spacy convert output docbin
Fix import
Fix docbin
Fix import
Update converter
Remove jsonl converter
Add json2docs converter
2020-06-22 00:46:08 +02:00
svlandeg
6d5bfd6f6a
fix test checking for variants
2020-06-22 00:46:08 +02:00
svlandeg
5477bf054f
add links to to_dict
2020-06-22 00:46:08 +02:00
Matthew Honnibal
03db143cd0
Draft new GoldCorpus class
2020-06-19 04:15:02 +02:00
svlandeg
1951921230
implement split_sent with aligned SENT_START attribute
2020-06-18 19:41:53 +02:00
svlandeg
d1d6f16776
fix the fix
2020-06-18 19:15:32 +02:00
svlandeg
e822367cf7
prevent writing dummy values like deps because that could interfere with sent_start values
2020-06-18 17:47:59 +02:00
svlandeg
0c6f1f3891
fix BiluoPushDown parsing entities
2020-06-18 13:00:03 +02:00
svlandeg
40b2b21eef
small bug fix
2020-06-17 23:33:51 +02:00
svlandeg
6d73e139b0
fix entity linker
2020-06-17 21:12:25 +02:00
svlandeg
10d396977e
add support for MORPH in to/from_array, fix morphologizer overfitting test
2020-06-17 17:48:07 +02:00
svlandeg
1a151b10d6
correct silly typo
2020-06-17 14:48:14 +02:00
svlandeg
2d9f406188
fix test_cli
2020-06-17 14:42:48 +02:00
svlandeg
f7ad8e8c83
various fixes in scripts - needs to be further tested
2020-06-17 12:05:58 +02:00
svlandeg
3c4f9e4cc4
fix augment (needs further testing)
2020-06-17 10:46:29 +02:00
svlandeg
4ed399c848
minibatch utility can deal with strings, docs or examples
2020-06-16 21:35:55 +02:00
svlandeg
8b66c11ff2
add spaces to json output format
2020-06-16 19:30:03 +02:00
svlandeg
ba80ad7efd
fixed some tests + WIP roundtrip unit test
2020-06-16 18:26:50 +02:00
svlandeg
43d41d6bb6
allow None as BILUO annotation
2020-06-16 15:30:05 +02:00
svlandeg
44a0f9c2c8
test_gold_biluo_different_tokenization works
2020-06-16 15:21:20 +02:00
svlandeg
1c35b8efcd
fix spaces
2020-06-16 12:08:25 +02:00
svlandeg
6fea5fa4bd
attempt to fix cases with weird spaces
2020-06-16 11:52:29 +02:00
svlandeg
a28f8f369e
Fix many-to-one IOB codes
2020-06-15 23:06:22 +02:00
svlandeg
12886b787b
fixing NER one-to-many alignment
2020-06-15 22:44:17 +02:00
Matthew Honnibal
a0bf73a5dd
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
2020-06-15 18:16:01 +02:00
Matthew Honnibal
c95494739c
Fix import
2020-06-15 18:11:10 +02:00
Matthew Honnibal
8f978f2031
Fix import
2020-06-15 18:10:47 +02:00
svlandeg
68986a252e
additional tests for new get_aligned function
2020-06-15 17:42:40 +02:00
svlandeg
41d29983a7
start testing get_aligned
2020-06-15 17:16:01 +02:00
svlandeg
fd5f199feb
fixing language and scoring tests
2020-06-15 15:02:05 +02:00
svlandeg
a48553c1ed
fix error numbers
2020-06-15 08:51:31 +02:00
Matthew Honnibal
98ca14f577
Remove GoldParse
...
WIP on removing goldparse
Get ArcEager compiling after GoldParse excise
Update setup.py
Get spacy.syntax compiling after removing GoldParse
Rename NewExample -> Example and clean up
Clean html files
Start updating tests
Update Morphologizer
2020-06-14 19:53:30 +02:00
Matthew Honnibal
3a0bbcfb4c
Add biluo_tags_from_doc function
2020-06-13 23:10:54 +02:00
Matthew Honnibal
caa7508725
Draft missing NewExample stuff
2020-06-13 23:10:21 +02:00
Matthew Honnibal
5564314d32
Suggest approach for GoldParse
2020-06-13 15:43:35 +02:00
Matthew Honnibal
b078b05ecd
Handle various data better in NewExample
2020-06-13 15:30:12 +02:00
svlandeg
face0de74f
fix MORPH conversion + enable unit test
2020-06-12 16:29:09 +02:00
svlandeg
a5ee082da1
cats bugfix
2020-06-12 15:49:38 +02:00
svlandeg
880dccf93e
entities on doc_annotation, parse links and check their offsets against the entities. unit test works
2020-06-12 15:47:20 +02:00
svlandeg
3aed177a35
fix ENT_IOB conversion and enable unit test
2020-06-12 11:30:24 +02:00
svlandeg
6a67a11682
adding tests for new example class (some still failing - WIP)
2020-06-11 17:43:40 +02:00
Matthew Honnibal
337d2b5ad6
Fix sent start in NewExample
2020-06-09 23:58:16 +02:00
Matthew Honnibal
af1b5f129b
Use new example class in GoldCorpus
2020-06-09 23:31:19 +02:00
Matthew Honnibal
b3868cd1f8
Update NewExample
2020-06-09 23:06:48 +02:00