Matthew Honnibal
|
2efe01bf26
|
Fix parser declaration
|
2020-06-22 00:54:38 +02:00 |
|
Matthew Honnibal
|
29d39d8a34
|
Update header
|
2020-06-22 00:54:38 +02:00 |
|
Matthew Honnibal
|
456e27dc8b
|
Start debugging arc_eager oracle
|
2020-06-22 00:54:38 +02:00 |
|
Matthew Honnibal
|
b60eede321
|
Fix parser model
|
2020-06-22 00:54:38 +02:00 |
|
Matthew Honnibal
|
17efd6bfec
|
Update train.py
|
2020-06-22 00:54:38 +02:00 |
|
Matthew Honnibal
|
49145b9ec1
|
Update DocBin
Add missing strings when serializing
|
2020-06-22 00:54:35 +02:00 |
|
Matthew Honnibal
|
17226a60ac
|
Draft Corpus class for DocBin
Update Corpus
Fix Corpus
|
2020-06-22 00:51:22 +02:00 |
|
Matthew Honnibal
|
6e7a7ab6da
|
Work on train script
|
2020-06-22 00:48:09 +02:00 |
|
Matthew Honnibal
|
a5ebfb20f5
|
Serialize all attrs by default
Move converters under spacy.gold
Move things around
Fix naming
Fix name
Update converter to produce DocBin
Update converters
Make spacy convert output docbin
Fix import
Fix docbin
Fix import
Update converter
Remove jsonl converter
Add json2docs converter
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
5467cb4aae
|
Allow DocBin to take list of Doc objects.
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
d422f30a18
|
Start updating converters
|
2020-06-22 00:46:12 +02:00 |
|
svlandeg
|
6d5bfd6f6a
|
fix test checking for variants
|
2020-06-22 00:46:08 +02:00 |
|
svlandeg
|
a427ca9355
|
clean up
|
2020-06-22 00:46:08 +02:00 |
|
svlandeg
|
5477bf054f
|
add links to to_dict
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
39117de4f9
|
Fix compile in ArcEager
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
e2279eab1c
|
Make doc.from_array several times faster
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
de32515bf8
|
Allocate Doc before starting to add words
|
2020-06-22 00:46:08 +02:00 |
|
Matthew Honnibal
|
be81577719
|
Fix oracles
|
2020-06-20 02:36:12 +02:00 |
|
Matthew Honnibal
|
03db143cd0
|
Draft new GoldCorpus class
|
2020-06-19 04:15:02 +02:00 |
|
Matthew Honnibal
|
a389866df6
|
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
|
2020-06-19 02:30:27 +02:00 |
|
Matthew Honnibal
|
bd29b7b14f
|
Update parser and NER gold stuff
|
2020-06-19 02:29:16 +02:00 |
|
Matthew Honnibal
|
5ae9e3480d
|
Return ArcEagerGoldParse from ArcEager
|
2020-06-19 00:11:59 +02:00 |
|
svlandeg
|
6ca6d7d6b4
|
test for split sentences with various alignment issues, works
|
2020-06-18 20:01:02 +02:00 |
|
svlandeg
|
1951921230
|
implement split_sent with aligned SENT_START attribute
|
2020-06-18 19:41:53 +02:00 |
|
svlandeg
|
d1d6f16776
|
fix the fix
|
2020-06-18 19:15:32 +02:00 |
|
svlandeg
|
e822367cf7
|
prevent writing dummy values like deps because that could interfere with sent_start values
|
2020-06-18 17:47:59 +02:00 |
|
svlandeg
|
0b6d45eae1
|
various small fixes
|
2020-06-18 15:55:00 +02:00 |
|
svlandeg
|
1c71f2310c
|
fix renames and simple_ner labels
|
2020-06-18 15:33:28 +02:00 |
|
svlandeg
|
64fc840a5d
|
bugfix tok2vec
|
2020-06-18 15:24:40 +02:00 |
|
svlandeg
|
01f9ae774c
|
small fixes
|
2020-06-18 14:01:19 +02:00 |
|
svlandeg
|
0c6f1f3891
|
fix BiluoPushDown parsing entities
|
2020-06-18 13:00:03 +02:00 |
|
svlandeg
|
cd790aaa2a
|
fix parser tests to work with example (most still failing)
|
2020-06-18 11:19:22 +02:00 |
|
svlandeg
|
9f43ba839a
|
throw informative error when running the components with the wrong type of objects
|
2020-06-18 10:36:05 +02:00 |
|
svlandeg
|
6712d0b5db
|
textcat bugfix
|
2020-06-18 10:09:56 +02:00 |
|
svlandeg
|
40b2b21eef
|
small bug fix
|
2020-06-17 23:33:51 +02:00 |
|
svlandeg
|
d6c4dd6eea
|
pipe() takes docs, not examples
|
2020-06-17 21:29:36 +02:00 |
|
svlandeg
|
0f123af35e
|
ensure test keeps working with non-linked entities
|
2020-06-17 21:13:38 +02:00 |
|
svlandeg
|
6d73e139b0
|
fix entity linker
|
2020-06-17 21:12:25 +02:00 |
|
svlandeg
|
be5934b827
|
fix tagger
|
2020-06-17 19:42:11 +02:00 |
|
svlandeg
|
10d396977e
|
add support for MORPH in to/from_array, fix morphologizer overfitting test
|
2020-06-17 17:48:07 +02:00 |
|
svlandeg
|
1a151b10d6
|
correct silly typo
|
2020-06-17 14:48:14 +02:00 |
|
svlandeg
|
f6c451b650
|
cleanup
|
2020-06-17 14:45:54 +02:00 |
|
svlandeg
|
2d9f406188
|
fix test_cli
|
2020-06-17 14:42:48 +02:00 |
|
svlandeg
|
f7ad8e8c83
|
various fixes in scripts - needs to be further tested
|
2020-06-17 12:05:58 +02:00 |
|
svlandeg
|
3c4f9e4cc4
|
fix augment (needs further testing)
|
2020-06-17 10:46:29 +02:00 |
|
svlandeg
|
4ed399c848
|
minibatch utility can deal with strings, docs or examples
|
2020-06-16 21:35:55 +02:00 |
|
svlandeg
|
8b66c11ff2
|
add spaces to json output format
|
2020-06-16 19:30:03 +02:00 |
|
svlandeg
|
ba80ad7efd
|
fixed some tests + WIP roundtrip unit test
|
2020-06-16 18:26:50 +02:00 |
|
svlandeg
|
43d41d6bb6
|
allow None as BILUO annotation
|
2020-06-16 15:30:05 +02:00 |
|
svlandeg
|
44a0f9c2c8
|
test_gold_biluo_different_tokenization works
|
2020-06-16 15:21:20 +02:00 |
|