Matthew Honnibal
fd83551eb5
Skip test causing segfault
2020-06-20 22:11:27 +02:00
Matthew Honnibal
095710e40e
Skip tests that cause crashes
2020-06-20 22:02:32 +02:00
Matthew Honnibal
0b23fd3891
Xfail some tests
2020-06-20 21:52:57 +02:00
Matthew Honnibal
6af99f2f2d
Fix parser declaration
2020-06-20 21:50:17 +02:00
Matthew Honnibal
52edb24f07
Update header
2020-06-20 21:50:06 +02:00
Matthew Honnibal
0c10831b14
Start debugging arc_eager oracle
2020-06-20 21:49:46 +02:00
Matthew Honnibal
2bcb5881d7
Fix parser model
2020-06-20 21:49:31 +02:00
Matthew Honnibal
396dd60b3a
Fix Corpus
2020-06-20 21:49:15 +02:00
Matthew Honnibal
450c6fe39c
Update train.py
2020-06-20 21:49:06 +02:00
Matthew Honnibal
6d821b2e55
Make doc.from_array several times faster
2020-06-20 20:17:13 +02:00
Matthew Honnibal
fa86aa581d
Allocate Doc before starting to add words
2020-06-20 20:15:21 +02:00
Matthew Honnibal
652f31d3ee
Update DocBin
2020-06-20 20:12:54 +02:00
Matthew Honnibal
0a8b6631a2
Update Corpus
2020-06-20 20:12:31 +02:00
Matthew Honnibal
11fa0658f7
Work on train script
2020-06-20 20:12:19 +02:00
Matthew Honnibal
0de361cd00
Draft Corpus class for DocBin
2020-06-20 18:31:07 +02:00
Matthew Honnibal
7360d3db72
Add json2docs converter
2020-06-20 16:02:53 +02:00
Matthew Honnibal
f1756a6a22
Remove jsonl converter
2020-06-20 16:02:40 +02:00
Matthew Honnibal
5d89b1840e
Update converter
2020-06-20 16:00:14 +02:00
Matthew Honnibal
f5780cb160
Serialize all attrs by default
2020-06-20 15:59:39 +02:00
Matthew Honnibal
3241acbe0b
Fix import
2020-06-20 15:56:28 +02:00
Matthew Honnibal
b7a366b435
Fix compile in ArcEager
2020-06-20 15:56:16 +02:00
Matthew Honnibal
91fa2f1126
Fix docbin
2020-06-20 15:56:05 +02:00
Matthew Honnibal
476bcd4c53
Fix import
2020-06-20 15:55:57 +02:00
Matthew Honnibal
7a846921a3
Make spacy convert output docbin
2020-06-20 15:55:35 +02:00
Matthew Honnibal
0d22c6e006
Allow DocBin to take list of Doc objects.
2020-06-20 03:50:36 +02:00
Matthew Honnibal
95df028758
Update converters
2020-06-20 03:50:23 +02:00
Matthew Honnibal
3a73d95dcc
Update converter to produce DocBin
2020-06-20 03:50:13 +02:00
Matthew Honnibal
d9a8fdf4b7
Fix name
2020-06-20 03:26:36 +02:00
Matthew Honnibal
e20a780867
Fix naming
2020-06-20 03:24:49 +02:00
Matthew Honnibal
f61d5e3ac3
Move things around
2020-06-20 03:23:58 +02:00
Matthew Honnibal
c630cfdb5e
Move converters under spacy.gold
2020-06-20 03:20:34 +02:00
Matthew Honnibal
161d8439fa
Start updating converters
2020-06-20 03:19:40 +02:00
Matthew Honnibal
a79f0598a6
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
2020-06-20 02:36:40 +02:00
Matthew Honnibal
be81577719
Fix oracles
2020-06-20 02:36:12 +02:00
svlandeg
e30ec9b2a8
fix test checking for variants
2020-06-19 14:05:35 +02:00
svlandeg
25b0674320
clean up
2020-06-19 11:31:01 +02:00
svlandeg
c705a28438
add links to to_dict
2020-06-19 11:22:24 +02:00
Matthew Honnibal
03db143cd0
Draft new GoldCorpus class
2020-06-19 04:15:02 +02:00
Matthew Honnibal
a389866df6
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
2020-06-19 02:30:27 +02:00
Matthew Honnibal
bd29b7b14f
Update parser and NER gold stuff
2020-06-19 02:29:16 +02:00
Matthew Honnibal
5ae9e3480d
Return ArcEagerGoldParse from ArcEager
2020-06-19 00:11:59 +02:00
svlandeg
6ca6d7d6b4
test for split sentences with various alignment issues, works
2020-06-18 20:01:02 +02:00
svlandeg
1951921230
implement split_sent with aligned SENT_START attribute
2020-06-18 19:41:53 +02:00
svlandeg
d1d6f16776
fix the fix
2020-06-18 19:15:32 +02:00
svlandeg
e822367cf7
prevent writing dummy values like deps because that could interfere with sent_start values
2020-06-18 17:47:59 +02:00
svlandeg
0b6d45eae1
various small fixes
2020-06-18 15:55:00 +02:00
svlandeg
1c71f2310c
fix renames and simple_ner labels
2020-06-18 15:33:28 +02:00
svlandeg
64fc840a5d
bugfix tok2vec
2020-06-18 15:24:40 +02:00
svlandeg
01f9ae774c
small fixes
2020-06-18 14:01:19 +02:00
svlandeg
0c6f1f3891
fix BiluoPushDown parsing entities
2020-06-18 13:00:03 +02:00
svlandeg
cd790aaa2a
fix parser tests to work with example (most still failing)
2020-06-18 11:19:22 +02:00
svlandeg
9f43ba839a
throw informative error when running the components with the wrong type of objects
2020-06-18 10:36:05 +02:00
svlandeg
6712d0b5db
textcat bugfix
2020-06-18 10:09:56 +02:00
svlandeg
40b2b21eef
small bug fix
2020-06-17 23:33:51 +02:00
svlandeg
d6c4dd6eea
pipe() takes docs, not examples
2020-06-17 21:29:36 +02:00
svlandeg
0f123af35e
ensure test keeps working with non-linked entities
2020-06-17 21:13:38 +02:00
svlandeg
6d73e139b0
fix entity linker
2020-06-17 21:12:25 +02:00
svlandeg
be5934b827
fix tagger
2020-06-17 19:42:11 +02:00
svlandeg
10d396977e
add support for MORPH in to/from_array, fix morphologizer overfitting test
2020-06-17 17:48:07 +02:00
svlandeg
1a151b10d6
correct silly typo
2020-06-17 14:48:14 +02:00
svlandeg
f6c451b650
cleanup
2020-06-17 14:45:54 +02:00
svlandeg
2d9f406188
fix test_cli
2020-06-17 14:42:48 +02:00
svlandeg
f7ad8e8c83
various fixes in scripts - needs to be further tested
2020-06-17 12:05:58 +02:00
svlandeg
3c4f9e4cc4
fix augment (needs further testing)
2020-06-17 10:46:29 +02:00
svlandeg
4ed399c848
minibatch utility can deal with strings, docs or examples
2020-06-16 21:35:55 +02:00
svlandeg
8b66c11ff2
add spaces to json output format
2020-06-16 19:30:03 +02:00
svlandeg
ba80ad7efd
fixed some tests + WIP roundtrip unit test
2020-06-16 18:26:50 +02:00
svlandeg
43d41d6bb6
allow None as BILUO annotation
2020-06-16 15:30:05 +02:00
svlandeg
44a0f9c2c8
test_gold_biluo_different_tokenization works
2020-06-16 15:21:20 +02:00
svlandeg
1c35b8efcd
fix spaces
2020-06-16 12:08:25 +02:00
svlandeg
6fea5fa4bd
attempt to fix cases with weird spaces
2020-06-16 11:52:29 +02:00
svlandeg
0702a1d3fb
fix test for misaligned
2020-06-15 23:10:47 +02:00
svlandeg
a28f8f369e
Fix many-to-one IOB codes
2020-06-15 23:06:22 +02:00
svlandeg
12886b787b
fixing NER one-to-many alignment
2020-06-15 22:44:17 +02:00
Matthew Honnibal
a0bf73a5dd
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
2020-06-15 18:16:01 +02:00
Matthew Honnibal
c66f93299e
Remove TokenAnnotation code from nonproj
2020-06-15 18:14:47 +02:00
Matthew Honnibal
c95494739c
Fix import
2020-06-15 18:11:10 +02:00
Matthew Honnibal
8f978f2031
Fix import
2020-06-15 18:10:47 +02:00
Matthew Honnibal
95de7efaad
Draft create_gold_state for arc_eager oracle
2020-06-15 18:10:19 +02:00
svlandeg
68986a252e
additional tests for new get_aligned function
2020-06-15 17:42:40 +02:00
svlandeg
41d29983a7
start testing get_aligned
2020-06-15 17:16:01 +02:00
svlandeg
fd5f199feb
fixing language and scoring tests
2020-06-15 15:02:05 +02:00
svlandeg
b4d914ec77
fix error catching
2020-06-15 12:56:32 +02:00
svlandeg
b9c9cbb2cd
informative error when calling to_array with wrong field
2020-06-15 11:53:31 +02:00
svlandeg
ff231e1cdd
fix merge conflict
2020-06-15 09:04:19 +02:00
svlandeg
a48553c1ed
fix error numbers
2020-06-15 08:51:31 +02:00
Matthew Honnibal
3c0fc10dc4
Remove beam for now (maybe)
Remove beam_utils
Update setup.py
Remove beam
2020-06-14 19:53:29 +02:00
Matthew Honnibal
98ca14f577
Remove GoldParse
WIP on removing goldparse
Get ArcEager compiling after GoldParse excise
Update setup.py
Get spacy.syntax compiling after removing GoldParse
Rename NewExample -> Example and clean up
Clean html files
Start updating tests
Update Morphologizer
2020-06-14 19:53:30 +02:00
Matthew Honnibal
d53723aa4f
Merge from whatif/arrow
2020-06-14 17:43:59 +02:00
Matthew Honnibal
380cce9d8b
Update errors
2020-06-14 17:40:05 +02:00
Matthew Honnibal
706e652820
Merge from develop
2020-06-14 17:35:01 +02:00
Matthew Honnibal
9296d71a54
More GoldParse excise
2020-06-14 17:26:54 +02:00
Matthew Honnibal
60d4e5a9e0
WIP on updating transition-system
2020-06-14 17:22:14 +02:00
Matthew Honnibal
7d65615625
WIP start excising GoldParse
2020-06-14 17:11:41 +02:00
Matthew Honnibal
4362ec7084
Hack Language.evaluate
2020-06-13 23:37:42 +02:00
Matthew Honnibal
7de997c0a5
Update test
2020-06-13 23:11:45 +02:00
Matthew Honnibal
8f941ef527
Update GoldParse
2020-06-13 23:11:29 +02:00
Matthew Honnibal
3a0bbcfb4c
Add biluo_tags_from_doc function
2020-06-13 23:10:54 +02:00
Matthew Honnibal
caa7508725
Draft missing NewExample stuff
2020-06-13 23:10:21 +02:00
Matthew Honnibal
3eb8f3867e
Update test
2020-06-13 23:05:16 +02:00