Matthew Honnibal | e2279eab1c | Make doc.from_array several times faster | 2020-06-22 00:46:08 +02:00
Matthew Honnibal | de32515bf8 | Allocate Doc before starting to add words | 2020-06-22 00:46:08 +02:00
Matthew Honnibal | 6670c44390 | Unskip tests | 2020-06-21 01:17:52 +02:00
Matthew Honnibal | 90d9f04e0b | Unskip | 2020-06-21 01:16:33 +02:00
Matthew Honnibal | 2b180ea033 | Update test | 2020-06-21 01:15:41 +02:00
Matthew Honnibal | 192b94f0a1 | Remove beam test | 2020-06-21 01:15:12 +02:00
Matthew Honnibal | 9db66ddd48 | Update test_arc_eager_oracle | 2020-06-21 01:12:28 +02:00
Matthew Honnibal | 7544c21f5b | Update transition system | 2020-06-21 01:12:05 +02:00
Matthew Honnibal | 318a046fb0 | Restore ArcEager.get_cost function | 2020-06-21 01:11:08 +02:00
Matthew Honnibal | e90341810c | Update arc_eager oracle | 2020-06-21 01:04:02 +02:00
Matthew Honnibal | c58deb3546 | Work on parser oracle | 2020-06-21 01:01:09 +02:00
Matthew Honnibal | 914924a68b | Fix mimport | 2020-06-20 22:22:40 +02:00
Matthew Honnibal | 2791c1c0dc | Fix module name of corpus | 2020-06-20 22:22:14 +02:00
Matthew Honnibal | 4bbc277758 | Update after removing GoldCorpus | 2020-06-20 22:21:24 +02:00
Matthew Honnibal | 64d00520e2 | Update imports | 2020-06-20 22:21:08 +02:00
Matthew Honnibal | cfd024536d | Remove GoldCorpus | 2020-06-20 22:13:37 +02:00
Matthew Honnibal | fd83551eb5 | Skip test causing segfault | 2020-06-20 22:11:27 +02:00
Matthew Honnibal | 095710e40e | Skip tests that cause crashes | 2020-06-20 22:02:32 +02:00
Matthew Honnibal | 0b23fd3891 | Xfail some tests | 2020-06-20 21:52:57 +02:00
Matthew Honnibal | 6af99f2f2d | Fix parser declaration | 2020-06-20 21:50:17 +02:00
Matthew Honnibal | 52edb24f07 | Update header | 2020-06-20 21:50:06 +02:00
Matthew Honnibal | 0c10831b14 | Start debugging arc_eager oracle | 2020-06-20 21:49:46 +02:00
Matthew Honnibal | 2bcb5881d7 | Fix parser model | 2020-06-20 21:49:31 +02:00
Matthew Honnibal | 396dd60b3a | Fix Corpus | 2020-06-20 21:49:15 +02:00
Matthew Honnibal | 450c6fe39c | Update train.py | 2020-06-20 21:49:06 +02:00
Matthew Honnibal | 6d821b2e55 | Make doc.from_array several times faster | 2020-06-20 20:17:13 +02:00
Matthew Honnibal | fa86aa581d | Allocate Doc before starting to add words | 2020-06-20 20:15:21 +02:00
Matthew Honnibal | 652f31d3ee | Update DocBin | 2020-06-20 20:12:54 +02:00
Matthew Honnibal | 0a8b6631a2 | Update Corpus | 2020-06-20 20:12:31 +02:00
Matthew Honnibal | 11fa0658f7 | Work on train script | 2020-06-20 20:12:19 +02:00
Matthew Honnibal | 0de361cd00 | Draft Corpus class for DocBin | 2020-06-20 18:31:07 +02:00
Matthew Honnibal | 7360d3db72 | Add json2docs converter | 2020-06-20 16:02:53 +02:00
Matthew Honnibal | f1756a6a22 | Remove jsonl converter | 2020-06-20 16:02:40 +02:00
Matthew Honnibal | 5d89b1840e | Update converter | 2020-06-20 16:00:14 +02:00
Matthew Honnibal | f5780cb160 | Serialize all attrs by default | 2020-06-20 15:59:39 +02:00
Matthew Honnibal | 3241acbe0b | Fix import | 2020-06-20 15:56:28 +02:00
Matthew Honnibal | b7a366b435 | Fix compile in ArcEager | 2020-06-20 15:56:16 +02:00
Matthew Honnibal | 91fa2f1126 | Fix docbin | 2020-06-20 15:56:05 +02:00
Matthew Honnibal | 476bcd4c53 | Fix import | 2020-06-20 15:55:57 +02:00
Matthew Honnibal | 7a846921a3 | Make spacy convert output docbin | 2020-06-20 15:55:35 +02:00
Matthew Honnibal | 0d22c6e006 | Allow DocBin to take list of Doc objects. | 2020-06-20 03:50:36 +02:00
Matthew Honnibal | 95df028758 | Update converters | 2020-06-20 03:50:23 +02:00
Matthew Honnibal | 3a73d95dcc | Update converter to produce DocBin | 2020-06-20 03:50:13 +02:00
Matthew Honnibal | d9a8fdf4b7 | Fix name | 2020-06-20 03:26:36 +02:00
Matthew Honnibal | e20a780867 | Fix naming | 2020-06-20 03:24:49 +02:00
Matthew Honnibal | f61d5e3ac3 | Move things around | 2020-06-20 03:23:58 +02:00
Matthew Honnibal | c630cfdb5e | Move converters under spacy.gold | 2020-06-20 03:20:34 +02:00
Matthew Honnibal | 161d8439fa | Start updating converters | 2020-06-20 03:19:40 +02:00
Matthew Honnibal | a79f0598a6 | Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow | 2020-06-20 02:36:40 +02:00
Matthew Honnibal | be81577719 | Fix oracles | 2020-06-20 02:36:12 +02:00
svlandeg | e30ec9b2a8 | fix test checking for variants | 2020-06-19 14:05:35 +02:00
svlandeg | 25b0674320 | clean up | 2020-06-19 11:31:01 +02:00
svlandeg | c705a28438 | add links to to_dict | 2020-06-19 11:22:24 +02:00
Matthew Honnibal | 03db143cd0 | Draft new GoldCorpus class | 2020-06-19 04:15:02 +02:00
Matthew Honnibal | a389866df6 | Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow | 2020-06-19 02:30:27 +02:00
Matthew Honnibal | bd29b7b14f | Update parser and NER gold stuff | 2020-06-19 02:29:16 +02:00
Matthew Honnibal | 5ae9e3480d | Return ArcEagerGoldParse from ArcEager | 2020-06-19 00:11:59 +02:00
svlandeg | 6ca6d7d6b4 | test for split sentences with various alignment issues, works | 2020-06-18 20:01:02 +02:00
svlandeg | 1951921230 | implement split_sent with aligned SENT_START attribute | 2020-06-18 19:41:53 +02:00
svlandeg | d1d6f16776 | fix the fix | 2020-06-18 19:15:32 +02:00
svlandeg | e822367cf7 | prevent writing dummy values like deps because that could interfer with sent_start values | 2020-06-18 17:47:59 +02:00
svlandeg | 0b6d45eae1 | various small fixes | 2020-06-18 15:55:00 +02:00
svlandeg | 1c71f2310c | fix renames and simple_ner labels | 2020-06-18 15:33:28 +02:00
svlandeg | 64fc840a5d | bugfix tok2vec | 2020-06-18 15:24:40 +02:00
svlandeg | 01f9ae774c | small fixes | 2020-06-18 14:01:19 +02:00
svlandeg | 0c6f1f3891 | fix BiluoPushDown parsing entities | 2020-06-18 13:00:03 +02:00
svlandeg | cd790aaa2a | fix parser tests to work with example (most still failing) | 2020-06-18 11:19:22 +02:00
svlandeg | 9f43ba839a | throw informative error when running the components with the wrong type of objects | 2020-06-18 10:36:05 +02:00
svlandeg | 6712d0b5db | textcat bugfix | 2020-06-18 10:09:56 +02:00
svlandeg | 40b2b21eef | small bug fix | 2020-06-17 23:33:51 +02:00
svlandeg | d6c4dd6eea | pipe() takes docs, not examples | 2020-06-17 21:29:36 +02:00
svlandeg | 0f123af35e | ensure test keeps working with non-linked entities | 2020-06-17 21:13:38 +02:00
svlandeg | 6d73e139b0 | fix entity linker | 2020-06-17 21:12:25 +02:00
svlandeg | be5934b827 | fix tagger | 2020-06-17 19:42:11 +02:00
svlandeg | 10d396977e | add support for MORPH in to/from_array, fix morphologizer overfitting test | 2020-06-17 17:48:07 +02:00
svlandeg | 1a151b10d6 | correct silly typo | 2020-06-17 14:48:14 +02:00
svlandeg | f6c451b650 | cleanup | 2020-06-17 14:45:54 +02:00
svlandeg | 2d9f406188 | fix test_cli | 2020-06-17 14:42:48 +02:00
svlandeg | f7ad8e8c83 | various fixes in scripts - needs to be further tested | 2020-06-17 12:05:58 +02:00
svlandeg | 3c4f9e4cc4 | fix augment (needs further testing) | 2020-06-17 10:46:29 +02:00
svlandeg | 4ed399c848 | minibatch utiltiy can deal with strings, docs or examples | 2020-06-16 21:35:55 +02:00
svlandeg | 8b66c11ff2 | add spaces to json output format | 2020-06-16 19:30:03 +02:00
svlandeg | ba80ad7efd | fixed some tests + WIP roundtrip unit test | 2020-06-16 18:26:50 +02:00
svlandeg | 43d41d6bb6 | allow None as BILUO annotation | 2020-06-16 15:30:05 +02:00
svlandeg | 44a0f9c2c8 | test_gold_biluo_different_tokenization works | 2020-06-16 15:21:20 +02:00
svlandeg | 1c35b8efcd | fix spaces | 2020-06-16 12:08:25 +02:00
svlandeg | 6fea5fa4bd | attempt to fix cases with weird spaces | 2020-06-16 11:52:29 +02:00
svlandeg | 0702a1d3fb | fix test for misaligned | 2020-06-15 23:10:47 +02:00
svlandeg | a28f8f369e | Fix many-to-one IOB codes | 2020-06-15 23:06:22 +02:00
svlandeg | 12886b787b | fixing NER one-to-many alignment | 2020-06-15 22:44:17 +02:00
Matthew Honnibal | a0bf73a5dd | Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow | 2020-06-15 18:16:01 +02:00
Matthew Honnibal | c66f93299e | Remove TokenAnnotation code from nonproj | 2020-06-15 18:14:47 +02:00
Matthew Honnibal | c95494739c | Fix import | 2020-06-15 18:11:10 +02:00
Matthew Honnibal | 8f978f2031 | Fix import | 2020-06-15 18:10:47 +02:00
Matthew Honnibal | 95de7efaad | Draft create_gold_state for arc_eager oracle | 2020-06-15 18:10:19 +02:00
svlandeg | 68986a252e | additional tests for new get_aligned function | 2020-06-15 17:42:40 +02:00
svlandeg | 41d29983a7 | start testing get_aligned | 2020-06-15 17:16:01 +02:00
svlandeg | fd5f199feb | fixing language and scoring tests | 2020-06-15 15:02:05 +02:00
svlandeg | b4d914ec77 | fix error catching | 2020-06-15 12:56:32 +02:00
svlandeg | b9c9cbb2cd | informative error when calling to_array with wrong field | 2020-06-15 11:53:31 +02:00