11780 Commits

Author SHA1 Message Date
Matthew Honnibal
e90341810c Update arc_eager oracle 2020-06-21 01:04:02 +02:00
Matthew Honnibal
c58deb3546 Work on parser oracle 2020-06-21 01:01:09 +02:00
Matthew Honnibal
914924a68b Fix mimport 2020-06-20 22:22:40 +02:00
Matthew Honnibal
2791c1c0dc Fix module name of corpus 2020-06-20 22:22:14 +02:00
Matthew Honnibal
4bbc277758 Update after removing GoldCorpus 2020-06-20 22:21:24 +02:00
Matthew Honnibal
64d00520e2 Update imports 2020-06-20 22:21:08 +02:00
Matthew Honnibal
cfd024536d Remove GoldCorpus 2020-06-20 22:13:37 +02:00
Matthew Honnibal
fd83551eb5 Skip test causing segfault 2020-06-20 22:11:27 +02:00
Matthew Honnibal
095710e40e Skip tests that cause crashes 2020-06-20 22:02:32 +02:00
Matthew Honnibal
0b23fd3891 Xfail some tests 2020-06-20 21:52:57 +02:00
Matthew Honnibal
6af99f2f2d Fix parser declaration 2020-06-20 21:50:17 +02:00
Matthew Honnibal
52edb24f07 Update header 2020-06-20 21:50:06 +02:00
Matthew Honnibal
0c10831b14 Start debugging arc_eager oracle 2020-06-20 21:49:46 +02:00
Matthew Honnibal
2bcb5881d7 Fix parser model 2020-06-20 21:49:31 +02:00
Matthew Honnibal
396dd60b3a Fix Corpus 2020-06-20 21:49:15 +02:00
Matthew Honnibal
450c6fe39c Update train.py 2020-06-20 21:49:06 +02:00
Matthew Honnibal
6d821b2e55 Make doc.from_array several times faster 2020-06-20 20:17:13 +02:00
Matthew Honnibal
fa86aa581d Allocate Doc before starting to add words 2020-06-20 20:15:21 +02:00
Matthew Honnibal
652f31d3ee Update DocBin 2020-06-20 20:12:54 +02:00
Matthew Honnibal
0a8b6631a2 Update Corpus 2020-06-20 20:12:31 +02:00
Matthew Honnibal
11fa0658f7 Work on train script 2020-06-20 20:12:19 +02:00
Matthew Honnibal
0de361cd00 Draft Corpus class for DocBin 2020-06-20 18:31:07 +02:00
Matthew Honnibal
7360d3db72 Add json2docs converter 2020-06-20 16:02:53 +02:00
Matthew Honnibal
f1756a6a22 Remove jsonl converter 2020-06-20 16:02:40 +02:00
Matthew Honnibal
5d89b1840e Update converter 2020-06-20 16:00:14 +02:00
Matthew Honnibal
f5780cb160 Serialize all attrs by default 2020-06-20 15:59:39 +02:00
Matthew Honnibal
3241acbe0b Fix import 2020-06-20 15:56:28 +02:00
Matthew Honnibal
b7a366b435 Fix compile in ArcEager 2020-06-20 15:56:16 +02:00
Matthew Honnibal
91fa2f1126 Fix docbin 2020-06-20 15:56:05 +02:00
Matthew Honnibal
476bcd4c53 Fix import 2020-06-20 15:55:57 +02:00
Matthew Honnibal
7a846921a3 Make spacy convert output docbin 2020-06-20 15:55:35 +02:00
Matthew Honnibal
0d22c6e006 Allow DocBin to take list of Doc objects. 2020-06-20 03:50:36 +02:00
Matthew Honnibal
95df028758 Update converters 2020-06-20 03:50:23 +02:00
Matthew Honnibal
3a73d95dcc Update converter to produce DocBin 2020-06-20 03:50:13 +02:00
Matthew Honnibal
d9a8fdf4b7 Fix name 2020-06-20 03:26:36 +02:00
Matthew Honnibal
e20a780867 Fix naming 2020-06-20 03:24:49 +02:00
Matthew Honnibal
f61d5e3ac3 Move things around 2020-06-20 03:23:58 +02:00
Matthew Honnibal
c630cfdb5e Move converters under spacy.gold 2020-06-20 03:20:34 +02:00
Matthew Honnibal
161d8439fa Start updating converters 2020-06-20 03:19:40 +02:00
Matthew Honnibal
a79f0598a6 Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow 2020-06-20 02:36:40 +02:00
Matthew Honnibal
be81577719 Fix oracles 2020-06-20 02:36:12 +02:00
svlandeg
e30ec9b2a8 fix test checking for variants 2020-06-19 14:05:35 +02:00
svlandeg
25b0674320 clean up 2020-06-19 11:31:01 +02:00
svlandeg
c705a28438 add links to to_dict 2020-06-19 11:22:24 +02:00
Matthew Honnibal
03db143cd0 Draft new GoldCorpus class 2020-06-19 04:15:02 +02:00
Matthew Honnibal
a389866df6 Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow 2020-06-19 02:30:27 +02:00
Matthew Honnibal
bd29b7b14f Update parser and NER gold stuff 2020-06-19 02:29:16 +02:00
Matthew Honnibal
5ae9e3480d Return ArcEagerGoldParse from ArcEager 2020-06-19 00:11:59 +02:00
svlandeg
6ca6d7d6b4 test for split sentences with various alignment issues, works 2020-06-18 20:01:02 +02:00
svlandeg
1951921230 implement split_sent with aligned SENT_START attribute 2020-06-18 19:41:53 +02:00