svlandeg
43d41d6bb6
allow None as BILUO annotation
2020-06-16 15:30:05 +02:00
svlandeg
44a0f9c2c8
test_gold_biluo_different_tokenization works
2020-06-16 15:21:20 +02:00
svlandeg
1c35b8efcd
fix spaces
2020-06-16 12:08:25 +02:00
svlandeg
6fea5fa4bd
attempt to fix cases with weird spaces
2020-06-16 11:52:29 +02:00
svlandeg
0702a1d3fb
fix test for misaligned
2020-06-15 23:10:47 +02:00
svlandeg
a28f8f369e
Fix many-to-one IOB codes
2020-06-15 23:06:22 +02:00
svlandeg
12886b787b
fixing NER one-to-many alignment
2020-06-15 22:44:17 +02:00
Matthew Honnibal
a0bf73a5dd
Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow
2020-06-15 18:16:01 +02:00
Matthew Honnibal
c66f93299e
Remove TokenAnnotation code from nonproj
2020-06-15 18:14:47 +02:00
Matthew Honnibal
c95494739c
Fix import
2020-06-15 18:11:10 +02:00
Matthew Honnibal
8f978f2031
Fix import
2020-06-15 18:10:47 +02:00
Matthew Honnibal
95de7efaad
Draft create_gold_state for arc_eager oracle
2020-06-15 18:10:19 +02:00
svlandeg
68986a252e
additional tests for new get_aligned function
2020-06-15 17:42:40 +02:00
svlandeg
41d29983a7
start testing get_aligned
2020-06-15 17:16:01 +02:00
svlandeg
fd5f199feb
fixing language and scoring tests
2020-06-15 15:02:05 +02:00
svlandeg
b4d914ec77
fix error catching
2020-06-15 12:56:32 +02:00
svlandeg
b9c9cbb2cd
informative error when calling to_array with wrong field
2020-06-15 11:53:31 +02:00
svlandeg
ff231e1cdd
fix merge conflict
2020-06-15 09:04:19 +02:00
svlandeg
a48553c1ed
fix error numbers
2020-06-15 08:51:31 +02:00
Matthew Honnibal
3c0fc10dc4
Remove beam for now (maybe)
Remove beam_utils
Update setup.py
Remove beam
2020-06-14 19:53:29 +02:00
Matthew Honnibal
98ca14f577
Remove GoldParse
WIP on removing goldparse
Get ArcEager compiling after GoldParse excise
Update setup.py
Get spacy.syntax compiling after removing GoldParse
Rename NewExample -> Example and clean up
Clean html files
Start updating tests
Update Morphologizer
2020-06-14 19:53:30 +02:00
Matthew Honnibal
d53723aa4f
Merge from whatif/arrow
2020-06-14 17:43:59 +02:00
Matthew Honnibal
380cce9d8b
Update errors
2020-06-14 17:40:05 +02:00
Matthew Honnibal
706e652820
Merge from develop
2020-06-14 17:35:01 +02:00
Matthew Honnibal
9296d71a54
More GoldParse excise
2020-06-14 17:26:54 +02:00
Matthew Honnibal
60d4e5a9e0
WIP on updating transition-system
2020-06-14 17:22:14 +02:00
Matthew Honnibal
7d65615625
WIP start excising GoldParse
2020-06-14 17:11:41 +02:00
Matthew Honnibal
4362ec7084
Hack Language.evaluate
2020-06-13 23:37:42 +02:00
Matthew Honnibal
7de997c0a5
Update test
2020-06-13 23:11:45 +02:00
Matthew Honnibal
8f941ef527
Update GoldParse
2020-06-13 23:11:29 +02:00
Matthew Honnibal
3a0bbcfb4c
Add biluo_tags_from_doc function
2020-06-13 23:10:54 +02:00
Matthew Honnibal
caa7508725
Draft missing NewExample stuff
2020-06-13 23:10:21 +02:00
Matthew Honnibal
3eb8f3867e
Update test
2020-06-13 23:05:16 +02:00
Matthew Honnibal
5564314d32
Suggest approach for GoldParse
2020-06-13 15:43:35 +02:00
Matthew Honnibal
b078b05ecd
Handle various data better in NewExample
2020-06-13 15:30:12 +02:00
svlandeg
face0de74f
fix MORPH conversion + enable unit test
2020-06-12 16:29:09 +02:00
svlandeg
a5ee082da1
cats bugfix
2020-06-12 15:49:38 +02:00
svlandeg
880dccf93e
entities on doc_annotation, parse links and check their offsets against the entities. unit test works
2020-06-12 15:47:20 +02:00
svlandeg
3aed177a35
fix ENT_IOB conversion and enable unit test
2020-06-12 11:30:24 +02:00
Matthew Honnibal
a1c5b694be
Small fixes to train defaults
2020-06-12 02:22:13 +02:00
Sofie Van Landeghem
c0f4a1e43b
train is from-config by default (#5575)
* verbose and tag_map options
* adding init_tok2vec option and only changing the tok2vec that is specified
* adding omit_extra_lookups and verifying textcat config
* wip
* pretrain bugfix
* add replace and resume options
* train_textcat fix
* raw text functionality
* improve UX when KeyError or when input data can't be parsed
* avoid unnecessary access to goldparse in TextCat pipe
* save performance information in nlp.meta
* add noise_level to config
* move nn_parser's defaults to config file
* multitask in config - doesn't work yet
* scorer offering both F and AUC options, need to be specified in config
* add textcat verification code from old train script
* small fixes to config files
* clean up
* set default config for ner/parser to allow create_pipe to work as before
* two more test fixes
* small fixes
* cleanup
* fix NER pickling + additional unit test
* create_pipe as before
2020-06-12 02:02:07 +02:00
svlandeg
6a67a11682
adding tests for new example class (some still failing - WIP)
2020-06-11 17:43:40 +02:00
Matthew Honnibal
488727aee0
Start updating test
2020-06-09 23:58:28 +02:00
Matthew Honnibal
337d2b5ad6
Fix sent start in NewExample
2020-06-09 23:58:16 +02:00
Matthew Honnibal
ad547a4b8f
Refactor towards new Example class
2020-06-09 23:39:46 +02:00
Matthew Honnibal
82810b9846
Update morphologizer
2020-06-09 23:32:07 +02:00
Matthew Honnibal
af1b5f129b
Use new example class in GoldCorpus
2020-06-09 23:31:19 +02:00
Matthew Honnibal
0714f1fa5c
Remove the 'pass example into __call__' thing
2020-06-09 23:30:06 +02:00
Matthew Honnibal
b3868cd1f8
Update NewExample
2020-06-09 23:06:48 +02:00
Matthew Honnibal
ccd332a9fc
Update test stubs
2020-06-09 15:49:04 +02:00