Matthew Honnibal
60d4e5a9e0
WIP on updating transition-system
2020-06-14 17:22:14 +02:00
Matthew Honnibal
7d65615625
WIP start excising GoldParse
2020-06-14 17:11:41 +02:00
Matthew Honnibal
4362ec7084
Hack Language.evaluate
2020-06-13 23:37:42 +02:00
Matthew Honnibal
7de997c0a5
Update test
2020-06-13 23:11:45 +02:00
Matthew Honnibal
8f941ef527
Update GoldParse
2020-06-13 23:11:29 +02:00
Matthew Honnibal
3a0bbcfb4c
Add biluo_tags_from_doc function
2020-06-13 23:10:54 +02:00
Matthew Honnibal
caa7508725
Draft missing NewExample stuff
2020-06-13 23:10:21 +02:00
Matthew Honnibal
3eb8f3867e
Update test
2020-06-13 23:05:16 +02:00
Arvind Srinivasan
aa5b40fa64
Added Tamil Example Sentences (#5583)
* Added Examples for Tamil Sentences
#### Description
This PR adds example sentences for the Tamil language, which were missing as per issue #1107
#### Type of Change
This is an enhancement.
* Accepting spaCy Contributor Agreement
* Signed on my behalf as an individual
2020-06-13 15:56:26 +02:00
Matthew Honnibal
5564314d32
Suggest approach for GoldParse
2020-06-13 15:43:35 +02:00
Matthew Honnibal
b078b05ecd
Handle various data better in NewExample
2020-06-13 15:30:12 +02:00
svlandeg
face0de74f
fix MORPH conversion + enable unit test
2020-06-12 16:29:09 +02:00
svlandeg
a5ee082da1
cats bugfix
2020-06-12 15:49:38 +02:00
svlandeg
880dccf93e
entities on doc_annotation, parse links and check their offsets against the entities. unit test works
2020-06-12 15:47:20 +02:00
theudas
3f5e2f9d99
Added Parameter to NEL to take n sentences into account (#5548)
* added setting for neighbour sentence in NEL
* added spaCy contributor agreement
* added multi sentence also for training
* made the try-except block smaller
2020-06-12 15:15:03 +02:00
adrianeboyd
4724fa4cf4
Expand Japanese requirements warning (#5572)
Include explicit install instructions in Japanese requirements warning.
2020-06-12 15:14:55 +02:00
adrianeboyd
44967a3f9c
Update pytest conf for sudachipy with Japanese (#5574)
2020-06-12 15:14:47 +02:00
svlandeg
3aed177a35
fix ENT_IOB conversion and enable unit test
2020-06-12 11:30:24 +02:00
Matthew Honnibal
a1c5b694be
Small fixes to train defaults
2020-06-12 02:22:13 +02:00
theudas
fa46e0bef2
Added Parameter to NEL to take n sentences into account (#5548)
* added setting for neighbour sentence in NEL
* added spaCy contributor agreement
* added multi sentence also for training
* made the try-except block smaller
2020-06-12 02:03:23 +02:00
Sofie Van Landeghem
c0f4a1e43b
train is from-config by default (#5575)
* verbose and tag_map options
* adding init_tok2vec option and only changing the tok2vec that is specified
* adding omit_extra_lookups and verifying textcat config
* wip
* pretrain bugfix
* add replace and resume options
* train_textcat fix
* raw text functionality
* improve UX when KeyError or when input data can't be parsed
* avoid unnecessary access to goldparse in TextCat pipe
* save performance information in nlp.meta
* add noise_level to config
* move nn_parser's defaults to config file
* multitask in config - doesn't work yet
* scorer offering both F and AUC options, need to be specified in config
* add textcat verification code from old train script
* small fixes to config files
* clean up
* set default config for ner/parser to allow create_pipe to work as before
* two more test fixes
* small fixes
* cleanup
* fix NER pickling + additional unit test
* create_pipe as before
2020-06-12 02:02:07 +02:00
svlandeg
6a67a11682
adding tests for new example class (some still failing - WIP)
2020-06-11 17:43:40 +02:00
Sofie Van Landeghem
18c6dc8093
removing label both on comment and on close
2020-06-11 14:09:40 +02:00
adrianeboyd
556895177e
Expand Japanese requirements warning (#5572)
Include explicit install instructions in Japanese requirements warning.
2020-06-11 13:47:37 +02:00
adrianeboyd
fe167fcf7d
Update pytest conf for sudachipy with Japanese (#5574)
2020-06-11 10:23:50 +02:00
Jones Martins
bab30e4ad2
Add "c'mon" token exception (#5570)
* Add "c'mon" exception
* Fix typo in "C'mon" exception
2020-06-10 21:54:06 +02:00
Jones Martins
28db7dd5d9
Add missing pronouns/determiners (#5569)
* Add missing pronouns/determiners
* Add test for missing pronouns
* Add contributor file
2020-06-10 18:47:04 +02:00
Sofie Van Landeghem
12c1965070
set delay to 7 days
2020-06-10 10:46:12 +02:00
Matthew Honnibal
488727aee0
Start updating test
2020-06-09 23:58:28 +02:00
Matthew Honnibal
337d2b5ad6
Fix sent start in NewExample
2020-06-09 23:58:16 +02:00
Matthew Honnibal
ad547a4b8f
Refactor towards new Example class
2020-06-09 23:39:46 +02:00
Matthew Honnibal
82810b9846
Update morphologizer
2020-06-09 23:32:07 +02:00
Matthew Honnibal
af1b5f129b
Use new example class in GoldCorpus
2020-06-09 23:31:19 +02:00
Matthew Honnibal
0714f1fa5c
Remove the 'pass example into __call__' thing
2020-06-09 23:30:06 +02:00
Matthew Honnibal
b3868cd1f8
Update NewExample
2020-06-09 23:06:48 +02:00
Matthew Honnibal
ccd332a9fc
Update test stubs
2020-06-09 15:49:04 +02:00
adrianeboyd
0a70bd6281
Bump version to 2.3.0.dev1 (#5567)
2020-06-09 15:47:31 +02:00
Matthew Honnibal
a20ac36bb7
Compile new modules
2020-06-09 15:44:17 +02:00
Matthew Honnibal
04569c0b3e
Fix import
2020-06-09 15:44:08 +02:00
Matthew Honnibal
f4caaa8ad9
Update alignment
2020-06-09 15:43:57 +02:00
Matthew Honnibal
b5ef397639
Add header for align.pxd
2020-06-09 15:43:48 +02:00
Matthew Honnibal
793092d2d8
Fix renaming in GoldCorpus
2020-06-09 15:43:38 +02:00
Matthew Honnibal
36d49a0f13
Fix NewExample class
2020-06-09 15:43:19 +02:00
Matthew Honnibal
f1189dc205
Draft tests for new Example class
2020-06-09 15:43:08 +02:00
Matthew Honnibal
c833ebe1ad
Start tests for new example class
2020-06-09 15:29:05 +02:00
Matthew Honnibal
453cfa14d0
Start drafting new example class
2020-06-09 15:28:42 +02:00
Matthew Honnibal
449000c234
Fix gold_io
2020-06-09 12:43:53 +02:00
Matthew Honnibal
cb08ce3936
Move alignment into Cython
2020-06-09 12:40:41 +02:00
Matthew Honnibal
20a1bdb298
Fix train
2020-06-09 12:33:29 +02:00
Matthew Honnibal
549164c31c
Fix corpus when no raw text supplied
2020-06-09 12:33:14 +02:00