Commit Graph

945 Commits

Author SHA1 Message Date
svlandeg
1720c58287 Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow 2020-06-22 15:15:29 +02:00
svlandeg
bf819ba302 Merge remote-tracking branch 'upstream/develop' into whatif/arrow
# Conflicts:
#	spacy/cli/train.py
#	spacy/gold.pyx
#	spacy/ml/models/multi_task.py
#	spacy/ml/models/simple_ner.py
#	spacy/ml/models/textcat.py
#	spacy/ml/models/tok2vec.py
#	spacy/pipeline/pipes.pyx
#	spacy/pipeline/simple_ner.py
#	spacy/scorer.py
#	spacy/tests/parser/test_add_label.py
#	spacy/tests/parser/test_nn_beam.py
#	spacy/tests/pipeline/test_morphologizer.py
#	spacy/tests/test_scorer.py
#	spacy/tests/test_util.py
#	spacy/util.py
2020-06-22 15:15:20 +02:00
Matthew Honnibal
ad50c8baca Add missing costs to NER oracle 2020-06-22 14:30:08 +02:00
Matthew Honnibal
79288e7110 Merge from remote 2020-06-22 00:58:18 +02:00
Matthew Honnibal
6e4d486b1e Debugging 2020-06-22 00:54:38 +02:00
Matthew Honnibal
e9860daf4b Update ArcEager oracle
Fix Break oracle
2020-06-22 00:54:38 +02:00
Matthew Honnibal
ecf192aa70 Use get_aligned_parse in ArcEager 2020-06-22 00:54:38 +02:00
Matthew Honnibal
87f5348e17 Update nonproj 2020-06-22 00:54:38 +02:00
Matthew Honnibal
5ca4c19ef2 Work on parser oracle
Update arc_eager oracle

Restore ArcEager.get_cost function

Update transition system
2020-06-22 00:54:38 +02:00
Matthew Honnibal
2efe01bf26 Fix parser declaration 2020-06-22 00:54:38 +02:00
Matthew Honnibal
29d39d8a34 Update header 2020-06-22 00:54:38 +02:00
Matthew Honnibal
456e27dc8b Start debugging arc_eager oracle 2020-06-22 00:54:38 +02:00
Matthew Honnibal
b60eede321 Fix parser model 2020-06-22 00:54:38 +02:00
svlandeg
a427ca9355 clean up 2020-06-22 00:46:08 +02:00
Matthew Honnibal
39117de4f9 Fix compile in ArcEager 2020-06-22 00:46:08 +02:00
Matthew Honnibal
7544c21f5b Update transition system 2020-06-21 01:12:05 +02:00
Matthew Honnibal
318a046fb0 Restore ArcEager.get_cost function 2020-06-21 01:11:08 +02:00
Matthew Honnibal
e90341810c Update arc_eager oracle 2020-06-21 01:04:02 +02:00
Matthew Honnibal
c58deb3546 Work on parser oracle 2020-06-21 01:01:09 +02:00
svlandeg
5cb812e0ab fix NER warn empty lookups (cf PR #5588) 2020-06-20 22:04:18 +02:00
Matthew Honnibal
6af99f2f2d Fix parser declaration 2020-06-20 21:50:17 +02:00
Matthew Honnibal
52edb24f07 Update header 2020-06-20 21:50:06 +02:00
Matthew Honnibal
0c10831b14 Start debugging arc_eager oracle 2020-06-20 21:49:46 +02:00
Matthew Honnibal
2bcb5881d7 Fix parser model 2020-06-20 21:49:31 +02:00
Matthew Honnibal
b7a366b435 Fix compile in ArcEager 2020-06-20 15:56:16 +02:00
Ines Montani
52728d8fa3 Merge branch 'develop' into master-tmp 2020-06-20 15:52:00 +02:00
Matthew Honnibal
a79f0598a6 Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow 2020-06-20 02:36:40 +02:00
Matthew Honnibal
be81577719 Fix oracles 2020-06-20 02:36:12 +02:00
svlandeg
25b0674320 clean up 2020-06-19 11:31:01 +02:00
Matthew Honnibal
bd29b7b14f Update parser and NER gold stuff 2020-06-19 02:29:16 +02:00
Matthew Honnibal
5ae9e3480d Return ArcEagerGoldParse from ArcEager 2020-06-19 00:11:59 +02:00
svlandeg
0c6f1f3891 fix BiluoPushDown parsing entities 2020-06-18 13:00:03 +02:00
svlandeg
cd790aaa2a fix parser tests to work with example (most still failing) 2020-06-18 11:19:22 +02:00
svlandeg
9f43ba839a throw informative error when running the components with the wrong type of objects 2020-06-18 10:36:05 +02:00
svlandeg
d6c4dd6eea pipe() takes docs, not examples 2020-06-17 21:29:36 +02:00
Matthew Honnibal
c66f93299e Remove TokenAnnotation code from nonproj 2020-06-15 18:14:47 +02:00
Matthew Honnibal
95de7efaad Draft create_gold_state for arc_eager oracle 2020-06-15 18:10:19 +02:00
svlandeg
fd5f199feb fixing language and scoring tests 2020-06-15 15:02:05 +02:00
Adriane Boyd
c482f20778 Fix and add warnings related to spacy-lookups-data (#5588)
* Fix warning message for lemmatization tables

* Add a warning when the `lexeme_norm` table is empty. (Given the
relatively lang-specific loading for `Lookups`, it seemed like too much
overhead to dynamically extract the list of languages, so for now it's
hard-coded.)
2020-06-15 14:56:04 +02:00
Matthew Honnibal
3c0fc10dc4 Remove beam for now (maybe)
Remove beam_utils

Update setup.py

Remove beam
2020-06-14 19:53:29 +02:00
Matthew Honnibal
98ca14f577 Remove GoldParse
WIP on removing goldparse

Get ArcEager compiling after GoldParse excise

Update setup.py

Get spacy.syntax compiling after removing GoldParse

Rename NewExample -> Example and clean up

Clean html files

Start updating tests

Update Morphologizer
2020-06-14 19:53:30 +02:00
Matthew Honnibal
d53723aa4f Merge from whatif/arrow 2020-06-14 17:43:59 +02:00
Matthew Honnibal
706e652820 Merge from develop 2020-06-14 17:35:01 +02:00
Matthew Honnibal
9296d71a54 More GoldParse excise 2020-06-14 17:26:54 +02:00
Matthew Honnibal
60d4e5a9e0 WIP on updating transition-system 2020-06-14 17:22:14 +02:00
Matthew Honnibal
7d65615625 WIP start excising GoldParse 2020-06-14 17:11:41 +02:00
Matthew Honnibal
8f941ef527 Update GoldParse 2020-06-13 23:11:29 +02:00
Matthew Honnibal
5564314d32 Suggest approach for GoldParse 2020-06-13 15:43:35 +02:00
Sofie Van Landeghem
c0f4a1e43b train is from-config by default (#5575)
* verbose and tag_map options

* adding init_tok2vec option and only changing the tok2vec that is specified

* adding omit_extra_lookups and verifying textcat config

* wip

* pretrain bugfix

* add replace and resume options

* train_textcat fix

* raw text functionality

* improve UX when KeyError or when input data can't be parsed

* avoid unnecessary access to goldparse in TextCat pipe

* save performance information in nlp.meta

* add noise_level to config

* move nn_parser's defaults to config file

* multitask in config - doesn't work yet

* scorer offering both F and AUC options, need to be specified in config

* add textcat verification code from old train script

* small fixes to config files

* clean up

* set default config for ner/parser to allow create_pipe to work as before

* two more test fixes

* small fixes

* cleanup

* fix NER pickling + additional unit test

* create_pipe as before
2020-06-12 02:02:07 +02:00
Matthew Honnibal
04569c0b3e Fix import 2020-06-09 15:44:08 +02:00