spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-11-12 22:05:52 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	e90341810c	Update arc_eager oracle	2020-06-21 01:04:02 +02:00
Matthew Honnibal	c58deb3546	Work on parser oracle	2020-06-21 01:01:09 +02:00
Matthew Honnibal	6af99f2f2d	Fix parser declaration	2020-06-20 21:50:17 +02:00
Matthew Honnibal	52edb24f07	Update header	2020-06-20 21:50:06 +02:00
Matthew Honnibal	0c10831b14	Start debugging arc_eager oracle	2020-06-20 21:49:46 +02:00
Matthew Honnibal	2bcb5881d7	Fix parser model	2020-06-20 21:49:31 +02:00
Matthew Honnibal	b7a366b435	Fix compile in ArcEager	2020-06-20 15:56:16 +02:00
Matthew Honnibal	a79f0598a6	Merge branch 'whatif/arrow' of https://github.com/explosion/spaCy into whatif/arrow	2020-06-20 02:36:40 +02:00
Matthew Honnibal	be81577719	Fix oracles	2020-06-20 02:36:12 +02:00
svlandeg	25b0674320	clean up	2020-06-19 11:31:01 +02:00
Matthew Honnibal	bd29b7b14f	Update parser and NER gold stuff	2020-06-19 02:29:16 +02:00
Matthew Honnibal	5ae9e3480d	Return ArcEagerGoldParse from ArcEager	2020-06-19 00:11:59 +02:00
svlandeg	0c6f1f3891	fix BiluoPushDown parsing entities	2020-06-18 13:00:03 +02:00
svlandeg	cd790aaa2a	fix parser tests to work with example (most still failing)	2020-06-18 11:19:22 +02:00
svlandeg	9f43ba839a	throw informative error when running the components with the wrong type of objects	2020-06-18 10:36:05 +02:00
svlandeg	d6c4dd6eea	pipe() takes docs, not examples	2020-06-17 21:29:36 +02:00
Matthew Honnibal	c66f93299e	Remove TokenAnnotation code from nonproj	2020-06-15 18:14:47 +02:00
Matthew Honnibal	95de7efaad	Draft create_gold_state for arc_eager oracle	2020-06-15 18:10:19 +02:00
svlandeg	fd5f199feb	fixing language and scoring tests	2020-06-15 15:02:05 +02:00
Matthew Honnibal	3c0fc10dc4	Remove beam for now (maybe) Remove beam_utils Update setup.py Remove beam	2020-06-14 19:53:29 +02:00
Matthew Honnibal	98ca14f577	Remove GoldParse WIP on removing goldparse Get ArcEager compiling after GoldParse excise Update setup.py Get spacy.syntax compiling after removing GoldParse Rename NewExample -> Example and clean up Clean html files Start updating tests Update Morphologizer	2020-06-14 19:53:30 +02:00
Matthew Honnibal	d53723aa4f	Merge from whatif/arrow	2020-06-14 17:43:59 +02:00
Matthew Honnibal	706e652820	Merge from develop	2020-06-14 17:35:01 +02:00
Matthew Honnibal	9296d71a54	More GoldParse excise	2020-06-14 17:26:54 +02:00
Matthew Honnibal	60d4e5a9e0	WIP on updating transition-system	2020-06-14 17:22:14 +02:00
Matthew Honnibal	7d65615625	WIP start excising GoldParse	2020-06-14 17:11:41 +02:00
Matthew Honnibal	8f941ef527	Update GoldParse	2020-06-13 23:11:29 +02:00
Matthew Honnibal	5564314d32	Suggest approach for GoldParse	2020-06-13 15:43:35 +02:00
Sofie Van Landeghem	c0f4a1e43b	train is from-config by default (#5575) * verbose and tag_map options * adding init_tok2vec option and only changing the tok2vec that is specified * adding omit_extra_lookups and verifying textcat config * wip * pretrain bugfix * add replace and resume options * train_textcat fix * raw text functionality * improve UX when KeyError or when input data can't be parsed * avoid unnecessary access to goldparse in TextCat pipe * save performance information in nlp.meta * add noise_level to config * move nn_parser's defaults to config file * multitask in config - doesn't work yet * scorer offering both F and AUC options, need to be specified in config * add textcat verification code from old train script * small fixes to config files * clean up * set default config for ner/parser to allow create_pipe to work as before * two more test fixes * small fixes * cleanup * fix NER pickling + additional unit test * create_pipe as before	2020-06-12 02:02:07 +02:00
Matthew Honnibal	04569c0b3e	Fix import	2020-06-09 15:44:08 +02:00
Matthew Honnibal	d9289712ba	* Make GoldCorpus return dict, not Example * Make Example require a Doc object (previously optional) Clarify methods in GoldCorpus WIP refactor Example Refactor Example.split_sents Fix test Fix augment Update test Update test Fix import Update test_scorer Update Example	2020-06-09 01:01:59 +02:00
Matthew Honnibal	084271c9e9	Remove GoldParse from public API * Move get_parses_from_example to spacy.syntax * Get GoldParse out of Example * Avoid expecting GoldParse input in parser * Add Alignment to spacy.gold.align * Update Example object * Add comment * Update pipeline * Fix imports * Simplify gold_io * WIP on GoldCorpus * Update test * Xfail some gold tests * Remove ignore_misaligned option from GoldCorpus * Fix Example constructor * Update test * Fix usage of Example * Add deprecated_get_gold method on Example * Patch scorer * Fix test * Fix test * Update tests * Xfail a test * Fix passing of make_projective * Pass make_projective by default * Hack data format in Example.from_dict * Update tests * Fix example.from_dict * Update morphologizer * Fix entity linker * Add get_field to TokenAnnotation * Fix Example.get_aligned * Update test * Fix alignment * Fix corpus * Fix GoldCorpus * Handle misaligned * Format * Fix missing import	2020-06-08 22:09:57 +02:00
Matthew Honnibal	6e87ca1f45	Fix imports	2020-06-06 15:36:58 +02:00
Matthew Honnibal	53b00991fd	Fix imports	2020-06-06 15:36:46 +02:00
Matthew Honnibal	7b873ce2b1	Move GoldParse under spacy.syntax	2020-06-06 15:09:43 +02:00
Ines Montani	810fce3bb1	Merge branch 'develop' into master-tmp	2020-06-03 14:36:59 +02:00
Matthw Honnibal	bc94fdabd0	Fix begin_training	2020-05-21 20:46:21 +02:00
Matthw Honnibal	df87c32a40	Pass smaller doc sample into model initialize	2020-05-21 20:17:24 +02:00
Matthw Honnibal	f075655deb	Fix shape inference in begin_training	2020-05-21 19:26:29 +02:00
Ines Montani	24f72c669c	Merge branch 'develop' into master-tmp	2020-05-21 18:39:06 +02:00
adrianeboyd	9393253b66	Remove peeking from Parser.begin_training (#5456) Inspect all instances in `Parser.begin_training` rather than only the first 1000.	2020-05-20 15:18:06 +02:00
Matthw Honnibal	24efd54a42	Merge from develop	2020-05-20 12:27:31 +02:00
Matthew Honnibal	333b1a308b	Adapt parser and NER for transformers (#5449) * Draft layer for BILUO actions * Fixes to biluo layer * WIP on BILUO layer * Add tests for BILUO layer * Format * Fix transitions * Update test * Link in the simple_ner * Update BILUO tagger * Update __init__ * Import simple_ner * Update test * Import * Add files * Add config * Fix label passing for BILUO and tagger * Fix label handling for simple_ner component * Update simple NER test * Update config * Hack train script * Update BILUO layer * Fix SimpleNER component * Update train_from_config * Add biluo_to_iob helper * Add IOB layer * Add IOBTagger model * Update biluo layer * Update SimpleNER tagger * Update BILUO * Read random seed in train-from-config * Update use of normal_init * Fix normalization of gradient in SimpleNER * Update IOBTagger * Remove print * Tweak masking in BILUO * Add dropout in SimpleNER * Update thinc * Tidy up simple_ner * Fix biluo model * Unhack train-from-config * Update setup.cfg and requirements * Add tb_framework.py for parser model * Try to avoid memory leak in BILUO * Move ParserModel into spacy.ml, avoid need for subclass. * Use updated parser model * Remove incorrect call to model.initializre in PrecomputableAffine * Update parser model * Avoid divide by zero in tagger * Add extra dropout layer in tagger * Refine minibatch_by_words function to avoid oom * Fix parser model after refactor * Try to avoid div-by-zero in SimpleNER * Fix infinite loop in minibatch_by_words * Use SequenceCategoricalCrossentropy in Tagger * Fix parser model when hidden layer * Remove extra dropout from tagger * Add extra nan check in tagger * Fix thinc version * Update tests and imports * Fix test * Update test * Update tests * Fix tests * Fix test Co-authored-by: Ines Montani <ines@ines.io>	2020-05-18 22:23:33 +02:00
Matthew Honnibal	6918d99b6c	Improve GPU usage for train-with-config (#5330) * Adjust for no ops in Optimizer * Fix gpu in train-from-config * Update train-from-config script * Fix parser * Fix GPU efficiency of padding backprop	2020-04-20 22:06:28 +02:00
Sofie Van Landeghem	1f9852abc3	Fix parser @ GPU (#5210) * ensure self.bias is numpy array in parser model * 2 more little bug fixes for parser on GPU * removing testing GPU statement * remove commented code	2020-03-28 23:09:35 +01:00
Sofie Van Landeghem	9b412516e7	Fixing pickling of the parser (#5218) * fix __reduce__ for pickling parser * setting the move object as 'state' during pickling * unskip test_issue4725 - works again	2020-03-27 19:35:26 +01:00
Sofie Van Landeghem	218e1706ac	Bugfix linking vectors (#5196) * restore call to _load_vectors * bump to thinc 8.0.0a3 * bump to 3.0.0.dev4	2020-03-25 10:20:11 +01:00
Ines Montani	b0cfab317f	Merge branch 'develop' into refactor/simplify-warnings	2020-03-04 16:38:55 +01:00
Sofie Van Landeghem	a0998868ff	prevent updating cfg if the Model was already defined (#5078)	2020-03-03 13:58:56 +01:00
Ines Montani	648f61d077	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00

924 Commits