spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-01-27 18:46:01 +03:00

Sofie Van Landeghem 311133e579 Train textcat with config (#5143)	2020-03-29 19:40:36 +02:00
* bring back default build_text_classifier method
* remove _set_dims_ hack in favor of proper dim inference
* add tok2vec initialize to unit test
* small fixes
* add unit test for various textcat config settings
* logistic output layer does not have nO
* fix window_size setting
* proper fix
* fix W initialization
* Update textcat training example
* Use ml_datasets
* Convert training data to `Example` format
* Use `n_texts` to set proportionate dev size
* fix _init renaming on latest thinc
* avoid setting a non-existing dim
* update to thinc==8.0.0a2
* add BOW and CNN defaults for easy testing
* various experiments with train_textcat script, fix softmax activation in textcat bow
* allow textcat train script to work on other datasets as well
* have dataset as a parameter
* train textcat from config, with example config
* add config for training textcat
* formatting
* fix exclusive_classes
* fixing BOW for GPU
* bump thinc to 8.0.0a3 (not published yet so CI will fail)
* add in link_vectors_to_models which got deleted
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
cli	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
displacy	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
lang	Remove unicode declarations	2020-03-26 15:18:32 +01:00
matcher	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
ml	Train textcat with config (#5143)	2020-03-29 19:40:36 +02:00
pipeline	Train textcat with config (#5143)	2020-03-29 19:40:36 +02:00
syntax	Fix parser @ GPU (#5210)	2020-03-28 23:09:35 +01:00
tests	Train textcat with config (#5143)	2020-03-29 19:40:36 +02:00
tokens	bugfix in span similarity (#5155)	2020-03-29 13:56:07 +02:00
__init__.pxd	* Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags.	2014-10-24 02:23:42 +11:00
__init__.py	Simplify warnings	2020-02-28 12:20:23 +01:00
__main__.py	Update spaCy for thinc 8.0.0 (#4920)	2020-01-29 17:06:46 +01:00
_ml.py	take care of global vectors in multiprocessing (#5081)	2020-03-03 13:58:22 +01:00
about.py	Bugfix linking vectors (#5196)	2020-03-25 10:20:11 +01:00
analysis.py	Simplify warnings	2020-02-28 12:20:23 +01:00
attrs.pxd	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
attrs.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
compat.py	Merge branch 'develop' into refactor/remove-symlinks	2020-02-18 17:22:20 +01:00
errors.py	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
glossary.py	Tidy up and auto-format	2020-02-18 15:38:18 +01:00
gold.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
gold.pyx	Check whether doc is instantiated in Example.get_gold_parses() (#5167)	2020-03-29 13:57:00 +02:00
kb.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
kb.pyx	Merge branch 'develop' into refactor/simplify-warnings	2020-03-04 16:38:55 +01:00
language.py	Fix argument	2020-03-26 14:09:02 +01:00
lemmatizer.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
lexeme.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
lexeme.pyx	Simplify warnings	2020-02-28 12:20:23 +01:00
lookups.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
morphology.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
morphology.pyx	Fix small errors	2020-03-26 13:47:31 +01:00
parts_of_speech.pxd	Add support for Universal Dependencies v2.0	2017-03-03 13:17:34 +01:00
parts_of_speech.pyx	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
schemas.py	Add sent_start to pattern schema	2020-03-26 14:05:40 +01:00
scorer.py	Fix GoldParse init when token count differs (#5191)	2020-03-26 10:46:23 +01:00
strings.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
strings.pyx	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
structs.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
symbols.pxd	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
symbols.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
tokenizer.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
tokenizer.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
typedefs.pxd	Update spaCy for thinc 8.0.0 (#4920)	2020-01-29 17:06:46 +01:00
typedefs.pyx	Tidy up rest	2017-10-27 21:07:59 +02:00
util.py	Tok2Vec: extract-embed-encode (#5102)	2020-03-08 13:23:18 +01:00
vectors.pyx	Merge branch 'master' into tmp/sync	2020-03-26 13:38:14 +01:00
vocab.pxd	Tidy up compiler flags and imports (#5071)	2020-03-02 11:48:10 +01:00
vocab.pyx	Tidy up and auto-format	2020-02-18 15:38:18 +01:00