spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-09-21 19:39:13 +03:00

Author	SHA1	Message	Date
ines	2dca9e71a1	Add notes on catastrophic forgetting (see #1496)	2017-11-06 13:17:02 +01:00
Matthew Honnibal	e033162a1d	Update tagger training example	2017-11-01 21:49:08 +01:00
ines	8f1d3fc3ee	Update textcat example	2017-11-01 17:09:22 +01:00
Matthew Honnibal	dad8f09fba	Fix print statements in text classifier example	2017-11-01 16:34:31 +01:00
ines	bfe17b7df1	Fix begin_training if get_gold_tuples is None	2017-11-01 13:14:31 +01:00
ines	4b196fdf7f	Fix formatting	2017-11-01 00:43:22 +01:00
ines	33af6ac69a	Use even smaller example size: 100 was still too much, so try 20 instead	2017-10-30 19:46:45 +01:00
ines	f02b0af821	Fix path and use smaller example size: 500 was too large and caused laggy rendering	2017-10-30 19:44:35 +01:00
ines	18dde7869a	Update training data docs and add vocab JSONL	2017-10-30 19:40:05 +01:00
ines	b5643d8575	Update intent parser docs and add to usage docs	2017-10-27 04:49:05 +02:00
ines	9dfca0f2f8	Add example for custom intent parser	2017-10-27 03:55:11 +02:00
ines	4d272e25ee	Fix examples	2017-10-27 03:55:04 +02:00
ines	a7b9074b4c	Update textcat training example and docs	2017-10-27 00:48:45 +02:00
ines	b61866a2e4	Update textcat example	2017-10-27 00:32:19 +02:00
ines	f81cc0bd1c	Fix usage of disable_pipes	2017-10-27 00:31:30 +02:00
ines	f57043e6fe	Update docstring	2017-10-26 16:29:08 +02:00
ines	b90e958975	Update tagger and parser examples and add to docs	2017-10-26 16:27:42 +02:00
ines	f1529463a8	Update tagger training example	2017-10-26 16:19:02 +02:00
ines	e44bbb5361	Remove old example	2017-10-26 16:12:41 +02:00
ines	421c3837e8	Fix formatting	2017-10-26 16:11:25 +02:00
ines	4d896171ae	Use plac annotations for arguments	2017-10-26 16:11:20 +02:00
ines	c3b681e5fb	Use plac annotations for arguments and add n_iter	2017-10-26 16:11:05 +02:00
ines	bc2c92f22d	Use plac annotations for arguments	2017-10-26 16:10:56 +02:00
ines	b5c74dbb34	Update parser training example	2017-10-26 15:15:37 +02:00
ines	586b9047fd	Use create_pipe instead of importing the entity recognizer	2017-10-26 15:15:26 +02:00
ines	d425ede7e9	Fix example	2017-10-26 15:15:08 +02:00
ines	9d58673aaf	Update train_ner example for spaCy v2.0	2017-10-26 14:24:12 +02:00
ines	e904075f35	Remove stray print statements	2017-10-26 14:24:00 +02:00
ines	c30258c3a2	Remove old example	2017-10-26 14:23:52 +02:00
ines	615c315d70	Update train_new_entity_type example to use disable_pipes	2017-10-25 14:56:53 +02:00
ines	2b8e7c45e0	Use better training data JSON example	2017-10-24 16:00:56 +02:00
ines	9bf5751064	Pretty-print JSON	2017-10-24 12:22:17 +02:00
ines	6675755005	Add training data JSON example	2017-10-24 12:05:10 +02:00
Jeroen Bobbeldijk	84c6c20d1c	Fix #1444: fix pipeline logic and wrong parameter in update call	2017-10-22 15:18:36 +02:00
Jeffrey Gerard	5ba970b495	minor cleanup	2017-10-12 12:34:46 -07:00
Jeffrey Gerard	39d3cbfdba	Bugfix example script train_ner_standalone.py, fails after training	2017-10-12 11:39:12 -07:00
Matthew Honnibal	563f46f026	Fix multi-label support for text classification: The TextCategorizer class is supposed to support multi-label text classification, and allow training data to contain missing values. For this to work, the gradient of the loss should be 0 when labels are missing. Instead, there was no way to actually denote "missing" in the GoldParse class, and so the TextCategorizer class treated the label set within gold.cats as complete. To fix this, we change GoldParse.cats to be a dict instead of a list. The GoldParse.cats dict should map to floats, with 1. denoting 'present' and 0. denoting 'absent'. Gradients are zeroed for categories absent from the gold.cats dict. A nice bonus is that you can also set values between 0 and 1 for partial membership. You can also set numeric values, if you're using a text classification model that uses an appropriate loss function. Unfortunately this is a breaking change; although the functionality was only recently introduced and hasn't been properly documented yet. I've updated the example script accordingly.	2017-10-05 18:43:02 -05:00
Matthew Honnibal	f1b86dff8c	Update textcat example	2017-10-04 15:12:28 +02:00
Matthew Honnibal	79a94bc166	Update textcat example	2017-10-04 14:55:30 +02:00
Matthew Honnibal	cbb1fbef80	Update train_ner_standalone example	2017-10-03 18:49:38 +02:00
Matthew Honnibal	027a5d8b75	Update train_ner_standalone example	2017-09-15 10:36:46 +02:00
Matthew Honnibal	683d81bb49	Update example for adding entity type	2017-09-14 16:15:59 +02:00
Matthew Honnibal	c16ef0a85c	Clarify train textcat example	2017-07-29 21:59:27 +02:00
Matthew Honnibal	54a539a113	Finish text classifier example	2017-07-23 00:34:12 +02:00
Matthew Honnibal	2bc7d87c70	Add example for training text classifier	2017-07-22 20:15:32 +02:00
ines	992559bf9a	Fix formatting and remove unused imports	2017-06-01 12:47:18 +02:00
Matthew Honnibal	5c30466c95	Update NER training example	2017-05-31 13:42:12 +02:00
Matthew Honnibal	2da16adcc2	Add dropout option for parser and NER: Dropout can now be specified in the `Parser.update()` method via the `drop` keyword argument, e.g. `nlp.entity.update(doc, gold, drop=0.4)`. This will randomly drop 40% of features, and multiply the value of the others by 1. / 0.4. This may be useful for generalising from small data sets. This commit also patches the examples/training/train_new_entity_type.py example, to use dropout and fix the output (previously it did not output the learned entity).	2017-04-27 13:18:39 +02:00
Matthew Honnibal	0605b95f2e	Merge branch 'master' of https://github.com/explosion/spaCy	2017-04-18 13:48:00 +02:00
Matthew Honnibal	2f84626417	Fix train_new_entity_type example	2017-04-18 13:47:36 +02:00
Ines Montani	734b0a4e4a	Update train_new_entity_type.py	2017-04-16 23:42:16 +02:00
ines	264af6cd17	Add documentation	2017-04-16 20:37:46 +02:00
ines	c7adca58a9	Tidy up example and only save/test if output_directory is not None	2017-04-16 16:55:01 +02:00
Matthew Honnibal	40e3024241	Move standalone NER training script into examples directory	2017-04-15 16:13:42 +02:00
Matthew Honnibal	c729d72fc6	Add new example for training new entity types	2017-04-15 16:11:06 +02:00
Matthew Honnibal	97b83c74dc	WIP on training example	2017-04-14 23:54:27 +02:00
Matthew Honnibal	ab70f6e18d	Update NER training example	2017-01-27 12:27:10 +01:00
Christos Savvopoulos	c19b83f6ae	use model_dir inside of load_model	2016-12-12 20:23:24 +00:00
Christos Savvopoulos	93cf4af701	actually commit load_ner.py	2016-12-12 20:13:33 +00:00
Christos Savvopoulos	ad54a929f8	train_ner should save vocab; add load_ner example	2016-12-12 20:09:49 +00:00
kendricktan	ba8841234a	Fixed training examples. Changes: 1. train_ner won't crash if the data directory is not found 2. Fixed train_tagger error: expected spacy.gold.GoldParse, got list	2016-10-24 16:09:23 +10:00
kendricktan	9877f3298f	updated training examples to v1.1.2	2016-10-24 11:53:33 +10:00
kendricktan	f77b3dc677	Fixed train_parser examples when model_dir isn't None	2016-10-20 23:40:51 +10:00
kendricktan	d817d57219	Fixed train_ner examples when model_dir isn't None	2016-10-20 21:09:07 +10:00
Matthew Honnibal	3fba897e0f	Update train_parser example	2016-10-16 21:41:14 +02:00
Matthew Honnibal	f787cd29fe	Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor.	2016-10-16 21:34:57 +02:00
Matthew Honnibal	4e9727b474	Use new words keyword argument in Doc.	2016-10-16 18:16:25 +02:00
Matthew Honnibal	2508117553	Make train_parser example a bit simpler.	2016-10-16 17:58:37 +02:00
Matthew Honnibal	4574fe87c6	Add example for training parser	2016-10-16 17:05:55 +02:00
Matthew Honnibal	01b42c531f	Update train_tagger script	2016-10-16 16:10:23 +02:00
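
The message for commit 563f46f026 describes gold.cats as a dict of floats in which labels that are absent from the dict receive a zero gradient. A minimal sketch of that idea follows. It is not spaCy's implementation: the function name `textcat_gradient`, the label names, and the choice of a half squared-error loss (whose gradient is score minus target) are all assumptions made for illustration.

```python
def textcat_gradient(scores, cats, labels):
    """Per-label gradient of a half squared-error loss, 0.5 * (score - target) ** 2.

    scores: dict mapping label -> predicted score
    cats:   dict mapping label -> target float (1.0 present, 0.0 absent,
            values in between for partial membership)
    labels: every label the model knows about

    Hypothetical helper: labels missing from `cats` are treated as
    unannotated and get a gradient of exactly 0.0.
    """
    grad = {}
    for label in labels:
        if label in cats:
            # Annotated label: the usual (score - target) gradient.
            grad[label] = scores[label] - float(cats[label])
        else:
            # Missing annotation: this label must not influence the update.
            grad[label] = 0.0
    return grad


labels = ["POSITIVE", "NEGATIVE", "SPAM"]  # illustrative label set
scores = {"POSITIVE": 0.75, "NEGATIVE": 0.5, "SPAM": 0.25}
cats = {"POSITIVE": 1.0, "NEGATIVE": 0.0}  # "SPAM" is deliberately left out
grad = textcat_gradient(scores, cats, labels)
print(grad)
```

With a list of labels there is no way to tell "absent" from "unknown"; the dict makes the distinction explicit, which is why the commit calls the change breaking.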

120 Commits
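
The message for commit 2da16adcc2 describes dropout as zeroing a random fraction of features and rescaling the rest. The sketch below shows that general technique on a plain list. It is not the parser's code: the function name `inverted_dropout` and the `rng` parameter are hypothetical. The commit message gives the rescaling factor as 1. / 0.4, whereas this sketch uses the conventional inverted-dropout factor 1 / (1 - drop), which keeps the expected value of each feature unchanged. That choice is an assumption and is not confirmed by the log.

```python
import random


def inverted_dropout(values, drop, rng):
    """Zero each value with probability `drop`; scale survivors by 1 / (1 - drop).

    Hypothetical helper for illustration. The scale factor follows the
    conventional inverted-dropout definition (an assumption; see lead-in).
    """
    if not 0.0 <= drop < 1.0:
        raise ValueError("drop must be in [0, 1)")
    scale = 1.0 / (1.0 - drop)
    # rng.random() is uniform on [0, 1), so a value survives with
    # probability 1 - drop.
    return [v * scale if rng.random() >= drop else 0.0 for v in values]


rng = random.Random(0)  # seeded so repeated runs give the same mask
features = [1.0] * 10
dropped = inverted_dropout(features, drop=0.4, rng=rng)
print(dropped)
```

Passing the random generator in explicitly keeps the sketch reproducible; with `drop=0.0` the input comes back unchanged.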