spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-29 19:36:31 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	c2bbf076a4	Add document length cap for training	2017-11-03 01:54:54 +01:00
ines	37e62ab0e2	Update vector meta in meta.json	2017-11-01 01:25:09 +01:00
Matthew Honnibal	3659a807b0	Remove vector pruning arg from train CLI	2017-10-31 19:21:05 +01:00
Matthew Honnibal	e98451b5f7	Add -prune-vectors argument to spacy.cli.train	2017-10-30 18:00:10 +01:00
ines	d941fc3667	Tidy up CLI	2017-10-27 14:38:39 +02:00
ines	11e3f19764	Fix vectors data added after training (see #1457)	2017-10-25 16:08:26 +02:00
ines	273e638183	Add vector data to model meta after training (see #1457)	2017-10-25 16:03:05 +02:00
Matthew Honnibal	a955843684	Increase default number of epochs	2017-10-12 13:13:01 +02:00
Matthew Honnibal	acba2e1051	Fix metadata in training	2017-10-11 08:55:52 +02:00
Matthew Honnibal	74c2c6a58c	Add default name and lang to meta	2017-10-11 08:49:12 +02:00
Matthew Honnibal	5156074df1	Make loading code more consistent in train command	2017-10-10 12:51:20 -05:00
Matthew Honnibal	97c9b5db8b	Patch spacy.train for new pipeline management	2017-10-09 23:41:16 -05:00
Matthew Honnibal	808d8740d6	Remove print statement	2017-10-09 08:45:20 -05:00
Matthew Honnibal	0f41b25f60	Add speed benchmarks to metadata	2017-10-09 08:05:37 -05:00
Matthew Honnibal	be4f0b6460	Update defaults	2017-10-08 02:08:12 -05:00
Matthew Honnibal	9d66a915da	Update training defaults	2017-10-07 21:02:38 -05:00
Matthew Honnibal	c6cd81f192	Wrap try/except around model saving	2017-10-05 08:14:24 -05:00
Matthew Honnibal	5743b06e36	Wrap model saving in try/except	2017-10-05 08:12:50 -05:00
Matthew Honnibal	8902df44de	Fix component disabling during training	2017-10-02 21:07:23 +02:00
Matthew Honnibal	c617d288d8	Update pipeline component names in spaCy train	2017-10-02 17:20:19 +02:00
Matthew Honnibal	ac8481a7b0	Print NER loss	2017-09-28 08:05:31 -05:00
Matthew Honnibal	542ebfa498	Improve defaults	2017-09-27 18:54:37 -05:00
Matthew Honnibal	dcb86bdc43	Default batch size to 32	2017-09-27 11:48:19 -05:00
ines	1ff62eaee7	Fix option shortcut to avoid conflict	2017-09-26 17:59:34 +02:00
ines	7fdfb78141	Add version option to cli.train	2017-09-26 17:34:52 +02:00
Matthew Honnibal	698fc0d016	Remove merge artefact	2017-09-26 08:31:37 -05:00
Matthew Honnibal	defb68e94f	Update feature/noshare with recent develop changes	2017-09-26 08:15:14 -05:00
ines	edf7e4881d	Add meta.json option to cli.train and add relevant properties. Add accuracy scores to meta.json instead of accuracy.json and replace all relevant properties like lang, pipeline, spacy_version in existing meta.json. If not present, also add name and version placeholders to make it packagable.	2017-09-25 19:00:47 +02:00
Matthew Honnibal	204b58c864	Fix evaluation during training	2017-09-24 05:01:03 -05:00
Matthew Honnibal	dc3a623d00	Remove unused update_shared argument	2017-09-24 05:00:37 -05:00
Matthew Honnibal	4348c479fc	Merge pre-trained vectors and noshare patches	2017-09-22 20:07:28 -05:00
Matthew Honnibal	e93d43a43a	Fix training with preset vectors	2017-09-22 20:00:40 -05:00
Matthew Honnibal	a2357cce3f	Set random seed in train script	2017-09-23 02:57:31 +02:00
Matthew Honnibal	0a9016cade	Fix serialization during training	2017-09-21 13:06:45 -05:00
Matthew Honnibal	20193371f5	Don't share CNN, to reduce complexities	2017-09-21 14:59:48 +02:00
Matthew Honnibal	1d73dec8b1	Refactor train script	2017-09-20 19:17:10 -05:00
Matthew Honnibal	a0c4b33d03	Support resuming a model during spacy train	2017-09-18 18:04:47 -05:00
Matthew Honnibal	8496d76224	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-14 09:21:20 -05:00
Matthew Honnibal	24ff6b0ad9	Fix parsing and tok2vec models	2017-09-06 05:50:58 -05:00
Matthew Honnibal	e920885676	Fix pickle during train	2017-09-02 12:46:01 -05:00
Matthew Honnibal	7a6edeea68	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-20 12:55:39 -05:00
Matthew Honnibal	f2f9229964	Fix name of update_shared flag	2017-08-20 18:19:06 +02:00
Matthew Honnibal	84bb543e4d	Add gold_preproc flag to cli/train	2017-08-20 11:07:00 -05:00
Matthew Honnibal	11c31d285c	Restore changes from nn-beam-parser	2017-08-18 22:26:12 +02:00
Matthew Honnibal	52c180ecf5	Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop". This reverts commit `ea8de11ad5`, reversing changes made to `08e443e083`.	2017-08-14 13:00:23 +02:00
Matthew Honnibal	8870d491f1	Remove redundant pickling during training	2017-08-12 08:55:53 -05:00
Matthew Honnibal	0a566dc320	Add update_tensors flag to Language.update. Experimental, re #1182	2017-08-06 02:18:12 +02:00
Matthew Honnibal	c52fde40f4	Improve train CLI	2017-06-04 20:18:37 -05:00
Matthew Honnibal	21eef90dbc	Support specifying which GPU	2017-06-03 16:10:23 -05:00
Matthew Honnibal	43353b5413	Improve train CLI script	2017-06-03 13:28:20 -05:00
Matthew Honnibal	8a693c2605	Write binary file during training	2017-05-31 02:59:18 +02:00
Matthew Honnibal	49235017bf	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-05-27 16:34:28 -05:00
Matthew Honnibal	5e4312feed	Evaluate loaded class, to ensure save/load works	2017-05-27 15:47:02 -05:00
ines	086a06e7d7	Fix CLI docstrings and add command as first argument. Workaround for Plac.	2017-05-27 20:01:46 +02:00
Matthew Honnibal	de13fe0305	Remove length cap on sentences	2017-05-27 08:20:32 -05:00
Matthew Honnibal	d65f99a720	Improve model saving in train script	2017-05-26 05:52:09 -05:00
Matthew Honnibal	df8015f05d	Tweaks to train script	2017-05-25 17:15:24 -05:00
Matthew Honnibal	702fe74a4d	Clean up spacy.cli.train	2017-05-25 16:16:30 -05:00
Matthew Honnibal	135a13790c	Disable gold preprocessing	2017-05-24 20:10:20 -05:00
Matthew Honnibal	3959d778ac	Revert "Revert "WIP on improving parser efficiency"". This reverts commit `532afef4a8`.	2017-05-23 03:06:53 -05:00
Matthew Honnibal	532afef4a8	Revert "WIP on improving parser efficiency". This reverts commit `bdaac7ab44`.	2017-05-23 03:05:25 -05:00
Matthew Honnibal	bdaac7ab44	WIP on improving parser efficiency	2017-05-23 02:59:31 -05:00
Matthew Honnibal	6e8dce2c05	Fix train command line args	2017-05-22 10:41:39 -05:00
Matthew Honnibal	ae8cf70dc1	Fix CLI train signature	2017-05-22 06:13:39 -05:00
ines	fc3ec733ea	Reduce complexity in CLI. Remove now redundant model command and move plac annotations to cli files.	2017-05-22 12:28:58 +02:00
Matthew Honnibal	bc2294d7f1	Add support for fiddly hyper-parameters to train func	2017-05-22 04:51:08 -05:00
Matthew Honnibal	4e0988605a	Pass through non-projective=True	2017-05-22 04:51:08 -05:00
Matthew Honnibal	e14533757b	Use averaged params for evaluation	2017-05-22 04:51:08 -05:00
Matthew Honnibal	4c9202249d	Refactor training, to fix memory leak	2017-05-21 09:07:06 -05:00
Matthew Honnibal	3376d4d6e8	Update the train script, fixing GPU memory leak	2017-05-19 18:15:50 -05:00
Matthew Honnibal	ca70b08661	Fix GPU training and evaluation	2017-05-18 08:30:33 -05:00
Matthew Honnibal	fc8d3a112c	Add util.env_opt support: Can set hyper params through environment variables.	2017-05-18 04:36:53 -05:00
Matthew Honnibal	793430aa7a	Get spaCy train command working with neural network: integrate models into pipeline; add basic serialization (maybe incorrect); fix pickle on vocab.	2017-05-17 12:04:50 +02:00
Matthew Honnibal	8cf097ca88	Redesign training to integrate NN components: obsolete .parser, .entity etc. names in favour of .pipeline; components no longer create models on initialization; models created by loading method (from_disk(), from_bytes() etc.), or .begin_training(); add .predict(), .set_annotations() methods in components; pass state through pipeline, to allow components to share information more flexibly.	2017-05-16 16:17:30 +02:00
Matthew Honnibal	5211645af3	Get data flowing through pipeline. Needs redesign	2017-05-16 11:21:59 +02:00
Matthew Honnibal	a9edb3aa1d	Improve integration of NN parser, to support unified training API	2017-05-15 21:53:27 +02:00
ines	59c3b9d4dd	Tidy up CLI and fix print functions	2017-05-07 23:25:29 +02:00
Matthew Honnibal	4f9657b42b	Fix reporting if no dev data with train	2017-04-23 22:27:10 +02:00
ines	3a9710f356	Pass dev_scores to print_progress correctly (resolves #1008). Only read scores attribute if command is used with dev_data, otherwise default dev_scores to empty dict.	2017-04-23 15:58:40 +02:00
Matthew Honnibal	89a4f262fc	Fix training methods	2017-04-16 13:00:37 -05:00
ines	d24589aa72	Clean up imports, unused code, whitespace, docstrings	2017-04-15 12:05:47 +02:00
ines	9952d3b08a	Fix whitespace	2017-04-07 13:02:05 +02:00
Matthew Honnibal	2efdbc08ff	Make training work with directories	2017-03-26 08:46:44 -05:00
Matthew Honnibal	9dcb58aaaf	Merge CLI changes	2017-03-26 07:30:45 -05:00
Matthew Honnibal	6b7f7a2060	Connect parser L1 option to train CLI	2017-03-26 07:24:07 -05:00
Matthew Honnibal	dec5571bf3	Update train CLI	2017-03-26 07:16:52 -05:00
ines	53cf2f1c0e	Make dev data optional	2017-03-26 11:48:17 +02:00
ines	0035fd9efe	Add spacy train work in progress	2017-03-23 11:08:41 +01:00

288 Commits