spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-09-21 19:39:13 +03:00

Author	SHA1	Message	Date
Ines Montani	37c7c85a86	💫 New JSON helpers, training data internals & CLI rewrite (#2932) * Support nowrap setting in util.prints * Tidy up and fix whitespace * Simplify script and use read_jsonl helper * Add JSON schemas (see #2928) * Deprecate Doc.print_tree Will be replaced with Doc.to_json, which will produce a unified format * Add Doc.to_json() method (see #2928) Converts Doc objects to JSON using the same unified format as the training data. Method also supports serializing selected custom attributes in the doc._. space. * Remove outdated test * Add write_json and write_jsonl helpers * WIP: Update spacy train * Tidy up spacy train * WIP: Use wasabi for formatting * Add GoldParse helpers for JSON format * WIP: add debug-data command * Fix typo * Add missing import * Update wasabi pin * Add missing import * 💫 Refactor CLI (#2943) To be merged into #2932. ## Description - [x] refactor CLI To use [`wasabi`](https://github.com/ines/wasabi) - [x] use [`black`](https://github.com/ambv/black) for auto-formatting - [x] add `flake8` config - [x] move all messy UD-related scripts to `cli.ud` - [x] make converters function that take the opened file and return the converted data (instead of having them handle the IO) ### Types of change enhancement ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. * Update wasabi pin * Delete old test * Update errors * Fix typo * Tidy up and format remaining code * Fix formatting * Improve formatting of messages * Auto-format remaining code * Add tok2vec stuff to spacy.train * Fix typo * Update wasabi pin * Fix path checks for when train() is called as function * Reformat and tidy up pretrain script * Update argument annotations * Raise error if model language doesn't match lang * Document new train command	2018-11-30 20:16:14 +01:00
Matthew Honnibal	8fdb9bc278	💫 Add experimental ULMFit/BERT/Elmo-like pretraining (#2931) * Add 'spacy pretrain' command * Fix pretrain command for Python 2 * Fix pretrain command * Fix pretrain command	2018-11-15 22:17:16 +01:00
Matthew Honnibal	1f7229f40f	Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" This reverts commit `c9ba3d3c2d`, reversing changes made to `92c26a35d4`.	2018-03-27 19:23:02 +02:00
ines	82e80ff928	Rename model command to init_model and fix formatting	2017-12-07 09:59:23 +01:00
ines	affd3404ab	Remove old model command (now "vocab")	2017-11-01 13:14:03 +01:00
Explosion Bot	0fc1209421	Wire up new vocab command	2017-10-30 16:14:50 +01:00
ines	fff1028391	Add validate CLI command	2017-10-12 20:05:06 +02:00
Matthew Honnibal	69c7c642c2	Add spacy evaluate	2017-10-01 14:05:04 -05:00
Matthew Honnibal	7be5f30f17	Add profile function	2017-08-21 23:22:49 +02:00
Gyorgy Orosz	e5344b83a3	Ported model cli from v1	2017-08-19 21:45:23 +02:00
ines	fc3ec733ea	Reduce complexity in CLI Remove now redundant model command and move plac annotations to cli files	2017-05-22 12:28:58 +02:00
Matthew Honnibal	baf3ef0ddc	Remove import of removed train_config script	2017-05-21 09:07:34 -05:00
ines	789ce8a45e	Add convert command	2017-04-07 13:04:17 +02:00
ines	7ceaa1614b	Add experimental model init command	2017-03-26 20:51:40 +02:00
ines	0035fd9efe	Add spacy train work in progress	2017-03-23 11:08:41 +01:00
ines	bf240132d7	Add cli.package command to build model packages	2017-03-20 22:50:13 +01:00
ines	ec3e810662	Add directory cli and set up command line interface	2017-03-18 15:14:48 +01:00

17 Commits