spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-11 12:18:04 +03:00

Author	SHA1	Message	Date
Ines Montani	c9bd0e5a96	Set version to 2.1.2	2019-03-22 13:44:47 +01:00
Matthew Honnibal	5a53e9358a	Set version to 2.1.1	2019-03-20 00:59:45 +01:00
Ines Montani	f0c1efcb00	Set version to 2.1.0	2019-03-17 22:42:58 +01:00
Matthew Honnibal	c6be9964ec	Set version to v2.1.0.dev1	2019-03-16 21:47:41 +01:00
Ines Montani	2eecd756fa	Update package name	2019-03-16 14:43:53 +01:00
Ines Montani	f55a52a2dd	Set version to v2.1.0.dev0	2019-03-16 13:47:03 +01:00
Matthew Honnibal	6aab2d8533	Set version to v2.1.0a13	2019-03-12 15:14:06 +01:00
Matthew Honnibal	062934aa12	Set version to v2.1.0a12	2019-03-11 22:26:19 +01:00
Matthew Honnibal	4e8a07c7d3	Set version to v2.1.0a11	2019-03-11 10:45:06 +01:00
Matthew Honnibal	656edcb984	Set version to v2.1.0a10	2019-02-27 12:26:13 +01:00
Matthew Honnibal	3cdd3eb518	Set version to v2.1.0a9	2019-02-25 21:55:19 +01:00
Matthew Honnibal	5882d82915	Set version to v2.1.0a9.dev2	2019-02-24 16:42:06 +01:00
Matthew Honnibal	909a9d9932	Set version to v2.1.0a9.dev1	2019-02-23 13:10:42 +01:00
Matthew Honnibal	829c9091a4	Set version to v2.1.0a9.dev0	2019-02-21 17:13:34 +01:00
Matthew Honnibal	7d529ebdfb	Set version to v2.1.0a8	2019-02-21 12:09:34 +01:00
Matthew Honnibal	f75be6e7be	Set version to v2.1.0a8.dev1	2019-02-21 11:57:06 +01:00
Matthew Honnibal	7f02464494	Set version to v2.1.0a8.dev0	2019-02-21 11:42:23 +01:00
Matthew Honnibal	7d4a52a4d0	Set version to v2.1.0a7	2019-02-16 17:48:34 +01:00
Matthew Honnibal	07617b6b7f	Set version to v2.1.0a7.dev12	2019-02-16 17:30:29 +01:00
Matthew Honnibal	1dc314bada	Set version to v2.1.0a7.dev11	2019-02-16 17:02:49 +01:00
Matthew Honnibal	2ef227c313	Set version to v2.1.0a7.dev1	2019-02-16 16:22:46 +01:00
Matthew Honnibal	22923b9cb1	Set version to v2.1.0a7.dev9	2019-02-16 15:47:19 +01:00
Matthew Honnibal	e0c91a4c8d	Set version to 2.1.0a7	2019-02-16 14:43:38 +01:00
Matthew Honnibal	58aac58631	Set version to v2.1.0a7.dev8	2019-02-15 12:39:26 +01:00
Matthew Honnibal	5f1abe2cc7	Set version to v2.1.0a7.dev7	2019-02-15 10:30:53 +01:00
Matthew Honnibal	dcf79c5ef3	Set version to v2.1.0a7.dev6	2019-02-14 20:12:02 +01:00
Matthew Honnibal	aebf71bc72	Set version to v2.1.0a7.dev5	2019-02-14 15:51:42 +01:00
Matthew Honnibal	1831e1423d	Set version to v2.1.0a7.dev4	2019-02-13 23:08:40 +11:00
Matthew Honnibal	63dc4234a3	Set version to v2.1.0a7.dev3	2019-02-13 22:53:10 +11:00
Matthew Honnibal	b7ea39564f	Set version to v2.1.0a7.dev2	2019-02-13 22:52:43 +11:00
Matthew Honnibal	dbeebfa3a2	Set version to v2.1.0a7.dev1	2019-02-08 01:54:01 +11:00
Matthew Honnibal	27e3f98cae	Set version to v2.1.0a7.dev0	2019-02-01 18:06:34 +11:00
Matthew Honnibal	5a4737df09	Set version to 2.1.0a6	2019-01-21 18:32:34 +01:00
Matthew Honnibal	246538be2e	Set version to 2.1.0a6.dev1	2019-01-21 15:12:17 +01:00
Matthew Honnibal	fe4e68cb71	Set version to v2.1.0a6.dev0	2019-01-05 14:44:42 +01:00
Matthew Honnibal	978d8be8f9	Set version to v2.1.0a5	2018-12-21 00:26:39 +01:00
Matthew Honnibal	d8d27f9129	Set version to v2.1.0a5.dev0	2018-12-20 18:45:34 +01:00
Matthew Honnibal	a7b085ae46	Set version back to 2.1.0a4	2018-12-03 02:03:26 +01:00
Matthew Honnibal	8e9a4d2f5e	Increment version to 2.1.0a5	2018-12-03 01:59:50 +01:00
Matthew Honnibal	a31d557f2d	Set version to v2.1.0a4	2018-12-01 14:40:03 +01:00
Ines Montani	eddeb36c96	💫 Tidy up and auto-format .py files (#2983) ## Description - [x] Use [`black`](https://github.com/ambv/black) to auto-format all `.py` files. - [x] Update flake8 config to exclude very large files (lemmatization tables etc.) - [x] Update code to be compatible with flake8 rules - [x] Fix various small bugs, inconsistencies and messy stuff in the language data - [x] Update docs to explain new code style (`black`, `flake8`, when to use `# fmt: off` and `# fmt: on` and what `# noqa` means) Once #2932 is merged, which auto-formats and tidies up the CLI, we'll be able to run `flake8 spacy` and actually get meaningful results. At the moment, the code style and linting aren't applied automatically, but I'm hoping that the new [GitHub Actions](https://github.com/features/actions) will let us auto-format pull requests and post comments with relevant linting information. ### Types of change enhancement, code style ## Checklist - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information.	2018-11-30 17:03:03 +01:00
Matthew Honnibal	87da5bcf5b	Set version to v2.1.0a3	2018-11-28 18:22:09 +01:00
Matthew Honnibal	c9f6acc564	Set version to 2.1.0a3.dev0	2018-11-27 05:15:27 +01:00
Matthew Honnibal	5fc98ade04	Set version to 2.1.0a2	2018-11-08 09:56:56 +01:00
Matthew Honnibal	b9f0588580	Set version to v2.1.0a1	2018-08-15 17:22:39 +02:00
Matthew Honnibal	1b2a5869ab	Set version to v2.1.0a2.dev0	2018-08-15 15:38:52 +02:00
Matthew Honnibal	4336397ecb	Update develop from master	2018-08-14 03:04:28 +02:00
Matthew Honnibal	85000ea13b	Increment version to 2.0.13.dev2	2018-08-10 00:43:55 +02:00
Matthew Honnibal	ae7fc42a41	Increment version to v2.0.13.dev1	2018-08-10 00:14:31 +02:00
Matthew Honnibal	3fb828352d	Set version to 2.0.13.dev0	2018-08-09 23:49:34 +02:00
Matthew Honnibal	90c269e1a9	Set about to v2.0.12 release	2018-07-21 15:09:42 +02:00
Matthew Honnibal	1a1c7304cf	Set version to 2.0.12.dev1	2018-07-21 13:08:01 +02:00
Matthew Honnibal	3eb446e0a5	Require thinc 6.11.1 and prepare for release to spacy-nightly	2018-05-20 19:00:34 +02:00
Matthew Honnibal	c0e596283b	Set version to 2.1.0a0	2018-05-03 14:00:11 +02:00
Matthew Honnibal	2c4a6d66fa	Merge master into develop. Big merge, many conflicts -- need to review	2018-04-29 14:49:26 +02:00
Matthew Honnibal	97851d2c4e	Increment version to v2.0.12.dev0	2018-04-10 22:20:16 +02:00
Matthew Honnibal	0c7fab4443	Set version to 2.0.11	2018-04-04 11:19:11 +02:00
Matthew Honnibal	f7e6313b43	Increment version to v2.0.11.dev0	2018-04-03 20:58:47 +02:00
Ines Montani	3141e04822	💫 New system for error messages and warnings (#2163) * Add spacy.errors module * Update deprecation and user warnings * Replace errors and asserts with new error message system * Remove redundant asserts * Fix whitespace * Add messages for print/util.prints statements * Fix typo * Fix typos * Move CLI messages to spacy.cli._messages * Add decorator to display error code with message An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc. * Remove unused link in spacy.about * Update errors for invalid pipeline components * Improve error for unknown factories * Add displaCy warnings * Update formatting consistency * Move error message to spacy.errors * Update errors and check if doc returned by component is None	2018-04-03 15:50:31 +02:00
Matthew Honnibal	1f7229f40f	Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" This reverts commit `c9ba3d3c2d`, reversing changes made to `92c26a35d4`.	2018-03-27 19:23:02 +02:00
Matthew Honnibal	d2118792e7	Merge changes from master	2018-03-27 13:38:41 +02:00
Matthew Honnibal	5430c43298	Set about to spacy-nightly	2018-03-25 19:30:14 +02:00
Matthew Honnibal	d566e673bf	Set version to v2.0.10	2018-03-24 18:09:03 +01:00
Matthew Honnibal	bede11b67c	Improve label management in parser and NER (#2108) This patch does a few smallish things that tighten up the training workflow a little, and allow memory use during training to be reduced by letting the GoldCorpus stream data properly. Previously, the parser and entity recognizer read and saved labels as lists, with extra labels noted separately. Lists were used because ordering is very important, to ensure that the label-to-class mapping is stable. We now manage labels as nested dictionaries, first keyed by the action, and then keyed by the label. Values are frequencies. The trick is, how do we save new labels? We need to make sure we iterate over these in the same order they're added. Otherwise, we'll get different class IDs, and the model's predictions won't make sense. To allow stable sorting, we map the new labels to negative values. If we have two new labels, they'll be noted as having "frequency" -1 and -2. The next new label will then have "frequency" -3. When we sort by (frequency, label), we then get a stable sort. Storing frequencies then allows us to make the next nice improvement. Previously we had to iterate over the whole training set, to pre-process it for the deprojectivisation. This led to storing the whole training set in memory. This was most of the required memory during training. To prevent this, we now store the frequencies as we stream in the data, and deprojectivize as we go. Once we've built the frequencies, we can then apply a frequency cut-off when we decide how many classes to make. Finally, to allow proper data streaming, we also have to have some way of shuffling the iterator. This is awkward if the training files have multiple documents in them. To solve this, the GoldCorpus class now writes the training data to disk in msgpack files, one per document. We can then shuffle the data by shuffling the paths. This is a squash merge, as I made a lot of very small commits. Individual commit messages below. 
* Simplify label management for TransitionSystem and its subclasses * Fix serialization for new label handling format in parser * Simplify and improve GoldCorpus class. Reduce memory use, write to temp dir * Set actions in transition system * Require thinc 6.11.1.dev4 * Fix error in parser init * Add unicode declaration * Fix unicode declaration * Update textcat test * Try to get model training on less memory * Print json loc for now * Try rapidjson to reduce memory use * Remove rapidjson requirement * Try rapidjson for reduced mem usage * Handle None heads when projectivising * Stream json docs * Fix train script * Handle projectivity in GoldParse * Fix projectivity handling * Add minibatch_by_words util from ud_train * Minibatch by number of words in spacy.cli.train * Move minibatch_by_words util to spacy.util * Fix label handling * More hacking at label management in parser * Fix encoding in msgpack serialization in GoldParse * Adjust batch sizes in parser training * Fix minibatch_by_words * Add merge_subtokens function to pipeline.pyx * Register merge_subtokens factory * Restore use of msgpack tmp directory * Use minibatch-by-words in train * Handle retokenization in scorer * Change back-off approach for missing labels. Use 'dep' label * Update NER for new label management * Set NER tags for over-segmented words * Fix label alignment in gold * Fix label back-off for infrequent labels * Fix int type in labels dict key * Fix int type in labels dict key * Update feature definition for 8 feature set * Update ud-train script for new label stuff * Fix json streamer * Print the line number if conll eval fails * Update children and sentence boundaries after deprojectivisation * Export set_children_from_heads from doc.pxd * Render parses during UD training * Remove print statement * Require thinc 6.11.1.dev6. 
Try adding wheel as install_requires * Set different dev version, to flush pip cache * Update thinc version * Update GoldCorpus docs * Remove print statements * Fix formatting and links [ci skip]	2018-03-19 02:58:08 +01:00
Matthew Honnibal	e3be3d65b3	Version as v2.0.10.dev0	2018-03-15 17:31:22 +01:00
Matthew Honnibal	9aeec9c242	Increment dev version	2018-03-11 01:58:21 +01:00
Matthew Honnibal	fa9fd21620	Increment dev version	2018-03-11 00:41:54 +01:00
Matthew Honnibal	e7deadb519	Set version to 2.1.0.dev1	2018-02-23 16:22:24 +01:00
Matthew Honnibal	307aefe131	Increment version to v2.0.9	2018-02-22 17:07:53 +01:00
Matthew Honnibal	66496ac8e1	Set version to v2.1.0.dev0	2018-02-18 13:48:39 +01:00
Matthew Honnibal	1b3c98e01b	Set version to v2.0.8	2018-02-18 12:16:31 +01:00
Matthew Honnibal	97a228a4ce	Increment to v2.0.8.dev0	2018-02-17 16:54:36 +01:00
Matthew Honnibal	ebe84e45e5	Increment version to 2.0.7	2018-02-02 03:39:16 +01:00
Matthew Honnibal	e4b1f57599	Increment version	2018-02-02 02:33:23 +01:00
Matthew Honnibal	a437ba87a3	Set release=True	2018-01-29 21:26:04 +01:00
Matthew Honnibal	cbdab75b36	Increment version	2018-01-28 23:46:22 +01:00
Matthew Honnibal	a6b43729c6	Set version to v2.0.5	2017-12-07 10:39:14 +01:00
Matthew Honnibal	6373d2580d	Increment version to v2.0.5.dev0	2017-12-07 09:53:59 +01:00
Matthew Honnibal	05f41ff587	Set version to 2.0.4	2017-12-06 13:24:02 +01:00
Matthew Honnibal	04650e38c7	Set version to 2.0.4.dev0	2017-12-05 10:52:31 +01:00
Matthew Honnibal	b60d92aca8	Increment version	2017-11-15 16:14:46 +01:00
Matthew Honnibal	cf0be62096	Increment version	2017-11-15 15:00:18 +01:00
Matthew Honnibal	49fd5a646f	Set version for 2.0.2 release	2017-11-08 22:39:39 +01:00
Matthew Honnibal	fba2dbddf7	Increment version	2017-11-08 22:19:08 +01:00
Matthew Honnibal	e262e8d942	Increment version to v2.0.2.dev0	2017-11-08 11:25:47 +01:00
Matthew Honnibal	d725aee4e2	Increment version to 2.0.1	2017-11-08 02:14:47 +01:00
Matthew Honnibal	8d6f68f1df	Increment version	2017-11-08 01:12:34 +01:00
Matthew Honnibal	bbd2a3dee1	Fix title in about.py	2017-11-07 14:02:58 +01:00
Matthew Honnibal	4efaf9306c	Set version to spacy-nightly rc2	2017-11-07 13:27:26 +01:00
ines	834f9c1aab	Update about.py	2017-11-07 13:11:33 +01:00
ines	a4662a31a9	Move model package templates to cli.package and update docs	2017-11-07 12:15:35 +01:00
ines	a09c096d3c	Get docs ready for v2.0.0	2017-11-07 12:00:43 +01:00
Matthew Honnibal	174abe4677	Increment to 2.0.0rc1	2017-11-07 01:59:46 +01:00
Matthew Honnibal	8e6795437b	Set release=True	2017-11-06 16:39:32 +01:00
Matthew Honnibal	6f438b17c1	Increment version to v2.0.0a19	2017-11-05 14:43:36 +01:00
ines	6c2d8d3b2a	Use shortcuts-nightly.json to resolve model shortcuts	2017-10-29 01:28:31 +02:00
Matthew Honnibal	d9bb1e5de8	Increment version	2017-10-24 17:06:19 +02:00
Matthew Honnibal	f2b590f672	Increment version	2017-10-07 19:01:01 +02:00
Matthew Honnibal	d903986439	Increment version	2017-10-04 17:14:26 +02:00
ines	b0dfa059db	Update docs link in about.py	2017-10-03 15:19:55 +02:00
Matthew Honnibal	20193371f5	Don't share CNN, to reduce complexities	2017-09-21 14:59:48 +02:00
Matthew Honnibal	d1518027a9	Increment version	2017-09-14 16:18:46 +02:00
Matthew Honnibal	e88a42e460	Increment version	2017-09-04 21:14:39 +02:00
Matthew Honnibal	e3ea6ee02b	Increment version	2017-09-02 15:17:01 +02:00
ines	40afa13a8a	Increment version	2017-08-26 18:30:49 +02:00
Matthew Honnibal	8cfeeb4884	Increment version	2017-08-19 19:52:58 +02:00
Matthew Honnibal	d9f82f6b50	Increment version	2017-08-14 14:55:26 +02:00
ines	65bf80302c	Increment version	2017-08-14 13:04:30 +02:00
Matthew Honnibal	ae1ad81069	Increment version	2017-08-05 18:09:32 +02:00
Matthew Honnibal	aff325b7e0	Increment version	2017-07-25 19:41:20 +02:00
Matthew Honnibal	fd20a4af55	Increment version	2017-07-25 18:58:34 +02:00
Matthew Honnibal	5771bd1ff8	Increment version	2017-07-23 14:18:38 +02:00
Matthew Honnibal	f5de8deeec	Increment version	2017-07-22 20:04:53 +02:00
Matthew Honnibal	374ab3ecfb	Increment alpha version	2017-07-22 00:32:49 +02:00
ines	045574a936	Update package name and increment version	2017-06-05 20:41:30 +02:00
ines	c4614c02a2	Fix dev resources URL	2017-06-04 15:45:50 +02:00
ines	90d117f378	Update version	2017-06-04 13:41:16 +02:00
ines	d5c8d2f5fd	Update about.py and increment version	2017-06-01 11:52:24 +02:00
ines	9e83a17e95	Use new model templates	2017-05-29 15:27:24 +02:00
ines	9d85cda8e4	Fix models error message and use about.__docs_models__ (see #1051)	2017-05-13 13:05:47 +02:00
ines	957ba676b4	Add model files base path to about.py	2017-05-07 23:22:35 +02:00
Matthew Honnibal	f0e1606d27	Increment version	2017-04-26 20:25:41 +02:00
ines	527d51ac9a	Fetch shortcuts from GitHub and improve error handling	2017-04-26 18:00:28 +02:00
Matthew Honnibal	e033c86a64	Increment version	2017-04-23 21:03:43 +02:00
ines	16a8521efa	Increment version	2017-04-16 22:38:38 +02:00
Matthew Honnibal	4931c56afc	Increment version	2017-04-16 13:59:38 -05:00
Matthew Honnibal	e26577b202	Increment version	2017-04-07 18:45:06 +02:00
Matthew Honnibal	40bf7ecf27	Increment version	2017-04-07 18:44:20 +02:00
Matthew Honnibal	df83921f0a	Increment version	2017-03-26 09:27:32 -05:00
Matthew Honnibal	f314d3d044	Increment version	2017-03-20 12:58:24 +01:00
Matthew Honnibal	6ee2ea1128	Increment version	2017-03-19 01:40:52 +01:00
ines	8a34c3e666	Fix shortcut name	2017-03-17 20:07:34 +01:00
ines	279b1d1965	Update version	2017-03-17 12:43:08 +01:00
ines	8af4b9e4df	Fix compatibility.json link	2017-03-17 12:43:03 +01:00
ines	2f0db1dd36	Use small English model as default	2017-03-16 09:54:40 +01:00
ines	58b884b6d4	Refactor download script and about.py to use new download method	2017-03-15 17:37:18 +01:00
Ines Montani	7e36568d5b	Fix title to accommodate sputnik	2017-01-17 00:51:09 +01:00
Ines Montani	64e142f460	Update about.py	2017-01-16 14:23:08 +01:00
Matthew Honnibal	e889cd698e	Increment version	2017-01-16 14:01:35 +01:00
Matthew Honnibal	7ccf490c73	Increment version	2017-01-16 13:17:58 +01:00
Matthew Honnibal	d1d8214767	Increment version	2017-01-12 11:21:57 +01:00
Matthew Honnibal	f62db78dc3	Increment version	2016-12-27 21:11:22 +01:00
Matthew Honnibal	5a6328a5a4	Increment version	2016-12-18 23:19:19 +01:00
Matthew Honnibal	0c0f4c965d	Increment version	2016-12-03 11:16:52 +01:00
Matthew Honnibal	42b0736db7	Increment version	2016-11-04 20:04:21 +01:00
Matthew Honnibal	9f93386994	Update version	2016-11-04 19:28:16 +01:00
Matthew Honnibal	6b9237aa83	Increment version	2016-10-23 20:22:53 +02:00
Matthew Honnibal	90f7544edd	Increment version	2016-10-23 19:43:06 +02:00
Matthew Honnibal	ca8ea33abc	Bump version to 1.1.0	2016-10-21 16:30:57 +02:00
Matthew Honnibal	147373c807	Increment version	2016-10-21 00:00:03 +02:00
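
The label scheme described in commit bede11b67c (nested dictionaries keyed by action and then by label, with newly added labels given "frequency" -1, -2, -3 and so on) can be sketched roughly as below. This is a minimal illustration and not the code from that commit: the helper names `add_label` and `ordered_labels`, the example action and label names, and the choice to sort in descending order are all assumptions made for the example.

```python
def add_label(labels, action, label):
    """Add a label with no observed frequency (hypothetical helper).

    `labels` maps action -> {label: frequency}. A label added by hand
    gets the next unused negative value, so the order in which labels
    were added can be recovered later by sorting.
    """
    freqs = labels.setdefault(action, {})
    if label in freqs:
        return
    already_added = sum(1 for f in freqs.values() if f < 0)
    freqs[label] = -(already_added + 1)


def ordered_labels(labels, action):
    """Return the labels for one action in a deterministic order.

    Assumption: sort descending by (frequency, label). Labels counted in
    the data come first, most frequent first; labels added by hand follow
    in the order they were added, because -1 > -2 > -3.
    """
    freqs = labels.get(action, {})
    pairs = sorted(((f, name) for name, f in freqs.items()), reverse=True)
    return [name for _, name in pairs]


# Two labels counted in training data, then two added afterwards.
labels = {"LEFT": {"nsubj": 120, "dobj": 80}}
add_label(labels, "LEFT", "zzz")
add_label(labels, "LEFT", "aaa")

order = ordered_labels(labels, "LEFT")
# order is ["nsubj", "dobj", "zzz", "aaa"]: "zzz" stays ahead of "aaa"
# because it was added first, not because of its name.
class_ids = {name: i for i, name in enumerate(order)}
```

Because the order depends only on the stored values, reloading the dictionary and adding a further label leaves the existing class IDs unchanged.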

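Commit 3141e04822 mentions a decorator that displays the error code with the message and only modifies the string when it is retrieved from the containing class. One way such a decorator could look is sketched below. The names `add_codes` and `Errors` and the message text are illustrative assumptions, not taken from the commit.

```python
def add_codes(err_cls):
    """Class decorator: prefix each message with its attribute name.

    The stored strings are left untouched. The code is only added at the
    moment a message is looked up on the returned object.
    """

    class ErrorsWithCodes:
        def __getattribute__(self, code):
            msg = getattr(err_cls, code)
            return "[{}] {}".format(code, msg)

    return ErrorsWithCodes()


@add_codes
class Errors:
    # Hypothetical message, used only for this example.
    E001 = "No component '{name}' found in pipeline."


message = Errors.E001.format(name="ner")
# message is "[E001] No component 'ner' found in pipeline."
```

Since the prefix is applied on attribute access, raising an exception with `Errors.E001` needs no changes to the exception class or to traceback handling.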
273 Commits