spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-23 20:46:44 +03:00

Author	SHA1	Message	Date
Adriane Boyd	ffaa0d6b9b	Fix Transformer.initialize example (#7963 )	2021-04-30 12:21:59 +02:00
Adriane Boyd	2320791f6d	Fix Transformer.initialize example (#7963 )	2021-04-30 12:21:31 +02:00
Adriane Boyd	cf032ec31e	Update to catalogue>=2.0.4 (#7951 )	2021-04-29 19:11:28 +02:00
Adriane Boyd	7cf5bd072f	Refactor util.to_ternary_int (#7944 ) * Refactor to avoid literal comparison with `is` * Extend tests	2021-04-29 16:58:54 +02:00
Sevdimali	49aed683cc	Azerbaijani language added (#7911 )	2021-04-28 14:42:02 +02:00
Adriane Boyd	f4080983ea	Extend to cupy 9.0.0 (#7914 )	2021-04-28 10:18:24 +02:00
Paul O'Leary McCann	8007d5c814	Check if the resume path points to a directory (#7919 ) This came up in #7878, but if --resume-path is a directory then loading the weights will fail. On Linux this will give a straightforward error message, but on Windows it gives "Permission Denied", which is confusing.	2021-04-28 09:17:15 +02:00
Paul O'Leary McCann	de6b5ed14d	Fix percent unk display in debug data (#7886 ) * Fix percent unk display This was showing (ratio %), so 10% would show as 0.10%. Fix by multiplying ratio by 100. Might want to add a warning if this is over a threshold. * Only show whole-integer percents	2021-04-27 09:16:35 +02:00
Janis Klaise	b33fb9ac1e	Update load_lookups return type and docstring (#7907 ) * Update load_lookups return type and docstring * Add contributor agreement	2021-04-27 09:14:59 +02:00
Janis Klaise	1690595e4d	Update load_lookups return type and docstring (#7907 ) * Update load_lookups return type and docstring * Add contributor agreement	2021-04-27 09:13:39 +02:00
Adriane Boyd	946a4284be	Set spacy-legacy to >=3.0.5 (#7897 ) Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.	2021-04-26 18:25:39 +02:00
Adriane Boyd	874cd02539	Set spacy-legacy to >=3.0.5 (#7897 ) Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.	2021-04-26 17:06:32 +02:00
Adriane Boyd	ae855a4625	Clean up Morphology imports and definitions (#7441 ) * Clean up Morphology imports and definitions * Whitespace formatting	2021-04-26 16:54:23 +02:00
Adriane Boyd	ceee1ecf17	Replace cpdef variables with cdef (#7834 )	2021-04-26 16:54:02 +02:00
Adriane Boyd	95c0833656	Add training option to set annotations on update (#7767 ) * Add training option to set annotations on update Add a `[training]` option called `set_annotations_on_update` to specify a list of components for which the predicted annotations should be set on `example.predicted` immediately after that component has been updated. The predicted annotations can be accessed by later components in the pipeline during the processing of the batch in the same `update` call. * Rename to annotates / annotating_components * Add test for `annotating_components` when training from config * Add documentation	2021-04-26 16:53:53 +02:00
Jacopo Farina	c105ed10fd	Remove torino from stop words (#7634 ) Torino is the proper name of a city and the token has no other meaning	2021-04-26 16:53:43 +02:00
Sofie Van Landeghem	e0b29f8ef7	Fix scoring normalization (#7629 ) * fix scoring normalization * score weights by total sum instead of per component * cleanup * more cleanup	2021-04-26 16:53:38 +02:00
Sofie Van Landeghem	95e3cf576b	Optionally append lang for packaged model name (#7417 ) * Add empty lines at the end of Python files * Only prepend the lang code if it's not there already * Update spacy/cli/package.py * fix whitespace stripping	2021-04-26 16:53:21 +02:00
Adriane Boyd	29ac7f776a	Merge branch 'master' into spacy.io	2021-04-24 12:58:47 +02:00
Adriane Boyd	df3444421a	Update spacy-legacy to >=3.0.4 (#7865 )	2021-04-23 12:16:12 +02:00
Adriane Boyd	8a95475b3d	Set version to v3.0.6 (#7854 )	2021-04-22 16:33:26 +02:00
Adriane Boyd	36ecba224e	Set up GPU CI testing (#7293 ) * Set up CI for tests with GPU agent * Update tests for enabled GPU * Fix steps filename * Add parallel build jobs as a setting * Fix test requirements * Fix install test requirements condition * Fix pipeline models test * Reset current ops in prefer/require testing * Fix more tests * Remove separate test_models test * Fix regression 5551 * fix StaticVectors for GPU use * fix vocab tests * Fix regression test 5082 * Move azure steps to .github and reenable default pool jobs * Consolidate/rename azure steps Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-04-22 14:58:29 +02:00
Adriane Boyd	bdb485cc80	Add callback to copy vocab/tokenizer from model (#7750 ) * Add callback to copy vocab/tokenizer from model Add callback `spacy.copy_from_base_model.v1` to copy the tokenizer settings and/or vocab (including vectors) from a base model. * Move spacy.copy_from_base_model.v1 to spacy.training.callbacks * Add documentation * Modify to specify model as tokenizer and vocab params	2021-04-22 12:36:50 +02:00
Adriane Boyd	f68fc29130	Update sent_starts in Example.from_dict (#7847 ) * Update sent_starts in Example.from_dict Update `sent_starts` for `Example.from_dict` so that `Optional[bool]` values have the same meaning as for `Token.is_sent_start`. Use `Optional[bool]` as the type for sent start values in the docs. * Use helper function for conversion to ternary ints	2021-04-22 11:32:45 +02:00
Adriane Boyd	f4339f9bff	Fix tokenizer cache flushing (#7836 ) * Fix tokenizer cache flushing Fix/simplify tokenizer init detection in order to fix cache flushing when properties are modified. * Remove init reloading logic * Remove logic disabling `_reload_special_cases` on init * Setting `rules` last in `__init__` (as before) means that setting other properties doesn't reload any special cases * Reset `rules` first in `from_bytes` so that setting other properties during deserialization doesn't reload any special cases unnecessarily * Reset all properties in `Tokenizer.from_bytes` to allow any settings to be `None` * Also reset special matcher when special cache is flushed * Remove duplicate special case validation * Add test for special cases flushing * Extend test for tokenizer deserialization of None values	2021-04-22 18:14:57 +10:00
Sofie Van Landeghem	047d912904	fix typo in entity_linker docs	2021-04-22 10:10:31 +02:00
Sofie Van Landeghem	cfad7e21d5	fix config parsing of ints/strings (#7755 ) * add few failing tests for parsing integers and strings * bump thinc to 8.0.3	2021-04-22 18:09:13 +10:00
Adriane Boyd	d2bdaa7823	Replace negative rows with 0 in StaticVectors (#7674 ) * Replace negative rows with 0 in StaticVectors Replace negative row indices with 0-vectors in `StaticVectors`. * Increase versions related to StaticVectors * Increase versions of all architectures and layers related to `StaticVectors` * Improve efficiency of 0-vector operations Parallel `spacy-legacy` PR: https://github.com/explosion/spacy-legacy/pull/5 * Update config defaults to new versions * Update docs	2021-04-22 18:04:15 +10:00
Sofie Van Landeghem	6f565cf39d	fix typo in entity_linker docs	2021-04-22 09:59:24 +02:00
Sofie Van Landeghem	47bbc46392	update EL training data format in docs (#7839 ) * update EL training data format * fix typo * all -1 because reasons	2021-04-22 08:50:31 +02:00
Sofie Van Landeghem	2e746dbf32	update EL training data format in docs (#7839 ) * update EL training data format * fix typo * all -1 because reasons	2021-04-22 08:50:09 +02:00
meghanabhange	7985e6bb39	Project Idea : denomme | Multilingual Name Detection (#7845 ) * Add denomme * spaCy contributor agreement Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-04-22 08:48:41 +02:00
meghanabhange	49ff1126bf	Project Idea : denomme | Multilingual Name Detection (#7845 ) * Add denomme * spaCy contributor agreement Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-04-22 08:48:17 +02:00
Sam Edwardes	05c609cdeb	Added a logo to spaCyTextBlob (#7818 ) * Added a logo to spaCyTextBlob * Updated to better thumb	2021-04-22 08:42:14 +02:00
Sam Edwardes	b8c6c10c6f	Added a logo to spaCyTextBlob (#7818 ) * Added a logo to spaCyTextBlob * Updated to better thumb	2021-04-22 08:41:55 +02:00
Diego Palma	ac101cba00	Add TRUNAJOD to spaCy universe. (#7754 ) * Add TRUNAJOD to spaCy universe. * Add trunajod logo and thumb. Co-authored-by: Diego <dpalma@evernote.com>	2021-04-22 08:41:03 +02:00
Diego Palma	bbade153ed	Add TRUNAJOD to spaCy universe. (#7754 ) * Add TRUNAJOD to spaCy universe. * Add trunajod logo and thumb. Co-authored-by: Diego <dpalma@evernote.com>	2021-04-22 08:40:28 +02:00
Ines Montani	ee68dc260f	Auto-format [ci skip]	2021-04-22 10:58:18 +10:00
Ines Montani	a9e5ae9b5c	Auto-format [ci skip]	2021-04-22 10:58:05 +10:00
Ines Montani	3931fa146b	Merge branch 'spacy.io' of https://github.com/explosion/spaCy into spacy.io	2021-04-22 10:57:25 +10:00
Ines Montani	c3f7d33f8e	Merge pull request #7851 from plison/master [ci skip]	2021-04-22 10:57:08 +10:00
Pierre Lison	663a160867	adding skweak to the SpaCy universe	2021-04-22 10:57:08 +10:00
Pierre Lison	bb961a2c11	adding skweak to the SpaCy universe	2021-04-22 10:57:08 +10:00
Ines Montani	5cbe414ce6	Merge pull request #7851 from plison/master [ci skip]	2021-04-22 10:56:35 +10:00
Pierre Lison	2f0ef2c9cc	adding skweak to the SpaCy universe	2021-04-22 01:16:34 +02:00
Pierre Lison	debfb46088	adding skweak to the SpaCy universe	2021-04-22 00:58:09 +02:00
Shantam Raj	5aac993604	Default code for Setting Entity annotations on the website errors (#7738 ) * the default example for "Setting entity annotations" errors on Binder * updating contributor info * using a new variable to store original entities	2021-04-21 09:18:22 +02:00
Shantam Raj	6017fcf693	Default code for Setting Entity annotations on the website errors (#7738 ) * the default example for "Setting entity annotations" errors on Binder * updating contributor info * using a new variable to store original entities	2021-04-21 09:16:32 +02:00
Ines Montani	1c1087e4ff	Merge pull request #7826 from richardpaulhudson/master Add entry for Coreferee project to universe.json	2021-04-21 16:23:09 +10:00
hudsonr	1eaf6e5ccb	Added universe entry for Coreferee	2021-04-21 16:23:09 +10:00

15224 Commits