spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-02-08 16:29:45 +03:00

Author	SHA1	Message	Date
Ines Montani	f1653d281f	Fix and update universe.json [ci skip]	2020-07-07 21:12:56 +02:00
Jonathan Besomi	f904f1f361	Add texthero to universe.json (#5716) * Add texthero to universe.json * Add spaCy contributor Agreement	2020-07-07 20:57:45 +02:00
gandersen101	9cfd294e59	Adding spaczz package to universe.json (#5717) * Adding spaczz package to universe.json * Adding contributor agreement.	2020-07-07 20:57:36 +02:00
Álvaro Abella Bascarán	7111b9de2e	Fix in docs: pipe(docs) instead of pipe(texts) (#5680) Very minor fix in docs, specifically in this part: ``` matcher = PhraseMatcher(nlp.vocab) > for doc in matcher.pipe(texts, batch_size=50): > pass ``` `texts` suggests the input is an iterable of strings. I replaced it for `docs`.	2020-06-30 20:01:12 +02:00
Matthias Hertel	305221f3e5	Website: fixed the token span in the text about the rule-based matching example (#5669) * fixed token span in pattern matcher example * contributor agreement	2020-06-30 19:58:55 +02:00
Adriane Boyd	d777d9cc38	Extend v2.3 migration guide (#5653) * Extend preloaded vocab section * Add section on tag maps	2020-06-26 14:13:01 +02:00
Adriane Boyd	a2660bd9c6	Fix backslashes in warnings config diff (#5640) Fix backslashes in warnings config diff in v2.3 migration section.	2020-06-24 10:26:57 +02:00
Adriane Boyd	4f73ced914	Extend what's new in v2.3 with vocab / is_oov (#5635)	2020-06-23 16:50:43 +02:00
Adriane Boyd	fcdecefacf	Add warnings example in v2.3 migration guide (#5627)	2020-06-22 14:38:06 +02:00
Adriane Boyd	66889de166	Warning for sudachipy 0.4.5 (#5611)	2020-06-19 13:45:23 +02:00
Ines Montani	959bc616dd	Merge branch 'master' into spacy.io	2020-06-16 22:50:11 +02:00
Ines Montani	6d712f3e06	Merge pull request #5599 from adrianeboyd/docs/v2.3.0-minor	2020-06-16 13:49:25 -07:00
Adriane Boyd	02369f91d3	Fix spacy convert argument	2020-06-16 20:41:17 +02:00
Adriane Boyd	f0fd77648f	Change example title to Dr. Change example title to Dr. so the current model does exclude the title in the initial example.	2020-06-16 20:36:21 +02:00
Adriane Boyd	a6abdfbc3c	Fix numpy.zeros() dtype for Doc.from_array	2020-06-16 20:35:45 +02:00
Adriane Boyd	9aff317ca7	Update POS in tagging example	2020-06-16 20:26:57 +02:00
Adriane Boyd	457babfa0c	Update alignment example for new gold.align	2020-06-16 20:22:03 +02:00
Ines Montani	19b9ea0436	Fix languages.json	2020-06-16 18:34:11 +02:00
Ines Montani	ed240458f6	Try and upgrade gatsby	2020-06-16 18:28:24 +02:00
Ines Montani	41003a5117	Update Binder version [ci skip]	2020-06-16 17:41:23 +02:00
Ines Montani	fd89f44c0c	Update Binder URL [ci skip]	2020-06-16 17:34:26 +02:00
Ines Montani	44af53bdd9	Add pkuseg warnings and auto-format [ci skip]	2020-06-16 17:13:35 +02:00
Ines Montani	a9e5b840ee	Fix typos and auto-format [ci skip]	2020-06-16 16:38:45 +02:00
Ines Montani	e9d3e177f0	Merge branch 'master' into v2.3.x	2020-06-16 16:31:38 +02:00
Ines Montani	bb54f54369	Fix model accuracy table [ci skip]	2020-06-16 16:10:12 +02:00
Adriane Boyd	d5110ffbf2	Documentation updates for v2.3.0 (#5593) * Update website models for v2.3.0 * Add docs for Chinese word segmentation * Tighten up Chinese docs section * Merge branch 'master' into docs/v2.3.0 [ci skip] * Merge branch 'master' into docs/v2.3.0 [ci skip] * Auto-format and update version * Update matcher.md * Update languages and sorting * Typo in landing page * Infobox about token_match behavior * Add meta and basic docs for Japanese * POS -> TAG in models table * Add info about lookups for normalization * Updates to API docs for v2.3 * Update adding norm exceptions for adding languages * Add --omit-extra-lookups to CLI API docs * Add initial draft of "What's New in v2.3" * Add new in v2.3 tags to Chinese and Japanese sections * Add tokenizer to migration section * Add new in v2.3 flags to init-model * Typo * More what's new in v2.3 Co-authored-by: Ines Montani <ines@ines.io>	2020-06-16 15:37:35 +02:00
Martino Mensio	de00f967ce	adding spacy-universal-sentence-encoder (#5534) * adding spacy-universal-sentence-encoder * update affiliation * updated code example	2020-06-08 20:26:30 +02:00
Sofie Van Landeghem	4d1ba6feb4	add tag variant for 2.3 (#5542)	2020-06-04 19:16:33 +02:00
svlandeg	5f0a91cf37	fix conv-depth parameter	2020-05-29 09:56:29 +02:00
Rajat	8b8efa1b42	update spacy universe with my project (#5497) * added contextualSpellCheck in spacy universe meta * removed extra formatting by code * updated with permanent links * run json linter used by spacy * filled SCA * updated the description	2020-05-25 11:30:23 +02:00
Sofie Van Landeghem	ae1c179f3a	Remove the nested quote	2020-05-23 17:58:19 +02:00
Jannis	aa53ce6996	Documentation Typo Fix (#5492) * Fix typo Change 'realize' to 'realise' * Add contributer agreement	2020-05-22 19:50:26 +02:00
Matthew Honnibal	f6078d866a	Merge pull request #5121 from adrianeboyd/bugfix/revert-token-match Revert token_match priority changes from #4374 and extend token match options	2020-05-22 14:42:51 +02:00
Ines Montani	65c7e82de2	Auto-format and remove 2.3 feature [ci skip]	2020-05-22 13:50:30 +02:00
Adriane Boyd	e4a1b5dab1	Rename to url_match Rename to `url_match` and update docs.	2020-05-22 12:41:03 +02:00
Adriane Boyd	730fa493a4	Merge remote-tracking branch 'upstream/master' into bugfix/revert-token-match	2020-05-22 12:18:00 +02:00
Ines Montani	ee027de032	Update universe and display of videos [ci skip]	2020-05-21 21:54:23 +02:00
Ines Montani	53da6bd672	Add course to landing [ci skip]	2020-05-21 20:45:33 +02:00
Kevin Lu	c7c4cd5fe1	Changed pyate code example in universe.json	2020-05-20 09:11:32 -07:00
Kevin Lu	0a5b140235	Update universe.json	2020-05-19 20:12:21 -07:00
Ines Montani	f333c2a011	Merge pull request #5386 from svlandeg/fix/nel-docs	2020-05-10 12:00:09 +02:00
Travis Hoppe	d4cc18b746	Added author information for NLPre (#5414) * Add author links for NLPre and update category * Add contributor statement	2020-05-08 11:28:54 +02:00
adrianeboyd	4a15b559ba	Clarify Token.pos as UPOS (#5419 )	2020-05-08 10:36:25 +02:00
adrianeboyd	a2345618f1	Fix Token API docs from #5375 (#5418 )	2020-05-08 10:25:02 +02:00
Adriane Boyd	565e0eef73	Add tokenizer option for token match with affixes To fix the slow tokenizer URL (#4374) and allow `token_match` to take priority over prefixes and suffixes by default, introduce a new tokenizer option for a token match pattern that's applied after prefixes and suffixes but before infixes.	2020-05-05 10:35:33 +02:00
Adriane Boyd	792c8af8cf	Merge remote-tracking branch 'upstream/master' into bugfix/revert-token-match	2020-05-05 09:25:57 +02:00
svlandeg	ebaed7dcfa	Few more updates to the EL documentation	2020-04-30 10:17:06 +02:00
adrianeboyd	bdff76dede	Various updates/additions to CLI scripts (#5362) * `debug-data`: determine coverage of provided vectors * `evaluate`: support `blank:lg` model to make it possible to just evaluate tokenization * `init-model`: add option to truncate vectors to N most frequent vectors from word2vec file * `train`: * if training on GPU, only run evaluation/timing on CPU in the first iteration * if training is aborted, exit with a non-0 exit status	2020-04-29 12:56:46 +02:00
Sofie Van Landeghem	cfdaf99b80	Fix passing of component configuration (#5374) * add kwargs to to_disk methods in docs - otherwise crashes on 'exclude' argument * add fix and test for Issue 5137	2020-04-29 12:56:17 +02:00
Ines Montani	63885c1836	Remove u string and auto-format [ci skip]	2020-04-29 12:54:57 +02:00

1678 Commits