spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-23 20:46:44 +03:00

Author	SHA1	Message	Date
Orion Montoya	e04e11070f	Contributor agreement for Orion Montoya @mdcclv	2017-10-05 17:45:45 -04:00
Ines Montani	e77d8886f7	Update CONTRIBUTORS.md	2017-10-05 22:22:04 +02:00
Matthew Honnibal	dea81f113d	Merge pull request #1389 from mdcclv/lemmatizer_obey_exceptions Lemmatizer obey exceptions	2017-10-05 22:11:21 +02:00
Orion Montoya	b0d271809d	Unit test for lemmatizer exceptions -- copied from regression test for #1387	2017-10-05 10:49:28 -04:00
Orion Montoya	ffb50d21a0	Lemmatizer honors exceptions: Fix #1387	2017-10-05 10:49:02 -04:00
Orion Montoya	e81a608173	Regression test for lemmatizer exceptions -- demonstrate issue #1387	2017-10-05 10:47:48 -04:00
Ines Montani	678651ca98	Merge pull request #1386 from kokes/patch-1 Fixing links to SyntaxNet	2017-10-04 13:35:01 +02:00
Ondrej Kokes	a9362f1c73	Fixing links to SyntaxNet	2017-10-04 12:55:07 +02:00
Matthew Honnibal	eb72eae258	Merge pull request #1364 from Destygo/master Fixed NER model loading bug	2017-09-29 12:29:43 +02:00
Ines Montani	58bfe30a12	Merge pull request #1362 from IamJeffG/docs/custom-tokenizer Document Tokenizer(token_match) and clarify tokenizer_pseudo_code	2017-09-26 15:51:15 +02:00
Vincent Genty	259ed027af	Fixed NER model loading bug	2017-09-26 15:46:04 +02:00
Ines Montani	361211fe26	Merge pull request #1342 from wannaphongcom/master Add Thai language	2017-09-26 15:40:55 +02:00
Jeffrey Gerard	b6ebedd09c	Document Tokenizer(token_match) and clarify tokenizer_pseudo_code Closes #835 In the `tokenizer_pseudo_code` I put the `special_cases` kwarg before `find_prefix` because this now matches the order the args are used in the pseudocode, and it also matches spacy's actual code.	2017-09-25 13:13:25 -07:00
Matthew Honnibal	2f8d535f65	Merge pull request #1351 from hscspring/patch-4 Update punctuation.py	2017-09-24 12:16:39 +02:00
Matthew Honnibal	9177313063	Merge pull request #1352 from hscspring/patch-5 Update customizing-tokenizer.jade	2017-09-22 16:11:49 +02:00
Matthew Honnibal	1dbc2285b8	Merge pull request #1350 from hscspring/patch-3 Update word-vectors-similarities.jade	2017-09-22 16:11:05 +02:00
Yam	54855f0eee	Update customizing-tokenizer.jade	2017-09-22 12:15:48 +08:00
Yam	6f450306c3	Update customizing-tokenizer.jade update some codes: - `me` -> `-PRON` - `TAG` -> `POS` - `create_tokenizer` function	2017-09-22 10:53:22 +08:00
Yam	923c4c2fb2	Update punctuation.py add `……`	2017-09-22 09:50:46 +08:00
Yam	425c09488d	Update word-vectors-similarities.jade add ``` import spacy nlp = spacy.load('en') ```	2017-09-22 08:56:34 +08:00
Wannaphong Phatthiyaphaibun	1abf472068	add th test	2017-09-21 12:56:58 +07:00
Matthew Honnibal	ea2732469b	Merge pull request #1340 from hscspring/patch-1 Update punctuation.py	2017-09-20 23:57:00 +02:00
Wannaphong Phatthiyaphaibun	39bb5690f0	update th	2017-09-21 00:36:02 +07:00
Wannaphong Phatthiyaphaibun	44291f6697	add thai	2017-09-20 23:26:34 +07:00
Yam	978b24ccd4	Update punctuation.py In Chinese, `~` and `——` is hyphens, `·` is intermittent symbol	2017-09-20 23:02:22 +08:00
Matthew Honnibal	aa728b33ca	Merge pull request #1333 from galaxyh/master Add Chinese punctuation	2017-09-19 15:09:30 +02:00
Yu-chun Huang	188b439b25	Add Chinese punctuation Add Chinese punctuation.	2017-09-19 16:58:42 +08:00
Yu-chun Huang	1f1f35dcd0	Add Chinese punctuation Add Chinese punctuation.	2017-09-19 16:57:24 +08:00
Ines Montani	4bee26188d	Merge pull request #1323 from galaxyh/master Set the "cut_all" parameter in jieba.cut() to False, or jieba will return ALL POSSIBLE word segmentations.	2017-09-14 15:23:41 +02:00
Yu-chun Huang	7692b8c071	Update __init__.py Set the "cut_all" parameter to False, or jieba will return ALL POSSIBLE word segmentations.	2017-09-12 16:23:47 +08:00
Matthew Honnibal	ddaff6ca56	Merge pull request #1287 from IamJeffG/feature/1226-more-complete-noun-chunks Capture more noun chunks	2017-09-08 07:59:10 +02:00
Matthew Honnibal	45029a550e	Fix customized-tokenizer tests	2017-09-04 20:13:13 +02:00
Matthew Honnibal	34c585396a	Merge pull request #1294 from Vimos/master Fix issue #1292 and add test case for the Assertion Error	2017-09-04 19:20:40 +02:00
Matthew Honnibal	c68f188eb0	Fix error on test	2017-09-04 18:59:36 +02:00
Matthew Honnibal	33313c01ad	Merge pull request #1298 from ericzhao28/master Lowest common ancestor matrix for spans and docs	2017-09-04 18:57:54 +02:00
Matthew Honnibal	e8a26ebfab	Add efficiency note to new get_lca_matrix() method	2017-09-04 15:43:52 +02:00
Eric Zhao	d61c117081	Lowest common ancestor matrix for spans and docs Added functionality for spans and docs to get lowest common ancestor matrix by simply calling: doc.get_lca_matrix() or doc[:3].get_lca_matrix(). Corresponding unit tests were also added under spacy/tests/doc and spacy/tests/spans. Designed to address: https://github.com/explosion/spaCy/issues/969.	2017-09-03 12:22:19 -07:00
Matthew Honnibal	9bffcaa73d	Update test to make it slightly more direct The `nlp` container should be unnecessary here. If so, we can test the tokenizer class just a little more directly.	2017-09-01 21:16:56 +02:00
Vimos Tan	a6d9fb5bb6	fix issue #1292	2017-08-30 14:49:14 +08:00
Jeffrey Gerard	884ba168a8	Capture more noun chunks	2017-08-23 21:18:53 -07:00
ines	dcff10abe9	Add regression test for #1281	2017-08-21 16:11:47 +02:00
ines	edc596d9a7	Add missing tokenizer exceptions (resolves #1281)	2017-08-21 16:11:36 +02:00
ines	c5c3f4c7d9	Use more generous .env ignore rule	2017-08-21 16:08:40 +02:00
Ines Montani	dca026124f	Merge pull request #1262 from kevinmarsh/patch-1 Fix broken tutorial link on website	2017-08-16 09:58:07 +02:00
Kevin Marsh	e3738aba0d	Fix broken tutorial link on website	2017-08-15 21:50:09 +01:00
Ines Montani	a9465271a7	Merge pull request #1245 from delirious-lettuce/fix_typos Fix typos	2017-08-07 23:11:20 +02:00
Delirious Lettuce	d3b03f0544	Fix typos: * `auxillary` -> `auxiliary` * `consistute` -> `constitute` * `earlist` -> `earliest` * `prefered` -> `preferred` * `direcory` -> `directory` * `reuseable` -> `reusable` * `idiosyncracies` -> `idiosyncrasies` * `enviroment` -> `environment` * `unecessary` -> `unnecessary` * `yesteday` -> `yesterday` * `resouces` -> `resources`	2017-08-06 21:31:39 -06:00
Matthew Honnibal	b7b121103f	Merge pull request #1244 from gideonite/patch-1 improve pipe, tee, izip explanation	2017-08-06 14:34:07 +02:00
Gideon Dresdner	7e98a3613c	improve pipe, tee, izip explanation Use an example from an old issue https://github.com/explosion/spaCy/issues/172#issuecomment-183963403.	2017-08-06 13:21:45 +02:00
ines	864cefd3b2	Update README.rst	2017-07-22 18:29:55 +02:00
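
Commit d61c117081 above adds `doc.get_lca_matrix()` and `doc[:3].get_lca_matrix()`, which return a lowest common ancestor matrix for a dependency parse. As a rough illustration of what such a matrix contains, here is a minimal pure-Python sketch. It is not spaCy's implementation: the function name `lca_matrix`, the `heads` list encoding (each token stores the index of its head, and a root points to itself), and the use of -1 for "no common ancestor" are assumptions made for this example.

```python
def lca_matrix(heads):
    """Return an n x n matrix of lowest common ancestor indices.

    Hypothetical helper for illustration only (not spaCy's code).
    heads[i] is the index of token i's head; a root has heads[i] == i.
    Entry [i][j] is the index of the lowest common ancestor of tokens
    i and j, or -1 if they belong to different trees.
    Assumes `heads` describes valid trees (no cycles).
    """
    n = len(heads)

    def ancestors(i):
        # Path from token i up to its root, including i itself.
        path = [i]
        while heads[path[-1]] != path[-1]:
            path.append(heads[path[-1]])
        return path

    paths = [ancestors(i) for i in range(n)]
    matrix = [[-1] * n for _ in range(n)]
    for i in range(n):
        on_path = set(paths[i])
        for j in range(n):
            # Walk upward from j; the first node that is also an
            # ancestor of i (or i itself) is the lowest common ancestor.
            for node in paths[j]:
                if node in on_path:
                    matrix[i][j] = node
                    break
    return matrix


# "the dog barks": "the" attaches to "dog", "dog" attaches to "barks" (root).
example = lca_matrix([1, 2, 2])
```

This sketch favours readability over speed: it recomputes ancestor paths for every pair, which is fine for short sentences but scales poorly for long documents, in line with the efficiency note added in commit e8a26ebfab.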

5213 Commits