spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-25 05:26:44 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	4b2e5e59ed	Add flush_cache method to tokenizer, to fix #1061 The tokenizer caches output for common chunks, for efficiency. This cache must be invalidated when the tokenizer rules change, e.g. when a new special-case rule is introduced. That's what was causing #1061. When the cache is flushed, we free the intermediate token chunks. I think this is safe, but if we start getting segfaults, this patch is to blame. The resolution would be to simply not free those bits of memory. They'll be freed when the tokenizer exits anyway.	2017-07-22 15:06:50 +02:00
Ines Montani	96df9c7154	Update CONTRIBUTORS.md	2017-07-22 15:05:46 +02:00
ines	b22b18a019	Add notes on spacy.explain() to annotation docs	2017-07-22 15:02:15 +02:00
Ines Montani	1ddbeddca2	Fix typo	2017-07-22 15:00:58 +02:00
ines	e3f23f9d91	Use latest available version in examples	2017-07-22 14:57:51 +02:00
Matthew Honnibal	23a55b40ca	Default to English noun chunks iterator if no lang set	2017-07-22 14:15:25 +02:00
Matthew Honnibal	9750a0128c	Fix Span.noun_chunks. Closes #1207	2017-07-22 14:14:57 +02:00
Matthew Honnibal	d9b85675d7	Rename regression test	2017-07-22 14:14:35 +02:00
Matthew Honnibal	dfbc7e49de	Add test for Issue #1207	2017-07-22 14:14:01 +02:00
Matthew Honnibal	0ae3807d7d	Fix gaps in Lexeme API. Closes #1031	2017-07-22 13:53:48 +02:00
Matthew Honnibal	83e1b5f1e3	Merge branch 'master' of https://github.com/explosion/spaCy	2017-07-22 13:45:35 +02:00
Matthew Honnibal	45f6961ae0	Add __version__ symbol in __init__.py	2017-07-22 13:45:21 +02:00
Matthew Honnibal	8b9c4c5e1c	Add missing SP symbol to tag map, re #1052	2017-07-22 13:44:17 +02:00
Ines Montani	69396dcfd3	Update CONTRIBUTORS.md	2017-07-22 13:43:15 +02:00
Ines Montani	9af04ea11f	Merge pull request #1161 from AlexisEidelman/patch-1 French NUM_WORDS and ORDINAL_WORDS	2017-07-22 13:40:46 +02:00
Matthew Honnibal	8b581fdac5	Remove unused example	2017-07-22 13:36:54 +02:00
Matthew Honnibal	44dd247e73	Merge branch 'master' of https://github.com/explosion/spaCy	2017-07-22 13:35:30 +02:00
Matthew Honnibal	94267ec50f	Fix merge conflict in printer	2017-07-22 13:35:15 +02:00
Ines Montani	c7708dc736	Merge pull request #1177 from swierh/master Dutch NUM_WORDS and ORDINAL_WORDS	2017-07-22 13:35:08 +02:00
Matthew Honnibal	5916d46ba8	Avoid use of deepcopy in printer	2017-07-22 13:34:01 +02:00
Matthew Honnibal	a405660068	Add commit to tagger example	2017-07-22 13:32:48 +02:00
Matthew Honnibal	3fef5f642b	Rename tagger training example	2017-07-22 13:29:15 +02:00
Matthew Honnibal	8bb443be4f	Add standalone tagger training example	2017-07-22 13:28:51 +02:00
Matthew Honnibal	d6a5c2c85a	Add test for NER	2017-07-22 01:48:58 +02:00
Matthew Honnibal	28244df4da	Add test for beam parsing	2017-07-22 01:48:35 +02:00
Matthew Honnibal	c86445bdfd	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-07-22 01:14:28 +02:00
Matthew Honnibal	b3a749610e	Fix name of TextCategorizer	2017-07-22 01:14:07 +02:00
Matthew Honnibal	2424493970	Remove unnecessary import of Mock	2017-07-22 01:13:54 +02:00
Matthew Honnibal	baa3d81c35	Add text categorizer to Language	2017-07-22 01:13:36 +02:00
Matthew Honnibal	a6a2159969	Add slot for text categories to Doc	2017-07-22 00:34:15 +02:00
Matthew Honnibal	374ab3ecfb	Increment alpha version	2017-07-22 00:32:49 +02:00
Ines Montani	7c66691790	Merge pull request #1197 from jsparedes/patch-1 Fix url broken	2017-07-21 14:05:26 +02:00
Matthew Honnibal	289f23df51	Test beam parsing	2017-07-20 15:03:10 +02:00
Matthew Honnibal	3da1063b36	Add beam decoding to parser, to allow NER uncertainties	2017-07-20 15:02:55 +02:00
Matthew Honnibal	0ca5832427	Improve negative example handling in NER oracle	2017-07-20 00:18:49 +02:00
Matthew Honnibal	a231b56d40	Add text-classification hook to pipeline	2017-07-20 00:18:15 +02:00
Matthew Honnibal	7ea50182a5	Add support for text-classification labels to GoldParse	2017-07-20 00:17:47 +02:00
Matthew Honnibal	727481377e	Add text-classifier thinc models	2017-07-20 00:17:17 +02:00
Matthew Honnibal	f014138c11	Fix parser tests	2017-07-20 00:16:52 +02:00
Jorge Paredes	fadacd0d47	Fix url broken The related url to custom named entities was broken	2017-07-16 10:06:32 -05:00
Ines Montani	2d22b63e09	Merge pull request #1186 from lgenerknol/master .../cli/#foo is 404	2017-07-13 17:33:55 +02:00
lgenerknol	2b219caf0d	.../cli/#foo is 404 https://spacy.io/docs/usage/cli/#package is a 404. Changed to https://spacy.io/docs/usage/cli#package Definitely a larger fix possible to deal with trailing slashes	2017-07-12 13:12:24 -04:00
Ines Montani	d79fa8743a	Merge pull request #1185 from lgenerknol/master Missing markup char	2017-07-12 17:27:42 +02:00
lgenerknol	6cf2690943	Missing markup char Frontend displayed: ``` If start_idx and do not mark[...] ``` Note the missing "end_idx" after 'and'.	2017-07-12 11:06:16 -04:00
Ines Montani	9eca6503c1	Merge pull request #1157 from polm/master Add basic Japanese Tokenizer Test	2017-07-10 13:07:11 +02:00
Paul O'Leary McCann	bc87b815cc	Add comment clarifying what LANGUAGES does	2017-07-09 16:28:55 +09:00
Paul O'Leary McCann	04e6a65188	Remove Japanese from LANGUAGES LANGUAGES is a list of languages whose tokenizers get run through a variety of generic tests. Since the generic tests don't check the JA fixture, it blows up when it can't find janome. -POLM	2017-07-09 16:23:26 +09:00
Ines Montani	2b9411bb54	Merge pull request #1181 from val314159/patch-1 make this work in python2.7	2017-07-08 00:15:47 +02:00
val314159	19d4706f69	make this work in python2.7	2017-07-07 13:18:17 -07:00
Swier	29720150f9	fix import of stop words in language data	2017-07-05 14:08:04 +02:00

8020 Commits