spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-31 10:29:46 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	a405660068	Add commit to tagger example	2017-07-22 13:32:48 +02:00
Matthew Honnibal	3fef5f642b	Rename tagger training example	2017-07-22 13:29:15 +02:00
Matthew Honnibal	8bb443be4f	Add standalone tagger training example	2017-07-22 13:28:51 +02:00
Matthew Honnibal	d6a5c2c85a	Add test for NER	2017-07-22 01:48:58 +02:00
Matthew Honnibal	28244df4da	Add test for beam parsing	2017-07-22 01:48:35 +02:00
Matthew Honnibal	c86445bdfd	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-07-22 01:14:28 +02:00
Matthew Honnibal	b3a749610e	Fix name of TextCategorizer	2017-07-22 01:14:07 +02:00
Matthew Honnibal	2424493970	Remove unnecessary import of Mock	2017-07-22 01:13:54 +02:00
Matthew Honnibal	baa3d81c35	Add text categorizer to Language	2017-07-22 01:13:36 +02:00
Matthew Honnibal	a6a2159969	Add slot for text categories to Doc	2017-07-22 00:34:15 +02:00
Matthew Honnibal	374ab3ecfb	Increment alpha version	2017-07-22 00:32:49 +02:00
Ines Montani	7c66691790	Merge pull request #1197 from jsparedes/patch-1 Fix url broken	2017-07-21 14:05:26 +02:00
Matthew Honnibal	289f23df51	Test beam parsing	2017-07-20 15:03:10 +02:00
Matthew Honnibal	3da1063b36	Add beam decoding to parser, to allow NER uncertainties	2017-07-20 15:02:55 +02:00
Matthew Honnibal	0ca5832427	Improve negative example handling in NER oracle	2017-07-20 00:18:49 +02:00
Matthew Honnibal	a231b56d40	Add text-classification hook to pipeline	2017-07-20 00:18:15 +02:00
Matthew Honnibal	7ea50182a5	Add support for text-classification labels to GoldParse	2017-07-20 00:17:47 +02:00
Matthew Honnibal	727481377e	Add text-classifer thinc models	2017-07-20 00:17:17 +02:00
Matthew Honnibal	f014138c11	Fix parser tests	2017-07-20 00:16:52 +02:00
Jorge Paredes	fadacd0d47	Fix url broken The related url to custom named entities was broken	2017-07-16 10:06:32 -05:00
Ines Montani	2d22b63e09	Merge pull request #1186 from lgenerknol/master .../cli/#foo is 404	2017-07-13 17:33:55 +02:00
lgenerknol	2b219caf0d	.../cli/#foo is 404 https://spacy.io/docs/usage/cli/#package is a 404. Changed to https://spacy.io/docs/usage/cli#package Definitely a larger fix possible to deal with trailing slashes	2017-07-12 13:12:24 -04:00
Ines Montani	d79fa8743a	Merge pull request #1185 from lgenerknol/master Missing markup char	2017-07-12 17:27:42 +02:00
lgenerknol	6cf2690943	Missing markup char Frontend displayed: ``` If start_idx and do not mark[...] ``` Note the missing "end_idx" after 'and'.	2017-07-12 11:06:16 -04:00
Ines Montani	9eca6503c1	Merge pull request #1157 from polm/master Add basic Japanese Tokenizer Test	2017-07-10 13:07:11 +02:00
Paul O'Leary McCann	bc87b815cc	Add comment clarifying what LANGUAGES does	2017-07-09 16:28:55 +09:00
Paul O'Leary McCann	04e6a65188	Remove Japanese from LANGUAGES LANGUAGES is a list of languages whose tokenizers get run through a variety of generic tests. Since the generic tests don't check the JA fixture, it blows up when it can't find janome. -POLM	2017-07-09 16:23:26 +09:00
Ines Montani	2b9411bb54	Merge pull request #1181 from val314159/patch-1 make this work in python2.7	2017-07-08 00:15:47 +02:00
val314159	19d4706f69	make this work in python2.7	2017-07-07 13:18:17 -07:00
Swier	29720150f9	fix import of stop words in language data	2017-07-05 14:08:04 +02:00
Swier	f377c9c952	Rename stop_words.py to word_sets.py	2017-07-05 14:06:28 +02:00
Swier	5357874bf7	add Dutch numbers and ordinals	2017-07-05 14:03:30 +02:00
mollerhoj	85144835da	Add Tag_map for Danish	2017-07-03 15:52:55 +02:00
mollerhoj	64c732918a	Add Morph_rules. (TODO: Not working?)	2017-07-03 15:52:55 +02:00
mollerhoj	3b2cb107a3	Add like_num functionality to Danish	2017-07-03 15:49:51 +02:00
mollerhoj	e8f40ceed8	Add short names of months to tokenizer_exceptions	2017-07-03 15:49:51 +02:00
mollerhoj	e840077601	Add some basic tests for Danish	2017-07-03 15:49:51 +02:00
mollerhoj	23025d3b05	Clean up a couple of strange English stopwords	2017-07-03 15:41:59 +02:00
mollerhoj	dc5be7d2f3	Cleanup list of Danish stopwords	2017-07-03 15:40:58 +02:00
Raphaël Bournhonesque	8592f3de47	Fix fuzzy unit tests	2017-07-01 15:03:32 +02:00
Raphaël Bournhonesque	f4748834d9	Use spacy hash_string function instead of md5	2017-07-01 13:17:26 +02:00
Raphaël Bournhonesque	c3d722d66f	Add a disclaimer about classes copied from the Jinja2 project	2017-07-01 13:09:56 +02:00
Ines Montani	84eb9d6bd3	Merge pull request #1167 from callumkift/fix/docs-ner-training Fixed error training NER documentation and example	2017-07-01 11:46:31 +02:00
Ines Montani	c91642efd5	Port over changes from #1168	2017-07-01 11:43:54 +02:00
Ines Montani	0c7f5af5ee	Merge pull request #1168 from gispk47/master Update zh language error	2017-07-01 11:43:12 +02:00
gispk47	669bd14213	Update __init__.py remove the empty string return from jieba.cut,this will cause the list of tokens cant be pushed assert error	2017-07-01 13:12:00 +08:00
Callum Kift	dfaeee1f37	fixed bug in training ner documentation and example	2017-06-30 09:56:33 +02:00
Paul O'Leary McCann	c336193392	Parametrize and extend Japanese tokenizer tests	2017-06-29 00:09:40 +09:00
Paul O'Leary McCann	30a34ebb6e	Add importorskip for janome	2017-06-29 00:09:20 +09:00
Alexis	1b3a5d87ba	French NUM_WORDS and ORDINAL_WORDS	2017-06-28 14:11:20 +02:00

7800 Commits