spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-08-04 20:30:24 +03:00

Author	SHA1	Message	Date
Swier	f377c9c952	Rename stop_words.py to word_sets.py	2017-07-05 14:06:28 +02:00
Swier	5357874bf7	add Dutch numbers and ordinals	2017-07-05 14:03:30 +02:00
mollerhoj	85144835da	Add Tag_map for Danish	2017-07-03 15:52:55 +02:00
mollerhoj	64c732918a	Add Morph_rules. (TODO: Not working?)	2017-07-03 15:52:55 +02:00
mollerhoj	3b2cb107a3	Add like_num functionality to Danish	2017-07-03 15:49:51 +02:00
mollerhoj	e8f40ceed8	Add short names of months to tokenizer_exceptions	2017-07-03 15:49:51 +02:00
mollerhoj	e840077601	Add some basic tests for Danish	2017-07-03 15:49:51 +02:00
mollerhoj	23025d3b05	Clean up a couple of strange English stopwords	2017-07-03 15:41:59 +02:00
mollerhoj	dc5be7d2f3	Cleanup list of Danish stopwords	2017-07-03 15:40:58 +02:00
Raphaël Bournhonesque	8592f3de47	Fix fuzzy unit tests	2017-07-01 15:03:32 +02:00
Raphaël Bournhonesque	f4748834d9	Use spacy hash_string function instead of md5	2017-07-01 13:17:26 +02:00
Raphaël Bournhonesque	c3d722d66f	Add a disclaimer about classes copied from the Jinja2 project	2017-07-01 13:09:56 +02:00
Ines Montani	84eb9d6bd3	Merge pull request #1167 from callumkift/fix/docs-ner-training Fixed error training NER documentation and example	2017-07-01 11:46:31 +02:00
Ines Montani	c91642efd5	Port over changes from #1168	2017-07-01 11:43:54 +02:00
Ines Montani	0c7f5af5ee	Merge pull request #1168 from gispk47/master Update zh language error	2017-07-01 11:43:12 +02:00
gispk47	669bd14213	Update __init__.py remove the empty string return from jieba.cut,this will cause the list of tokens cant be pushed assert error	2017-07-01 13:12:00 +08:00
Callum Kift	dfaeee1f37	fixed bug in training ner documentation and example	2017-06-30 09:56:33 +02:00
Paul O'Leary McCann	c336193392	Parametrize and extend Japanese tokenizer tests	2017-06-29 00:09:40 +09:00
Paul O'Leary McCann	30a34ebb6e	Add importorskip for janome	2017-06-29 00:09:20 +09:00
Alexis	1b3a5d87ba	French NUM_WORDS and ORDINAL_WORDS	2017-06-28 14:11:20 +02:00
Jim O'Regan	70f4d26c10	bounds checks	2017-06-28 10:59:46 +01:00
Jim O'Regan	1ba38b2036	some helpers; the Irish part of UD only has 2500 sentences so this will need source of morphology	2017-06-28 00:42:00 +01:00
Jim O'Regan	559e03605a	b'	2017-06-27 22:42:16 +01:00
Paul O'Leary McCann	e56fea14eb	Add basic Japanese tokenizer test	2017-06-28 01:24:25 +09:00
Paul O'Leary McCann	84041a2bb5	Make create_tokenizer work with Japanese	2017-06-28 01:18:05 +09:00
Ines Montani	f69ff15089	Update CONTRIBUTORS.md	2017-06-27 14:49:02 +02:00
Ines Montani	e265e34e18	Merge pull request #1153 from jimregan/polish add tokeniser exceptions for Polish	2017-06-27 14:48:00 +02:00
Jim Regan	d81ceb0cd5	Merge branch 'develop' into polish	2017-06-26 22:42:27 +01:00
Jim O'Regan	2f84c73585	a start	2017-06-26 22:40:04 +01:00
Jim O'Regan	28d7f0a672	reference	2017-06-26 22:38:28 +01:00
Jim O'Regan	e12defdd9c	missed a couple	2017-06-26 22:24:14 +01:00
Jim O'Regan	c1e4e0f3bf	just now discovered that you can do multiwords	2017-06-26 22:19:39 +01:00
Jim O'Regan	5e5f94c1c0	fix dup	2017-06-26 21:57:00 +01:00
Jim O'Regan	a8dff9133e	add POS	2017-06-26 21:53:41 +01:00
Jim O'Regan	3c4d83aa6e	CLA	2017-06-26 21:32:48 +01:00
Jim O'Regan	e9213f54de	missed one	2017-06-26 21:29:21 +01:00
Jim O'Regan	1eb7cc3017	attempt a port from #1147	2017-06-26 21:24:55 +01:00
Ines Montani	d6e08f2bf6	Merge pull request #1142 from garfieldnate/patch-1 fix confusing typo	2017-06-26 10:41:47 +02:00
Ines Montani	01c7c09c7f	Merge pull request #1146 from jarle/doc-patch Fix small typo in the new spaCy 101 guide	2017-06-26 10:41:18 +02:00
Jarle Mathiesen	f20533ec0c	fix small typo	2017-06-24 12:31:33 +02:00
Nathan Glenn	81166c3d56	fix confusing typo This document describes the `Vocab` class, not the `Span` class.	2017-06-21 19:22:30 +02:00
Matthew Honnibal	91e52543ef	Merge pull request #1118 from Gregory-Howard/patch-2 Update _tokenizer_exceptions_list (adding cities)	2017-06-20 11:16:07 +02:00
Matthew Honnibal	8ea785e01a	Merge pull request #1119 from oroszgy/patch-3 Fixed conllu converter	2017-06-20 11:14:41 +02:00
Ines Montani	9335736c20	Merge pull request #1127 from bartbroere/master Fixed a minor typo in the documentation	2017-06-13 13:15:20 +02:00
Ines Montani	f64e3efc76	Merge pull request #1128 from thinline72/patch-1 Changed the capital of Lithuania to Vilnius	2017-06-13 13:14:43 +02:00
Savva Kolbachev	800a8faff4	Changed the capital of Lithuania to Vilnius Hi, There is a typo about the capital of Lithuania. Vilnius is the capital of Lithuania https://en.wikipedia.org/wiki/Vilnius Ljubljana is the capital of Slovenia https://en.wikipedia.org/wiki/Ljubljana	2017-06-12 23:27:00 +03:00
Bart Broere	e3be243e06	Merge pull request #1 from explosion/master Update	2017-06-12 22:06:59 +02:00
Ines Montani	6eae9f943a	Merge pull request #1125 from Tpt/french_noun_chunks Adds function to extract french noun chunks	2017-06-12 21:25:33 +02:00
Ines Montani	57f64b9e1c	Merge pull request #1124 from v3t3a/patch-3 docs - Fix url error for Displacy Ent visualizer	2017-06-12 21:20:32 +02:00
Ines Montani	b2a28028cf	Merge pull request #1115 from v3t3a/patch-2 docs - Add read() method when opening file (Lightning tour)	2017-06-12 21:19:25 +02:00

8727 Commits