spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-04-01 07:44:12 +03:00

Author	SHA1	Message	Date
ines	8ce6f96180	Don't make copies of language data components	2017-10-11 15:34:55 +02:00
ines	417d45f5d0	Add lemmatizer data as variable on language data. Don't create lookup lemmatizer within Language class and just pass in the data so it can be set on Token creation	2017-10-11 02:24:58 +02:00
ines	0c2343d73a	Tidy up language data	2017-10-11 02:22:49 +02:00
ines	ece30c28a8	Don't split hyphenated words in German. This way, the tokenizer matches the tokenization in German treebanks	2017-09-16 20:40:15 +02:00
ines	1fe5e1a4d1	Add language example sentences (see #1107): da, de, en, es, fr, he, it, nb, pl, pt, sv	2017-08-19 12:22:29 +02:00
Matthew Honnibal	e28f90b672	Fix syntax iterators	2017-06-04 15:51:50 -05:00
ines	fa7e576c57	Change order of exception dicts	2017-06-03 21:52:06 +02:00
ines	e47eef5e03	Update German tokenizer exceptions and tests	2017-06-03 21:07:44 +02:00
ines	0d6fa8b241	Add German norm exceptions	2017-06-03 20:54:18 +02:00
ines	924e8506de	Move Defaults subclass to module scope (necessary for pickling)	2017-05-20 19:02:27 +02:00
ines	1a05078c79	Add language-specific syntax iterators to en and de	2017-05-17 12:04:03 +02:00
ines	73b577cb01	Fix relative imports	2017-05-08 22:29:04 +02:00
ines	f46ffe3e89	Move language data to /lang module	2017-05-08 20:00:40 +02:00