spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-11 12:18:04 +03:00

Author	SHA1	Message	Date
Kit	9bc524982e	Find lowercased forms of numeric words	2018-01-08 03:25:08 +01:00
Kevin Humphreys	7918fa4ef9	handle would've	2018-01-03 12:25:48 -08:00
Mathias Deschamps	c0691b2ab4	Add tokenizer exceptions for -ing verbs. Extend list of tokenizing exceptions introduced in `123810b`	2017-11-13 17:46:05 +01:00
Mathias Deschamps	288298ead9	Add norm exception for -ing verbs. Some -ing verbs are sometimes written with -in or -in'. Make the NORM form correct	2017-11-13 17:46:05 +01:00
ines	123810b6de	Add "lovin'" to tokenizer exceptions (see #1248)	2017-11-09 17:09:30 +01:00
ines	acb9bdb852	Fix PRON_LEMMA imports	2017-11-06 17:41:53 +01:00
ines	819e30a26e	Tidy up tokenizer exceptions	2017-11-01 23:02:45 +01:00
ines	9659391944	Update deprecated methods and add warnings	2017-11-01 16:49:42 +01:00
ines	7e424a1804	Don't copy exception dicts if not necessary and tidy up	2017-10-31 21:05:29 +01:00
Ines Montani	d3bf488e16	Merge pull request #1171 from mollerhoj/support-danish: Improve basic support for Danish	2017-10-24 20:29:57 +02:00
Matthew Honnibal	66766c1454	Restore SP tag to English tag_map, until models migrate	2017-10-24 17:05:00 +02:00
Ines Montani	facf77e541	Merge branch 'develop' into support-danish	2017-10-24 11:53:19 +02:00
Matthew Honnibal	49895fbef6	Rename 'SP' special tag to '_SP'. Renaming the tag with an underscore lets us add it to the tag map without worrying that we'll change the sequence of tags, which throws off the tag-to-ID mapping. For instance, if we inserted a 'SP' tag, the "VERB" tag is pushed to a different class ID, and the model is all messed up.	2017-10-20 14:01:12 +02:00
Matthew Honnibal	839de87ca9	Make lambda func a named function, for pickling	2017-10-17 18:21:20 +02:00
ines	38c756fd85	Port over changes from #1287	2017-10-14 13:16:21 +02:00
ines	8ce6f96180	Don't make copies of language data components	2017-10-11 15:34:55 +02:00
ines	417d45f5d0	Add lemmatizer data as variable on language data. Don't create lookup lemmatizer within Language class and just pass in the data so it can be set on Token creation	2017-10-11 02:24:58 +02:00
ines	0c2343d73a	Tidy up language data	2017-10-11 02:22:49 +02:00
Matthew Honnibal	b29e6bff46	Improve lemmatization rule for am|VBP	2017-09-04 15:18:10 +02:00
ines	a68dc891ea	Port over changes from #1281	2017-08-21 23:19:18 +02:00
ines	1fe5e1a4d1	Add language example sentences (see #1107): da, de, en, es, fr, he, it, nb, pl, pt, sv	2017-08-19 12:22:29 +02:00
mollerhoj	23025d3b05	Clean up a couple of strange English stopwords	2017-07-03 15:41:59 +02:00
Matthew Honnibal	e28f90b672	Fix syntax iterators	2017-06-04 15:51:50 -05:00
Matthew Honnibal	3f5c85d8de	Reorder setting of lex attrs, to avoid clobbering	2017-06-03 14:47:55 -05:00
Matthew Honnibal	de3954843e	Populate norm exceptions with lower-case	2017-06-03 14:47:12 -05:00
ines	5bd311c77e	Fix update of norm exceptions	2017-06-03 20:54:09 +02:00
ines	746653880c	Add English norm exceptions to lex_attrs	2017-06-03 20:27:28 +02:00
ines	095eeeb12f	Update English tokenizer exceptions and add norms	2017-06-03 20:27:16 +02:00
ines	33e332e67c	Remove unused export	2017-05-28 00:57:59 +02:00
Matthew Honnibal	5db89053aa	Merge docstrings	2017-05-21 13:46:23 -05:00
ines	924e8506de	Move Defaults subclass to module scope (necessary for pickling)	2017-05-20 19:02:27 +02:00
Matthew Honnibal	61fe55efba	Move EnglishDefaults class out of English	2017-05-20 02:18:19 -05:00
ines	1a05078c79	Add language-specific syntax iterators to en and de	2017-05-17 12:04:03 +02:00
ines	2f870123bf	Fix formatting	2017-05-12 15:37:20 +02:00
ines	12c3d5fbba	Fix formatting	2017-05-09 01:15:28 +02:00
ines	88adeee548	Add English lex_attrs overrides	2017-05-09 01:09:52 +02:00
ines	73b577cb01	Fix relative imports	2017-05-08 22:29:04 +02:00
ines	f46ffe3e89	Move language data to /lang module	2017-05-08 20:00:40 +02:00
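
Commits 839de87ca9 and 924e8506de above both move code to a named, module-level definition "for pickling". A minimal sketch of the underlying constraint, using only the Python standard library: pickle stores functions by importable name, so a lambda cannot be serialised while an equivalent named function can. The names `like_num_named`, `like_num_lambda`, and `can_pickle` are hypothetical and not taken from the spaCy source.

```python
import pickle


def like_num_named(text):
    """Hypothetical attribute getter defined at module scope."""
    return text.isdigit()


# Same behaviour, but the function object has no importable name:
# its qualified name is "<lambda>", which pickle cannot look up.
like_num_lambda = lambda text: text.isdigit()


def can_pickle(obj):
    """Return True if pickle can serialise obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


named_ok = can_pickle(like_num_named)
lambda_ok = can_pickle(like_num_lambda)
```

The same reasoning applies to classes: a class nested inside another class or function may not be reachable by name at load time, which is presumably why the Defaults subclass was moved to module scope.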

38 Commits
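
The reasoning in commit 49895fbef6 (renaming 'SP' to '_SP') can be illustrated with a small sketch. It assumes that class IDs are assigned by the position of each tag in the sorted list of tag names; that assumption and the helper `tag_ids` are illustrative and not confirmed by the log above. Under that assumption, 'SP' sorts before 'VERB' and shifts its ID, whereas '_SP' sorts after every tag that starts with an uppercase ASCII letter, because '_' has a higher code point than 'A' to 'Z'.

```python
def tag_ids(tags):
    """Assign IDs by sorted order (assumed scheme, for illustration only)."""
    return {tag: i for i, tag in enumerate(sorted(tags))}


# A deliberately tiny, hypothetical tag set.
base_tags = ["ADJ", "NOUN", "PUNCT", "VERB"]

before = tag_ids(base_tags)
with_sp = tag_ids(base_tags + ["SP"])
with_underscore_sp = tag_ids(base_tags + ["_SP"])

# IDs of the original tags that changed after each addition.
shifted_by_sp = [t for t in base_tags if with_sp[t] != before[t]]
shifted_by_underscore_sp = [
    t for t in base_tags if with_underscore_sp[t] != before[t]
]
```

Here `shifted_by_sp` is `["VERB"]`, while `shifted_by_underscore_sp` is empty, so a model trained against the original IDs would still line up after '_SP' is added.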