Commit Graph

122 Commits

Author SHA1 Message Date
ines
417d45f5d0 Add lemmatizer data as variable on language data 2017-10-11 02:24:58 +02:00
    Don't create lookup lemmatizer within Language class and just pass in
    the data so it can be set on Token creation
ines
0c2343d73a Tidy up language data 2017-10-11 02:22:49 +02:00
Matthew Honnibal
b29e6bff46 Improve lemmatization rule for am|VBP 2017-09-04 15:18:10 +02:00
ines
a68dc891ea Port over changes from #1281 2017-08-21 23:19:18 +02:00
ines
1fe5e1a4d1 Add language example sentences (see #1107) 2017-08-19 12:22:29 +02:00
    da, de, en, es, fr, he, it, nb, pl, pt, sv
mollerhoj
23025d3b05 Clean up a couple of strange English stopwords 2017-07-03 15:41:59 +02:00
Matthew Honnibal
e28f90b672 Fix syntax iterators 2017-06-04 15:51:50 -05:00
Matthew Honnibal
3f5c85d8de Reorder setting of lex attrs, to avoid clobbering 2017-06-03 14:47:55 -05:00
Matthew Honnibal
de3954843e Populate norm exceptions with lower-case 2017-06-03 14:47:12 -05:00
ines
5bd311c77e Fix update of norm exceptions 2017-06-03 20:54:09 +02:00
ines
746653880c Add English norm exceptions to lex_attrs 2017-06-03 20:27:28 +02:00
ines
095eeeb12f Update English tokenizer exceptions and add norms 2017-06-03 20:27:16 +02:00
ines
33e332e67c Remove unused export 2017-05-28 00:57:59 +02:00
Matthew Honnibal
5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
ines
924e8506de Move Defaults subclass to module scope (necessary for pickling) 2017-05-20 19:02:27 +02:00
Matthew Honnibal
61fe55efba Move EnglishDefaults class out of English 2017-05-20 02:18:19 -05:00
ines
1a05078c79 Add language-specific syntax iterators to en and de 2017-05-17 12:04:03 +02:00
ines
2f870123bf Fix formatting 2017-05-12 15:37:20 +02:00
ines
12c3d5fbba Fix formatting 2017-05-09 01:15:28 +02:00
ines
88adeee548 Add English lex_attrs overrides 2017-05-09 01:09:52 +02:00
ines
73b577cb01 Fix relative imports 2017-05-08 22:29:04 +02:00
ines
f46ffe3e89 Move language data to /lang module 2017-05-08 20:00:40 +02:00