spaCy/spacy
ines 1da29a7146 Use new Lemmatizer data and remove file import
Since there's currently only an English lemmatizer, the global
Lemmatizer imports from spacy.en. This is unideal and still needs to be
fixed.
2017-03-12 13:58:22 +01:00
bn Merge pull request #885 from PySUST/master 2017-03-12 13:20:59 +01:00
de Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
en Add Python-formatted lemmatizer data and rules 2017-03-12 13:58:22 +01:00
es Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
fi Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
fr Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
hu Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
it Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
language_data Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
munge * Fix Python3 problem in align_raw 2015-07-28 16:06:53 +02:00
nl Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
pt Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
serialize Fix Issue #459 -- failed to deserialize empty doc. 2016-10-23 16:31:05 +02:00
sv Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
syntax Add header for beam parser 2017-03-11 12:45:12 -06:00
tests Use new Lemmatizer data and remove file import 2017-03-12 13:58:22 +01:00
tokens Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
zh Import Jieba inside zh.make_doc 2016-11-02 23:49:19 +01:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py add basic Bengali support 2017-02-28 07:48:37 +06:00
about.py Fix title to accommodate sputnik 2017-01-17 00:51:09 +01:00
attrs.pxd Whitespace 2016-12-18 16:51:40 +01:00
attrs.pyx Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
cfile.pxd Add hacky support for StringCFile, to make pickling easier. 2017-03-07 20:24:37 +01:00
cfile.pyx Fixes to hacky vocab pickling 2017-03-07 20:58:55 +01:00
deprecated.py Finish refactoring data loading 2016-09-24 20:26:17 +02:00
download.py Fix missing import 2017-01-19 22:03:52 +11:00
gold.pxd Fix gold.pyx for 1.0 2016-11-25 08:57:59 -06:00
gold.pyx Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
language.py Pass path argument to end_training 2017-03-09 18:42:40 -06:00
lemmatizer.py Use new Lemmatizer data and remove file import 2017-03-12 13:58:22 +01:00
lexeme.pxd Remove stray .tensor attribute from Lexeme 2016-10-18 01:16:32 +02:00
lexeme.pyx Fix doc strings 2016-11-01 12:25:36 +01:00
matcher.pyx Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766 2017-02-24 14:27:02 +01:00
morphology.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
morphology.pyx Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
multi_words.py * Fix Issue #50: Python 3 compatibility of v0.80 2015-04-13 05:59:43 +02:00
orth.pxd remove text-unidecode dependency 2016-02-24 08:01:59 +01:00
orth.pyx Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
pipeline.pxd Add classes for beam parser and beam NER 2017-03-11 12:45:37 -06:00
pipeline.pyx Add classes for beam parser and beam NER 2017-03-11 12:45:37 -06:00
scorer.py Allow dep to be None in scorer, for missing labels. 2016-11-25 09:02:49 -06:00
strings.pxd Update strings.pxd 2016-10-24 14:00:35 +02:00
strings.pyx Add support for pickling StringStore. 2017-03-07 17:15:18 +01:00
structs.pxd Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. 2016-09-21 14:54:55 +02:00
symbols.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
symbols.pyx Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
tagger.pxd Add cfg field to Tagger 2016-10-17 01:03:41 +02:00
tagger.pyx Add itn argument to tagger.update 2017-03-11 11:12:21 -06:00
tokenizer.pxd Revert "Revert "Merge remote-tracking branch 'origin/master'"" 2017-01-09 13:28:13 +01:00
tokenizer.pyx Fix handling of trailing whitespace 2017-03-08 15:01:40 +01:00
train.py Improve output on trainer 2017-03-11 11:12:48 -06:00
typedefs.pxd Revert "Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good." 2016-09-30 20:20:22 +02:00
typedefs.pyx * Move POS tag definitions to parts_of_speech.pxd 2015-01-25 16:31:07 +11:00
util.py Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
vocab.pxd Revert "Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good." 2016-09-30 20:20:22 +02:00
vocab.pyx Squelch compiler warnings 2017-03-11 12:44:43 -06:00