spaCy/spacy
2016-10-24 14:22:51 +02:00
de Add LANG attribute to English and German 2016-10-18 18:52:48 +02:00
en Try to fix weird install glitch. 2016-10-23 19:46:28 +02:00
fi access model via sputnik 2015-12-07 06:01:28 +01:00
it access model via sputnik 2015-12-07 06:01:28 +01:00
munge * Fix Python3 problem in align_raw 2015-07-28 16:06:53 +02:00
serialize Fix Issue #459 -- failed to deserialize empty doc. 2016-10-23 16:31:05 +02:00
syntax Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness. 2016-10-23 17:45:44 +02:00
tests Test workaround for Issue #285: Streaming data memory growth 2016-10-24 13:48:06 +02:00
tokens Fix Issue #461: O tag was being clobbered by doc.ents.__set__ 2016-10-23 15:50:26 +02:00
zh * Work on Chinese support 2016-05-05 11:39:12 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Fix loading of GloVe vectors, to address Issue #541 2016-10-20 18:27:48 +02:00
about.py Increment version 2016-10-23 20:22:53 +02:00
attrs.pxd introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
attrs.pyx introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
cfile.pxd * Add cfile.pyx 2015-07-23 01:10:36 +02:00
cfile.pyx Handle pathlib.Path objects in CFile 2016-09-24 22:01:46 +02:00
deprecated.py Finish refactoring data loading 2016-09-24 20:26:17 +02:00
download.py Make installation print data path. 2016-10-23 19:46:44 +02:00
gold.pxd * Remove unused import 2015-07-25 18:11:16 +02:00
gold.pyx Fix json loading, for Python 3. 2016-10-20 21:23:26 +02:00
language.py Fix Issue #566 2016-10-23 20:19:01 +02:00
lemmatizer.py Fix json loading, for Python 3. 2016-10-20 21:23:26 +02:00
lexeme.pxd Remove stray .tensor attribute from Lexeme 2016-10-18 01:16:32 +02:00
lexeme.pyx Fix vector_norm when vector is assigned to Lexeme. 2016-10-23 14:23:56 +02:00
matcher.pyx Fix JSON encoding issue on load 2016-10-20 21:06:48 +02:00
morphology.pxd Revert "Changes to morphology.pyx for new StringStore scheme" 2016-09-30 20:20:02 +02:00
morphology.pyx Revert "Changes to morphology.pyx for new StringStore scheme" 2016-09-30 20:20:02 +02:00
multi_words.py * Fix Issue #50: Python 3 compatibility of v0.80 2015-04-13 05:59:43 +02:00
orth.pxd remove text-unidecode dependency 2016-02-24 08:01:59 +01:00
orth.pyx introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
parts_of_speech.pxd * Fix parts_of_speech now that symbols list has been reformed 2015-10-13 13:44:40 +11:00
parts_of_speech.pyx * Fix NAMES list in spacy/parts_of_speech.pyx 2015-10-13 14:18:45 +11:00
pipeline.pxd Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor. 2016-10-16 21:34:57 +02:00
pipeline.pyx Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness. 2016-10-23 17:45:44 +02:00
scorer.py Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
strings.pxd Update strings.pxd 2016-10-24 14:00:35 +02:00
strings.pyx Fix Python 3 basestring error 2016-10-24 14:22:51 +02:00
structs.pxd Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. 2016-09-21 14:54:55 +02:00
symbols.pxd German noun chunk iterator now doesn't return tokens more than once 2016-05-03 16:58:59 +02:00
symbols.pyx Make sure symbols are unicode strings 2016-09-30 20:02:19 +02:00
tagger.pxd Add cfg field to Tagger 2016-10-17 01:03:41 +02:00
tagger.pyx Fix JSON in tagger 2016-10-21 01:44:10 +02:00
tokenizer.pxd Finish refactoring data loading 2016-09-24 20:26:17 +02:00
tokenizer.pyx Fix JSON in tokenizer 2016-10-21 01:44:20 +02:00
train.py Fix spacy.train 2016-10-15 23:53:46 +02:00
typedefs.pxd Revert "Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good." 2016-09-30 20:20:22 +02:00
typedefs.pyx * Move POS tag definitions to parts_of_speech.pxd 2015-01-25 16:31:07 +11:00
util.py Return None in match_best_version if not path exists. 2016-10-15 14:47:29 +02:00
vocab.pxd Revert "Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good." 2016-09-30 20:20:22 +02:00
vocab.pyx Fix vector norm when loading lexemes. 2016-10-23 19:40:18 +02:00