spaCy/spacy
Viet Trung Tran ea2af94cd9 Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155)
* support for Vietnamese

* Contributor Agreement for adding Vietnamese support on spaCy
2018-03-29 12:19:51 +02:00
cli Merge pull request #2158 from explosion/feature/fix-multiple-vectors (resolves #1660) 2018-03-28 23:08:24 +02:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy Don't use deprecated Doc.merge call in displaCy 2018-01-27 11:25:05 +01:00
lang Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155) 2018-03-29 12:19:51 +02:00
syntax Merge pull request #2158 from explosion/feature/fix-multiple-vectors (resolves #1660) 2018-03-28 23:08:24 +02:00
tests Merge pull request #2159 from explosion/feature/fix-merged-entity-iob (resolves #1554, resolves #1752) 2018-03-28 23:10:00 +02:00
tokens Fix ent_iob tags in doc.merge to avoid inconsistent sequences 2018-03-28 18:39:03 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Remove dummy variable from function calls 2018-01-05 09:37:05 +01:00
__main__.py Don't pass CLI command name as dummy argument 2018-01-04 21:33:47 +01:00
_ml.py Dont assume pretrained_vectors cfg set in build_tagger 2018-03-28 20:12:45 +02:00
about.py Set version to v2.0.10 2018-03-24 18:09:03 +01:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx code for is_currency 2018-02-11 18:51:32 +01:00
compat.py Remove ftfy dependency and update docs 2018-03-28 12:09:42 +02:00
glossary.py Fix typo in glossary (resolves #1964) 2018-02-10 11:58:41 +01:00
gold.pxd Add support for sent_start to GoldParse 2017-08-25 20:03:14 -05:00
gold.pyx Add offsets_from_biluo_tags helper and tests (see #1626) 2017-11-26 16:38:01 +01:00
language.py Avoid forcing a name on empty vectors, and remove print statement 2018-03-28 21:08:58 +02:00
lemmatizer.py If no rules are set, lemmatize by lookup 2017-12-06 12:12:11 +01:00
lexeme.pxd WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
lexeme.pyx added new lexical feat to lexeme 2018-02-11 18:51:48 +01:00
matcher.pyx Revert matcher fixes from GregDubbin 2018-02-18 10:59:28 +01:00
morphology.pxd Remove cpdef enum, to avoid too much code generation 2017-10-20 13:59:57 +02:00
morphology.pyx Fix non-clobbering lemmatization 2017-11-06 12:36:05 +01:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
pipeline.pxd Fix names of pipeline components 2017-10-26 12:38:23 +02:00
pipeline.pyx Remove print statement 2018-03-28 17:48:37 +02:00
scorer.py Tidy up rest 2017-10-27 21:07:59 +02:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx Use safer method to get string without hit 2017-11-14 22:58:46 +03:00
structs.pxd Make TokenC.sent_tart an int, to allow ternary value 2017-10-08 19:58:54 +02:00
symbols.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
symbols.pyx Add missing symbol for LANG attr. Fixes inconsistent numeric ID 2018-02-17 17:37:02 +01:00
tokenizer.pxd Disable tokenizer cache for special-cases. Fixes #1250 2017-10-24 16:08:05 +02:00
tokenizer.pyx Merge pull request #1611 from fsonntag/master 2017-11-29 23:11:23 +01:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Add contributor agreement for emulbreh 2018-02-13 13:40:33 +01:00
vectors.pyx Fix loading of multiple pre-trained vectors 2018-03-28 16:02:59 +02:00
vocab.pxd Add Vocab.cfg attr, to hold stuff like oov probs 2017-10-30 16:08:50 +01:00
vocab.pyx Fix loading of multiple pre-trained vectors 2018-03-28 16:02:59 +02:00