spaCy/spacy
Andrew Ongko 81564cc4e8 Update Indonesian model (#2752)
* adding e-KTP in tokenizer exceptions list

* add exception token

* removing lines containing spaces, as they won't matter since we use the .split() method in the end; added new tokens in exception

* add tokenizer exceptions list

* combining base_norms with norm_exceptions

* adding norm_exception

* fix double key in lemmatizer

* remove unused import on punctuation.py

* reformat stop_words to reduce number of lines, improve readability

* updating tokenizer exception

* implement is_currency for lang/id

* adding orth_first_upper in tokenizer_exceptions

* update the norm_exception list

* remove bunch of abbreviations

* adding contributors file
2018-09-14 12:30:32 +02:00
cli Remove ')' for clarity (#2737) 2018-09-10 11:31:49 +02:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy fix issue #2452 - displacy arrow direction is always forward (#2506) (closes #2452) 2018-07-04 14:12:08 +02:00
lang Update Indonesian model (#2752) 2018-09-14 12:30:32 +02:00
syntax Fix loading of models when custom vectors are added 2018-04-10 22:19:20 +02:00
tests Introduces a bulk merge function, in order to solve issue #653 (#2696) 2018-09-10 16:41:42 +02:00
tokens Introduces a bulk merge function, in order to solve issue #653 (#2696) 2018-09-10 16:41:42 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Silent keyword in info function in init (#2459) 2018-06-18 12:24:21 +02:00
__main__.py Don't pass CLI command name as dummy argument 2018-01-04 21:33:47 +01:00
_ml.py 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
about.py Set about to v2.0.12 release 2018-07-21 15:09:42 +02:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx missing PrepCase attribute 2018-02-18 14:46:12 +00:00
compat.py Simplify is_config() and normalize_string_keys() (#2305) 2018-05-21 01:54:35 +02:00
errors.py Introduces a bulk merge function, in order to solve issue #653 (#2696) 2018-09-10 16:41:42 +02:00
glossary.py Add FAC to spacy.explain (resolves #2706) 2018-08-26 14:13:50 +02:00
gold.pxd Add support for sent_start to GoldParse 2017-08-25 20:03:14 -05:00
gold.pyx New Feature: display more detail when Error E067 (#2639) 2018-08-07 10:45:29 +02:00
language.py Remove docstrings for deprecated arguments (see #2703) 2018-08-26 14:23:13 +02:00
lemmatizer.py If no rules are set, lemmatize by lookup 2017-12-06 12:12:11 +01:00
lexeme.pxd WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
lexeme.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
matcher.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
morphology.pxd fix typo/missing here too 2018-02-18 14:38:27 +00:00
morphology.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
pipeline.pxd Fix names of pipeline components 2017-10-26 12:38:23 +02:00
pipeline.pyx Fix loading of models when custom vectors are added 2018-04-10 22:19:20 +02:00
scorer.py 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
structs.pxd Make TokenC.sent_start an int, to allow ternary value 2017-10-08 19:58:54 +02:00
symbols.pxd Fix inconsistencies in the symbols table 2018-02-18 13:51:31 +01:00
symbols.pyx Fix inconsistencies in the symbols table 2018-02-18 13:51:31 +01:00
tokenizer.pxd Disable tokenizer cache for special-cases. Fixes #1250 2017-10-24 16:08:05 +02:00
tokenizer.pyx Fix loading tokenizer with custom prefix search (#2495) 2018-07-04 12:56:07 +02:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Fix msgpack for new version 2018-07-20 17:32:00 +02:00
vectors.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
vocab.pxd Add Vocab.cfg attr, to hold stuff like oov probs 2017-10-30 16:08:50 +01:00
vocab.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00