spaCy/spacy
Paul O'Leary McCann 61ef0739b8 Add Japanese stop words. (#2549)
List created by taking the 2000 top words from a Wikipedia dump and
removing everything that wasn't hiragana.

Tried going through kanji words and deciding what to keep but there were
too many obvious non-stopwords (東京 was in the top 500) and many other
words where it wasn't clear if they should be included or not.
2018-07-17 10:12:48 +02:00
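
The procedure described in the commit message (take the most frequent words, keep only those written entirely in hiragana) can be sketched roughly as below. This is an illustrative reconstruction, not the script used for the commit: the function name `hiragana_stop_word_candidates`, the `top_n` parameter, and the assumption that the Wikipedia dump has already been tokenized into a list of words are all hypothetical.

```python
import re
from collections import Counter

# Hiragana letters occupy U+3041..U+3096 in Unicode.
# Assumption: a word counts as "hiragana" only if every character is in this range.
HIRAGANA_ONLY = re.compile(r"[\u3041-\u3096]+")


def hiragana_stop_word_candidates(tokens, top_n=2000):
    """Return the hiragana-only words among the top_n most frequent tokens."""
    counts = Counter(tokens)
    # most_common() yields (word, count) pairs sorted by descending count.
    top_words = [word for word, _ in counts.most_common(top_n)]
    # fullmatch() rejects kanji, katakana, and mixed-script words.
    return [word for word in top_words if HIRAGANA_ONLY.fullmatch(word)]


# Tiny hand-made sample standing in for a tokenized corpus (hypothetical data).
sample_tokens = ["の"] * 4 + ["東京"] * 3 + ["これ"] * 2 + ["データ"]
candidates = hiragana_stop_word_candidates(sample_tokens, top_n=3)
```

With this sample, 東京 lands in the top three by frequency but is dropped by the script filter, which mirrors the problem with kanji words noted in the commit message.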
cli
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy
lang Add Japanese stop words. (#2549) 2018-07-17 10:12:48 +02:00
syntax
tests
tokens
__init__.pxd
__init__.py
__main__.py
_ml.py
about.py
attrs.pxd
attrs.pyx
compat.py
errors.py
glossary.py
gold.pxd
gold.pyx
language.py
lemmatizer.py
lexeme.pxd
lexeme.pyx
matcher.pyx
morphology.pxd
morphology.pyx
parts_of_speech.pxd
parts_of_speech.pyx
pipeline.pxd
pipeline.pyx
scorer.py
strings.pxd
strings.pyx
structs.pxd
symbols.pxd
symbols.pyx
tokenizer.pxd
tokenizer.pyx
typedefs.pxd
typedefs.pyx
util.py
vectors.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
vocab.pxd
vocab.pyx