Mirror of https://github.com/explosion/spaCy.git (synced 2026-01-10 18:51:21 +03:00)
The list of stop words for Spanish contained many inadequate words; see https://github.com/explosion/spaCy/issues/3052#issuecomment-1100760100

Removed words:
- verb forms of 'trabajar' (work) and 'intentar' (try)
- words related to 'empleo' (employment)
- incorrect words: ampleamos, arribaabajo, soyos, paìs
- miscellaneous words that are too significant or too infrequent: actualmente, aproximadamente, antaño, cosas, ejemplo, horas, general, pais, principalmente, raras

Added other stop words for completeness:
- Spanish one-letter words
- numbers up to twelve

Some reformatting to 79 columns. When in doubt, the English and German lists were consulted as good examples.

| Name |
|---|
| .. |
| `__init__.py` |
| `examples.py` |
| `lemmatizer.py` |
| `lex_attrs.py` |
| `punctuation.py` |
| `stop_words.py` |
| `syntax_iterators.py` |
| `tokenizer_exceptions.py` |
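
The effect of the stop-word changes described in the commit message can be sketched in plain Python. This is a minimal illustration, not the contents of `stop_words.py`: the `STOP_WORDS` entries below are a small hypothetical subset chosen for the example, and `remove_stop_words` is a helper name invented here. It only follows the common convention of defining the list as a whitespace-separated string that is split into a set.

```python
# Hypothetical, heavily shortened stop-word list. The real file is much
# longer; these entries are assumptions made for illustration only.
STOP_WORDS = set(
    """
a e o u y
uno dos tres cuatro cinco seis siete ocho nueve diez once doce
el la los las de que en
""".split()
)

# Words the commit message names as removed for being too significant
# or too infrequent to count as stop words.
REMOVED = {
    "actualmente", "aproximadamente", "antaño", "cosas", "ejemplo",
    "horas", "general", "pais", "principalmente", "raras",
}


def remove_stop_words(tokens):
    """Return the tokens whose lowercase form is not a stop word."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


tokens = "Actualmente la empresa tiene doce empleados y dos oficinas".split()
content = remove_stop_words(tokens)
print(content)
```

With a list shaped like this, a content-bearing word such as "Actualmente" survives filtering, while one-letter words and small numbers such as "y", "dos", and "doce" are dropped.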