Update v2-1.md

Ines Montani 2019-03-21 12:29:08 +01:00
parent 9394ca1f29
commit 375fbf3586

@ -250,9 +250,14 @@ if all of your models are up to date, you can run the
 + data = nlp.tokenizer.to_bytes(exclude=["vocab"])
 ```
+- The .pos value for several common English words has changed, due to
+corrections to long-standing mistakes in the English tag map (see
+[issue #593](https://github.com/explosion/spaCy/issues/593) and
+[issue #3311](https://github.com/explosion/spaCy/issues/3311) for details).
 - For better compatibility with the Universal Dependencies data, the lemmatizer
 now preserves capitalization, e.g. for proper nouns. See
-[this issue](https://github.com/explosion/spaCy/issues/3256) for details.
+[issue #3256](https://github.com/explosion/spaCy/issues/3256) for details.
 - The built-in rule-based sentence boundary detector is now only called
 `"sentencizer"` the name `"sbd"` is deprecated.