Update v2-1.md

Ines Montani 2019-03-21 12:29:08 +01:00
parent 9394ca1f29
commit 375fbf3586

@@ -250,9 +250,14 @@ if all of your models are up to date, you can run the
+ data = nlp.tokenizer.to_bytes(exclude=["vocab"])
```
- The `.pos` value for several common English words has changed, due to
corrections to long-standing mistakes in the English tag map (see
[issue #593](https://github.com/explosion/spaCy/issues/593) and
[issue #3311](https://github.com/explosion/spaCy/issues/3311) for details).
- For better compatibility with the Universal Dependencies data, the lemmatizer
now preserves capitalization, e.g. for proper nouns. See
-[this issue](https://github.com/explosion/spaCy/issues/3256) for details.
+[issue #3256](https://github.com/explosion/spaCy/issues/3256) for details.
- The built-in rule-based sentence boundary detector is now only called
`"sentencizer"`; the name `"sbd"` is deprecated.
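
For projects that keep their own list of pipeline component names (for example in a config file), the rename above could be handled with a small lookup. The following is a minimal sketch in plain Python; `DEPRECATED_PIPE_NAMES` and `normalize_pipe_names` are hypothetical names chosen for illustration and are not part of spaCy.

```python
# Hypothetical helper, not part of spaCy: rewrites the deprecated
# component name "sbd" to "sentencizer" in a list of pipeline names.
DEPRECATED_PIPE_NAMES = {"sbd": "sentencizer"}


def normalize_pipe_names(names):
    """Return a new list with deprecated component names replaced."""
    return [DEPRECATED_PIPE_NAMES.get(name, name) for name in names]


# Example: a pipeline definition that still uses the old name.
pipeline = ["tagger", "sbd", "ner"]
print(normalize_pipe_names(pipeline))  # ['tagger', 'sentencizer', 'ner']
```

The helper returns a new list and leaves unknown names untouched, so it should be safe to run on configs that already use `"sentencizer"`.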