Update v2-1.md

Ines Montani 2019-03-10 18:58:51 +01:00
parent 67e38690d4
commit 9a8f169e5c

@@ -237,6 +237,19 @@ if all of your models are up to date, you can run the
+ retokenizer.merge(doc[6:8])
```
- The serialization methods `to_disk`, `from_disk`, `to_bytes` and `from_bytes`
now support a single `exclude` argument, which takes a list of string names of
the fields to skip. The docs have been updated to list the available
serialization fields for each class. The `disable` argument on the
[`Language`](/api/language) serialization methods has been renamed to `exclude`
for consistency.
```diff
- nlp.to_disk("/path", disable=["parser", "ner"])
+ nlp.to_disk("/path", exclude=["parser", "ner"])
- data = nlp.tokenizer.to_bytes(vocab=False)
+ data = nlp.tokenizer.to_bytes(exclude=["vocab"])
```
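When migrating older call sites, it may help to translate the old arguments into an `exclude` list in one place. The following is a minimal sketch, not part of spaCy: `to_exclude` is a hypothetical helper, and it assumes that the old style only ever used a `disable` list or boolean keyword flags such as `vocab=False`.

```python
def to_exclude(disable=None, **flags):
    """Hypothetical helper (not a spaCy API): build a v2.1-style
    `exclude` list from v2.0-style serialization arguments."""
    # Old `disable=[...]` lists carry over unchanged.
    exclude = list(disable or [])
    for name, value in flags.items():
        # Old keyword flags like vocab=False meant "skip this field".
        if value is False and name not in exclude:
            exclude.append(name)
    return exclude


print(to_exclude(disable=["parser", "ner"]))  # ['parser', 'ner']
print(to_exclude(vocab=False))                # ['vocab']
```

The resulting list can then be passed along as `exclude=...`, as in the `nlp.to_disk` and `nlp.tokenizer.to_bytes` calls shown in the diff above.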
- For better compatibility with Universal Dependencies data, the lemmatizer
now preserves capitalization, e.g. for proper nouns. See
[this issue](https://github.com/explosion/spaCy/issues/3256) for details.