mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 01:16:28 +03:00
Update v2-1.md
This commit is contained in:
parent
67e38690d4
commit
9a8f169e5c
@@ -237,6 +237,19 @@ if all of your models are up to date, you can run the
+ retokenizer.merge(doc[6:8])
  ```
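The `retokenizer.merge(doc[6:8])` call above collapses a two-token span into a single token. The idea can be sketched without spaCy as a plain list operation (a hypothetical `merge_span` helper for illustration — spaCy's real `Retokenizer` also reconciles annotations and dependencies):

```python
def merge_span(tokens, start, end):
    """Merge tokens[start:end] into one token by joining their text.

    Plain-Python sketch of what a retokenizer merge does conceptually;
    this is NOT spaCy's implementation.
    """
    merged = " ".join(tokens[start:end])
    return tokens[:start] + [merged] + tokens[end:]

tokens = ["The", "quick", "brown", "fox", "jumps", "over", "New", "York", "today"]
# Merge the span at positions 6:8 ("New", "York") into a single token.
print(merge_span(tokens, 6, 8))
```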
- The serialization methods `to_disk`, `from_disk`, `to_bytes` and `from_bytes`
  now support a single `exclude` argument to provide a list of string names to
  exclude. The docs have been updated to list the available serialization fields
  for each class. The `disable` argument on the [`Language`](/api/language)
  serialization methods has been renamed to `exclude` for consistency.
  ```diff
  - nlp.to_disk("/path", disable=["parser", "ner"])
  + nlp.to_disk("/path", exclude=["parser", "ner"])
  - data = nlp.tokenizer.to_bytes(vocab=False)
  + data = nlp.tokenizer.to_bytes(exclude=["vocab"])
  ```
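The unified `exclude` argument can be pictured as a simple filter over named serialization fields. A minimal sketch of the pattern, using a hypothetical `Serializable` class and made-up field names — this illustrates the exclude-list idea, not spaCy's actual code:

```python
import json

class Serializable:
    """Toy object with named serialization fields, mimicking the
    exclude-list pattern (illustration only, not spaCy's implementation)."""

    def __init__(self):
        self.fields = {"vocab": {"a": 1}, "parser": {"b": 2}, "ner": {"c": 3}}

    def to_bytes(self, exclude=()):
        # Drop every field whose name appears in `exclude`, then serialize.
        kept = {k: v for k, v in self.fields.items() if k not in exclude}
        return json.dumps(kept, sort_keys=True).encode("utf8")

obj = Serializable()
data = obj.to_bytes(exclude=["vocab"])
print(data)  # the "vocab" field is omitted from the output
```

Passing a list of string names (rather than one boolean keyword per field, as in the old `vocab=False` style) means a single signature covers any combination of fields.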
- For better compatibility with the Universal Dependencies data, the lemmatizer
  now preserves capitalization, e.g. for proper nouns. See
  [this issue](https://github.com/explosion/spaCy/issues/3256) for details.
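The behavioral change can be illustrated with a toy lookup lemmatizer: rather than lowercasing every output, surface capitalization is kept for words without a lookup entry, such as proper nouns. The `lemmatize` helper and its tiny table below are hypothetical, not spaCy's lemmatizer:

```python
# Tiny lookup table standing in for a lemmatizer's exception list.
LOOKUP = {"apples": "apple", "was": "be"}

def lemmatize(word):
    """Return the lemma, preserving capitalization for words with no
    lookup entry (illustrative sketch, not spaCy's lemmatizer)."""
    lemma = LOOKUP.get(word.lower())
    if lemma is not None:
        return lemma
    # The old behavior would lowercase here; the new behavior keeps
    # proper nouns like "London" as-is.
    return word

print(lemmatize("apples"))  # "apple"
print(lemmatize("London"))  # "London", not "london"
```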