mirror of https://github.com/explosion/spaCy.git
synced 2024-12-24 17:06:29 +03:00
Update v2-2.md [ci skip]
parent f52b857953
commit ddc09b08ed
@@ -341,6 +341,11 @@ check if all of your models are up to date, you can run the
them). If your data contains invalid entity annotations, make sure to clean it
and resolve conflicts. You can now also use the new `debug-data` command to
find problems in your data.
- Pipeline components can now overwrite IOB tags of tokens that are not yet part
of an entity. Once a token has an `ent_iob` value set, it won't be reset to an
"unset" state and will always have at least `O` assigned. `list(doc.ents)` now
actually keeps the annotations on the token level consistent, instead of
resetting `O` to an empty string.
- The default punctuation in the `sentencizer` has been extended and now
includes more characters common in various languages. This also means that the
results it produces may change, depending on your text. If you want the
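
The IOB note in the hunk above can be illustrated with a small, spaCy-free sketch. Everything in it is hypothetical (the `ToyToken` class, the `apply_entities` helper, the use of an empty string for "unset"), and it only models the rule as the note describes it: tokens that are unset or `O` may be overwritten, tokens already inside an entity are left alone, and nothing is ever reset to the unset state. It is not spaCy's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyToken:
    """Hypothetical stand-in for a token; not a spaCy class."""
    text: str
    ent_iob: str = ""   # "" = unset, otherwise "O", "B" or "I"
    ent_type: str = ""


def apply_entities(tokens: List[ToyToken], spans: List[Tuple[int, int, str]]) -> None:
    """Write (start, end, label) spans onto tokens without touching existing entities."""
    for start, end, label in spans:
        span = tokens[start:end]
        # Tokens tagged "B" or "I" already belong to an entity: leave them alone.
        if any(tok.ent_iob in ("B", "I") for tok in span):
            continue
        # Unset ("") and outside ("O") tokens may be overwritten.
        for offset, tok in enumerate(span):
            tok.ent_iob = "B" if offset == 0 else "I"
            tok.ent_type = label


tokens = [ToyToken(w) for w in "Alice flew to New York".split()]
tokens[0].ent_iob, tokens[0].ent_type = "B", "PERSON"  # set by an earlier component
tokens[1].ent_iob = "O"                                # explicitly outside
tokens[3].ent_iob = "O"                                # outside for now, may be overwritten
apply_entities(tokens, [(0, 2, "ORG"), (3, 5, "GPE")])
iob_tags = [tok.ent_iob for tok in tokens]
```

In this toy run the `ORG` span is skipped because "Alice" already belongs to an entity, while "New York" is tagged even though "New" previously carried `O`.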
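
To see why a larger default punctuation set can change sentence boundaries, here is a minimal rule-based splitter in plain Python. The function name `split_sentences` and both character sets are assumptions chosen for illustration; they are not the `sentencizer`'s actual defaults or code.

```python
from typing import Iterable, List


def split_sentences(text: str, punct_chars: Iterable[str]) -> List[str]:
    """Split text after any run of characters found in punct_chars."""
    punct = set(punct_chars)
    sentences, current = [], ""
    for i, char in enumerate(text):
        current += char
        next_char = text[i + 1] if i + 1 < len(text) else ""
        # Close the sentence at a punctuation mark unless more punctuation follows.
        if char in punct and next_char not in punct:
            if current.strip():
                sentences.append(current.strip())
            current = ""
    # Keep any trailing text that has no closing punctuation.
    if current.strip():
        sentences.append(current.strip())
    return sentences


text = "你好。今天天气很好！Fine. Thanks"
# A narrow, ASCII-only set (illustrative).
narrow = split_sentences(text, [".", "!", "?"])
# The same set plus a few CJK full-width marks (illustrative).
extended = split_sentences(text, [".", "!", "?", "。", "！", "？"])
```

With the narrow set the two Chinese sentences stay glued to "Fine."; with the extended set each becomes its own sentence, which is the kind of change in results the note warns about.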