mirror of
https://github.com/explosion/spaCy.git
synced 2025-10-25 21:21:10 +03:00
* Only set NORM on Token in retokenizer

  Instead of setting `NORM` on both the token and lexeme, set `NORM` only on the token.

  The retokenizer tries to set all possible attributes with `Token/Lexeme.set_struct_attr` so that it doesn't have to enumerate which attributes are available for each. `NORM` is the only attribute that's stored on both, and in most cases it doesn't make sense to set the global norms based on an individual retokenization. For lexeme-only attributes like `IS_STOP` there's no way to avoid the global side effects, but I think that `NORM` would be better only on the token.

* Fix test

| File |
|---|
| `__init__.pxd` |
| `__init__.py` |
| `_retokenize.pyx` |
| `_serialize.py` |
| `doc.pxd` |
| `doc.pyx` |
| `morphanalysis.pxd` |
| `morphanalysis.pyx` |
| `span.pxd` |
| `span.pyx` |
| `token.pxd` |
| `token.pyx` |
| `underscore.py` |
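
The distinction the commit message draws between token-level and lexeme-level attributes can be illustrated with a small toy model. This is only a sketch and not spaCy's implementation: the `Lexeme`, `Token`, `Vocab`, and `set_attrs` names below are hypothetical stand-ins written in plain Python, and the lowercase default norm is an assumption made for the example. It shows why writing `NORM` to the token keeps the change local, while a lexeme-only attribute such as `IS_STOP` necessarily affects every token that shares the vocabulary entry.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lexeme:
    """Toy vocabulary entry shared by every token with the same text."""
    text: str
    norm: str
    is_stop: bool = False


@dataclass
class Token:
    """Toy token: one occurrence in a document, with an optional local norm."""
    lex: Lexeme
    norm_override: Optional[str] = None

    @property
    def norm(self) -> str:
        # Prefer the token-level value; fall back to the shared lexeme.
        if self.norm_override is not None:
            return self.norm_override
        return self.lex.norm


class Vocab:
    """Toy vocabulary that hands out one shared Lexeme per distinct text."""

    def __init__(self) -> None:
        self._lexemes: Dict[str, Lexeme] = {}

    def get(self, text: str) -> Lexeme:
        if text not in self._lexemes:
            # Assumption for this sketch: the default norm is the lowercase text.
            self._lexemes[text] = Lexeme(text=text, norm=text.lower())
        return self._lexemes[text]


def set_attrs(token: Token, attrs: Dict[str, object]) -> None:
    """Apply attributes: NORM stays on the token, IS_STOP goes to the lexeme."""
    for name, value in attrs.items():
        if name == "NORM":
            token.norm_override = str(value)  # local to this token
        elif name == "IS_STOP":
            token.lex.is_stop = bool(value)   # global: shared by all tokens
        else:
            raise KeyError(f"unsupported attribute in this sketch: {name}")


vocab = Vocab()
first = Token(vocab.get("NYC"))
second = Token(vocab.get("NYC"))
set_attrs(first, {"NORM": "new york city", "IS_STOP": True})
```

In this model, `first.norm` reflects the new value while `second.norm` and the shared vocabulary entry keep the default, whereas the `IS_STOP` change is visible through both tokens because it can only live on the lexeme.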