spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-11-24 20:06:09 +03:00

Matthew Honnibal 8aa7882762 Make NORM a token attribute (#3029)		2018-12-08 10:49:10 +01:00

See #3028. The solution in this patch is pretty debatable.

What we do is give the TokenC struct a .norm field, by repurposing the previously idle .sense attribute. It's nice to repurpose a previous field because it means the TokenC doesn't change size, so even if someone's using the internals very deeply, nothing will break.

The weird thing here is that the TokenC and the LexemeC both have an attribute named NORM. This arguably assists in backwards compatibility. On the other hand, maybe it's really bad! We're changing the semantics of the attribute subtly, so maybe it's better if someone calling lex.norm gets a breakage, and instead is told to write lex.default_norm?

Overall I believe this patch makes the NORM feature work the way we sort of expected it to work. Certainly it's much more like how the docs describe it, and more in line with how we've been directing people to use the norm attribute. We'll also be able to use token.norm to do stuff like spelling correction, which is pretty cool.
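The commit message above distinguishes a per-token norm from the lexeme's default norm. The following is a minimal pure-Python sketch of that idea, not spaCy's actual Cython code: the `Lexeme` and `Token` classes, the `_norm` slot, and the rule that a token falls back to its lexeme's norm when no token-level value has been set are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lexeme:
    """Context-independent vocabulary entry (hypothetical stand-in for LexemeC)."""
    orth: str
    norm: str  # default norm shared by every occurrence of this word


@dataclass
class Token:
    """One occurrence of a word in a document (hypothetical stand-in for TokenC)."""
    lex: Lexeme
    _norm: Optional[str] = None  # per-token override; None means "not set"

    @property
    def norm(self) -> str:
        # Assumed fallback: prefer the token's own norm, else the lexeme default.
        return self.lex.norm if self._norm is None else self._norm

    @norm.setter
    def norm(self, value: str) -> None:
        # Setting the norm on one token leaves the shared lexeme untouched.
        self._norm = value


lex = Lexeme(orth="teh", norm="teh")
first, second = Token(lex), Token(lex)
second.norm = "the"  # e.g. a spelling correction applied to one occurrence
print(first.norm, second.norm, lex.norm)  # prints: teh the teh
```

Under these assumptions, a spelling corrector could rewrite the norm of a single occurrence without changing the value that every other occurrence of the same word inherits from the vocabulary.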
..
__init__.pxd	* Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx	2015-07-13 20:20:58 +02:00
__init__.py	Tidy up and document Doc, Token and Span	2017-10-27 15:41:45 +02:00
_retokenize.pyx	💫 Port master changes over to develop (#2979)	2018-11-29 16:30:29 +01:00
_serialize.py	💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003)	2018-12-03 01:28:22 +01:00
doc.pxd	Merge master into develop. Big merge, many conflicts -- need to review	2018-04-29 14:49:26 +02:00
doc.pyx	Fix removal of dill (for srsly)	2018-12-06 18:46:09 +01:00
span.pxd	Add Span.to_array method	2017-08-19 12:20:45 +02:00
span.pyx	Update develop from master	2018-08-14 03:04:28 +02:00
token.pxd	Make NORM a token attribute (#3029)	2018-12-08 10:49:10 +01:00
token.pyx	Make NORM a token attribute (#3029)	2018-12-08 10:49:10 +01:00
underscore.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00