spaCy/spacy/tokens
Adriane Boyd 4a615cacd2
Consolidate and freeze symbols (#11352)
* Consolidate and freeze symbols

Instead of having symbol values defined in three potentially conflicting
places (`spacy.attrs`, `spacy.parts_of_speech`, `spacy.symbols`), define
all symbols in `spacy.symbols` and reference those values in
`spacy.attrs` and `spacy.parts_of_speech`.
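
A minimal sketch of the single-source layout described above. The table names (`SYMBOL_IDS`, `ATTR_NAMES`, `POS_NAMES`) and all numeric values are hypothetical and are not spaCy's actual tables; the point is only that dependent modules look values up instead of redefining them.

```python
# Hypothetical stand-in for the one authoritative table
# (the role `spacy.symbols` plays in the commit message).
# All numeric values here are invented for illustration.
SYMBOL_IDS = {"ORTH": 1, "LEMMA": 2, "POS": 3, "NOUN": 4, "VERB": 5}

# Dependent tables only list the symbol names they expose and look the
# values up, so the same name can never carry two different values.
ATTR_NAMES = ("ORTH", "LEMMA", "POS")
POS_NAMES = ("NOUN", "VERB")

ATTR_IDS = {name: SYMBOL_IDS[name] for name in ATTR_NAMES}
POS_IDS = {name: SYMBOL_IDS[name] for name in POS_NAMES}
```

With this shape, changing a value in the authoritative table changes it everywhere at once, which is what removes the "potentially conflicting" definitions.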

Remove deprecated and placeholder symbols from `spacy.attrs.IDS`.

Make `spacy.attrs.NAMES` and `spacy.symbols.NAMES` reverse dicts rather
than lists in order to support future use of hash values in `attr_id_t`.
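
To illustrate why a reverse dict helps here, the following sketch uses a hypothetical `IDS` table with invented values, including one 64-bit hash-like ID. A list indexed by ID would need one slot for every integer up to the largest ID, whereas a dict only stores the IDs that exist.

```python
# Hypothetical forward table: name -> ID. Values are invented; the last
# one stands in for a 64-bit hash value.
IDS = {"ORTH": 1, "LEMMA": 2, "CUSTOM": 0x9E3779B97F4A7C15}

# Reverse dict: ID -> name. Works for sparse and very large IDs alike,
# unlike a list, which would have to be indexable by the ID itself.
NAMES = {value: key for key, value in IDS.items()}


def name_of(attr_id: int) -> str:
    """Return the symbol name for an ID (raises KeyError if unknown)."""
    return NAMES[attr_id]
```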

Minor changes:

* Use `uint64_t` for attrs in `Doc.to_array` to support future use of
hash values
* Remove unneeded attrs filter for error message in `Doc.to_array`
* Remove unused attr `SENT_END`
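
The `uint64_t` change in the first bullet can be motivated with a small standard-library sketch. It does not use spaCy's code; `hash_like_id` is a hypothetical value chosen to exceed the signed 32-bit range, and `fits` is a helper introduced only for this example.

```python
import struct

# Hypothetical attribute ID that looks like a 64-bit hash (larger than 2**63).
hash_like_id = 0x9E3779B97F4A7C15


def fits(fmt: str, value: int) -> bool:
    """Report whether `value` can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


fits_int32 = fits("<i", hash_like_id)   # signed 32-bit integer
fits_uint64 = fits("<Q", hash_like_id)  # unsigned 64-bit integer

# Packing and unpacking as uint64 preserves the value exactly.
roundtrip = struct.unpack("<Q", struct.pack("<Q", hash_like_id))[0]
```

Only the unsigned 64-bit format can hold such a value, which is consistent with widening the attribute type before hash values are used as IDs.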

* Handle dynamic size of attr_id_t in Doc.to_array

* Undo added warnings

* Refactor to make Doc.to_array more similar to Doc.from_array

* Improve refactoring
2022-09-02 09:08:40 +02:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.pxd` | * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx | 2015-07-13 20:20:58 +02:00 |
| `__init__.py` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `doc_bin.py` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `doc.pxd` | Set as_tuples on Doc during processing (#9592) | 2021-11-02 15:08:22 +01:00 |
| `doc.pyi` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `doc.pyx` | Consolidate and freeze symbols (#11352) | 2022-09-02 09:08:40 +02:00 |
| `graph.pxd` | Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) | 2021-01-14 17:30:41 +11:00 |
| `graph.pyx` | Refactor error messages to remove hardcoded strings (#10729) | 2022-05-02 13:38:46 +02:00 |
| `morphanalysis.pxd` | Morphology/Morphologizer optimizations and refactoring (#11024) | 2022-07-15 11:14:08 +02:00 |
| `morphanalysis.pyi` | 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) | 2021-10-14 15:21:40 +02:00 |
| `morphanalysis.pyx` | Morphology/Morphologizer optimizations and refactoring (#11024) | 2022-07-15 11:14:08 +02:00 |
| `retokenizer.pyi` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `retokenizer.pyx` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `span_group.pxd` | Span/SpanGroup: wrap SpanC in shared_ptr (#9869) | 2022-01-12 13:38:52 +01:00 |
| `span_group.pyi` | Fix: De/Serialize SpanGroups including the SpanGroup keys (#10707) | 2022-06-02 15:56:27 +02:00 |
| `span_group.pyx` | Merge remote-tracking branch 'upstream/master' into v4-merge-master-20220518 | 2022-05-18 11:34:54 +02:00 |
| `span_groups.py` | Make stable private modules public and adjust names (#11353) | 2022-08-30 13:56:35 +02:00 |
| `span.pxd` | Span/SpanGroup: wrap SpanC in shared_ptr (#9869) | 2022-01-12 13:38:52 +01:00 |
| `span.pyi` | Make Span/Doc.ents more consistent for ent_kb_id and ent_id (#11328) | 2022-08-22 20:28:57 +02:00 |
| `span.pyx` | Make Span/Doc.ents more consistent for ent_kb_id and ent_id (#11328) | 2022-08-22 20:28:57 +02:00 |
| `token.pxd` | cleanup | 2021-01-13 14:20:05 +01:00 |
| `token.pyi` | 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) | 2021-10-14 15:21:40 +02:00 |
| `token.pyx` | Merge remote-tracking branch 'upstream/master' into merge-master-v4-20220728 | 2022-07-28 13:53:59 +02:00 |
| `underscore.py` | Update typing hints (#10109) | 2022-01-28 16:59:54 +01:00 |