spaCy/spacy
Madeesh Kannan 446a3ecf34
StringStore refactoring (#11344)
* `strings`: Remove unused `hash32_utf8` function

* `strings`: Make `hash_utf8` and `decode_Utf8Str` private

* `strings`: Reorganize private functions

* `strings`: Raise error when non-string/-int types are passed to functions that don't accept them

* `strings`: Add `items()` method, add type hints, remove unused methods, restrict inputs to specific types, reorganize methods

* `Morphology`: Use `StringStore.items()` to enumerate features when pickling

* `test_stringstore`: Update pre-Python 3 tests

* Update `StringStore` docs

* Fix `get_string_id` imports

* Replace redundant test with tests for type checking

* Rename `_retrieve_interned_str`, remove `.get` default arg

* Add `get_string_id` to `strings.pyi`
Remove `mypy` ignore directives from imports of the above

* `strings.pyi`: Replace functions that consume `Union`-typed params with overloads

* `strings.pyi`: Revert some function signatures

* Update `SYMBOLS_BY_INT` lookups and error codes post-merge

* Revert clobbered change introduced in a previous merge

* Remove unnecessary type hint

* Invert tuple order in `StringStore.items()`

* Add test for `StringStore.items()`

* Revert "`Morphology`: Use `StringStore.items()` to enumerate features when pickling"

This reverts commit 1af9510ceb.

* Rename `keys` and `key_map`

* Add `keys()` and `values()`

* Add comment about the inverted key-value semantics in the API

* Fix type hints

* Implement `keys()`, `values()`, `items()` without generators

* Fix type hints, remove unnecessary boxing

* Update docs

* Simplify `keys/values/items()` impl

* `mypy` fix

* Fix error message, doc fixes
2022-10-06 10:51:06 +02:00
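
The commit message above mentions `keys()`, `values()`, `items()`, an inverted tuple order, and stricter input types. As a rough illustration only, here is a toy pure-Python sketch of those ideas. `ToyStringStore` and `toy_hash` are hypothetical names invented for this example, the hash is a stand-in rather than spaCy's own, and the assumption that strings act as keys and hashes as values is a reading of the "inverted key-value semantics" note, not a copy of the real `strings.pyx`.

```python
import hashlib
from typing import Dict, List, Tuple, Union


def toy_hash(string: str) -> int:
    """Stand-in 64-bit hash, used only for this illustration."""
    digest = hashlib.blake2b(string.encode("utf8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyStringStore:
    """Hypothetical miniature string store (not spaCy's implementation)."""

    def __init__(self) -> None:
        # Internally the hash is the lookup key and the string is the value.
        self._by_hash: Dict[int, str] = {}

    def add(self, string: str) -> int:
        # Restrict input to str, mirroring the "restrict inputs" bullet.
        if not isinstance(string, str):
            raise TypeError(f"expected str, got {type(string).__name__}")
        key = toy_hash(string)
        self._by_hash[key] = string
        return key

    def __getitem__(self, key: Union[str, int]) -> Union[int, str]:
        # str -> hash, int -> string; anything else is rejected.
        if isinstance(key, str):
            return toy_hash(key)
        if isinstance(key, int):
            return self._by_hash[key]
        raise TypeError(f"expected str or int, got {type(key).__name__}")

    def keys(self) -> List[str]:
        # Assumed API view: strings are the keys (inverse of internal storage).
        return list(self._by_hash.values())

    def values(self) -> List[int]:
        # Assumed API view: hashes are the values.
        return list(self._by_hash.keys())

    def items(self) -> List[Tuple[str, int]]:
        # Plain list of (string, hash) pairs, built without a generator.
        return [(s, h) for h, s in self._by_hash.items()]
```

In this sketch, `store.items()` returns `(string, hash)` pairs in insertion order, and `store[1.5]` raises `TypeError`; the real class may report such errors differently.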
cli Allow overriding spacy_version in spacy package meta (#11552) 2022-09-29 10:44:06 +02:00
displacy Docs: displaCy documentation - data types, parse_{deps,ents,spans}, spans example (#10950) 2022-08-16 11:23:34 -04:00
kb Refactor KB for easier customization (#11268) 2022-09-08 10:38:07 +02:00
lang Merge branch 'copy_develop' into copy_v4 2022-10-03 14:12:16 +02:00
matcher StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
ml Merge branch 'copy_develop' into copy_v4 2022-10-03 14:12:16 +02:00
pipeline Merge branch 'copy_develop' into copy_v4 2022-10-03 14:12:16 +02:00
tests StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
tokens StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
training Preserve missing entity annotation in augmenters (#11540) 2022-09-27 10:16:51 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Simplify and clarify enable/disable behavior of spacy.load() (#11459) 2022-09-27 14:22:36 +02:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py Set version to v3.4.1 (#11209) 2022-07-26 12:52:38 +02:00
attrs.pxd Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
attrs.pyx Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
compat.py Custom component types in spacy.ty (#9469) 2021-10-21 15:31:06 +02:00
default_config_pretraining.cfg Add new parameter for saving every n epoch in pretraining (#8912) 2021-08-12 11:14:48 +02:00
default_config.cfg Add a few docs to the default_config.cfg (#9981) 2022-01-05 09:16:40 +01:00
errors.py StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
glossary.py Add glossary entry for root (#10821) 2022-05-20 09:56:32 +02:00
language.py Simplify and clarify enable/disable behavior of spacy.load() (#11459) 2022-09-27 14:22:36 +02:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyi fix type of lexeme.rank (#9979) 2022-01-04 13:15:25 +01:00
lexeme.pyx Bugfix for similarity return types (#10051) 2022-01-20 11:40:46 +01:00
lookups.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
morphology.pxd Morphology/Morphologizer optimizations and refactoring (#11024) 2022-07-15 11:14:08 +02:00
morphology.pyx Morphology/Morphologizer optimizations and refactoring (#11024) 2022-07-15 11:14:08 +02:00
parts_of_speech.pxd Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
parts_of_speech.pyx Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
pipe_analysis.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
py.typed Add py.typed 2021-03-16 09:48:31 +01:00
schemas.py Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
scorer.py Alignment: use a simplified ragged type for performance (#10319) 2022-04-01 09:02:06 +02:00
strings.pxd StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
strings.pyi StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
strings.pyx StringStore refactoring (#11344) 2022-10-06 10:51:06 +02:00
structs.pxd Morphology/Morphologizer optimizations and refactoring (#11024) 2022-07-15 11:14:08 +02:00
symbols.pxd Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
symbols.pyx Consolidate and freeze symbols (#11352) 2022-09-02 09:08:40 +02:00
tokenizer.pxd Cleanup Cython structs (#11337) 2022-08-22 15:52:24 +02:00
tokenizer.pyx Update/remove old Matcher syntax (#11370) 2022-08-30 15:40:31 +02:00
ty.py Custom component types in spacy.ty (#9469) 2021-10-21 15:31:06 +02:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Simplify and clarify enable/disable behavior of spacy.load() (#11459) 2022-09-27 14:22:36 +02:00
vectors.pyx vectors: avoid expensive comparisons between numpy ints and Python ints (#10992) 2022-06-29 12:58:31 +02:00
vocab.pxd Cleanup Cython structs (#11337) 2022-08-22 15:52:24 +02:00
vocab.pyi Cleanup Cython structs (#11337) 2022-08-22 15:52:24 +02:00
vocab.pyx Cleanup Cython structs (#11337) 2022-08-22 15:52:24 +02:00