spaCy/spacy/training
Adriane Boyd f98b41c390
Add vector deduplication (#10551)
* Add vector deduplication

* Add `Vocab.deduplicate_vectors()`
* Always run deduplication in `spacy init vectors`
* Clean up a few vector-related error messages and docs examples

* Always unique with numpy

* Fix types
2022-03-30 08:54:23 +02:00
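
As an illustration of the idea behind the commit message above ("vector deduplication" and "always unique with numpy"), here is a minimal sketch of collapsing identical rows in a vector table with `numpy.unique`. It assumes vectors are stored as rows of a NumPy array alongside a key-to-row mapping. The names `deduplicate_rows`, `data`, and `key2row` are hypothetical and chosen for illustration; this is not the implementation of `Vocab.deduplicate_vectors()`.

```python
import numpy as np


def deduplicate_rows(data, key2row):
    """Collapse identical rows and remap each key to its surviving row.

    Hypothetical helper for illustration only; not spaCy's API.
    """
    # np.unique with axis=0 returns the distinct rows (sorted) and, for each
    # original row, the index of its representative in the unique array.
    unique_data, inverse = np.unique(data, axis=0, return_inverse=True)
    # Flatten defensively: the shape of `inverse` has varied across NumPy versions.
    inverse = np.asarray(inverse).reshape(-1)
    new_key2row = {key: int(inverse[row]) for key, row in key2row.items()}
    return unique_data, new_key2row


# Toy table: rows 0 and 2 are identical, so "cat" and "kitty" should share a row.
data = np.array([[1.0, 2.0], [3.0, 4.0], [1.0, 2.0]], dtype="float32")
key2row = {"cat": 0, "dog": 1, "kitty": 2}

unique_data, new_key2row = deduplicate_rows(data, key2row)
```

After the call, the table holds two rows instead of three, and keys that previously pointed at duplicate rows point at the same surviving row. Note that `numpy.unique` sorts the rows, so row order is not preserved in this sketch.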
converters Auto-format code with black (#10377) 2022-02-25 10:00:21 +01:00
__init__.pxd Renaming gold & annotation_setter (#6042) 2020-09-09 10:31:03 +02:00
__init__.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
align.pyx Fix alignment for 1-to-1 tokens and lowercasing (#6476) 2020-12-08 14:25:16 +08:00
alignment.py Replace pytokenizations with internal alignment (#6293) 2020-11-03 16:24:38 +01:00
augment.py Add whitespace and combined augmenters (#10170) 2022-02-17 15:54:09 +01:00
batchers.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
callbacks.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
corpus.py Auto-format code with black (#9664) 2021-11-12 10:00:03 +01:00
example.pxd Make a pre-check to speed up alignment cache (#6139) 2020-09-24 18:13:39 +02:00
example.pyx Fix get_matching_ents (#10451) 2022-03-07 16:56:57 +01:00
gold_io.pyx Fix is_sent_start when converting from JSON (fix #7635) (#7655) 2021-04-08 18:24:52 +10:00
initialize.py Add vector deduplication (#10551) 2022-03-30 08:54:23 +02:00
iob_utils.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
loggers.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
loop.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
pretrain.py Clarify how to fill in init_tok2vec after pretraining (#9639) 2021-11-18 15:38:30 +01:00