* Add error to `get_vectors_loss` for unsupported loss function of `pretrain`
* Add missing "--loss-func" argument to pretrain docs. Update pretrain plac annotations to match docs.
* Add missing quotation marks
* Changed learning rate by its param name.
I've been searching for a while how the parameter learning rate was named, with `beta1` and `beta2` its easy as they are marked as code, but learning rate wasn't. I think writing the actual parameter name would be helpful.
* Signing SCA
* Update tokenizer.md for construction example
Self contained example. You should really say what nlp is so that the example will work as is
* Update CONTRIBUTOR_AGREEMENT.md
* Restore contributor agreement
* Adjust construction examples
* Adding support for entity_id in EntityRuler pipeline component
* Adding Spacy Contributor aggreement
* Updating EntityRuler to use string.format instead of f strings
* Update Entity Ruler to support an 'id' attribute per pattern that explicitly identifies an entity.
* Fixing tests
* Remove custom extension entity_id and use built in ent_id token attribute.
* Changing entity_id to ent_id for consistent naming
* entity_ids => ent_ids
* Removing kb, cleaning up tests, making util functions private, use rsplit instead of split
* Add check for empty input file to CLI pretrain
* Raise error if JSONL is not a dict or contains neither `tokens` nor `text` key
* Skip empty values for correct pretrain keys and log a counter as warning
* Add tests for CLI pretrain core function make_docs.
* Add a short hint for the `tokens` key to the CLI pretrain docs
* Add success message to CLI pretrain
* Update model loading to fix the tests
* Skip empty values and do not create docs out of it