* account for NER labels with a hyphen in the name
* cleanup
* fix docstring
* add return type to helper method
* shorter method and few more occurrences
* user helper method across repo
* fix circular import
* partial revert to avoid circular import
* Support a cfg field in transition system
* Make NER 'has gold' check use right alignment for span
* Pass 'negative_samples_key' property into NER transition system
* Add field for negative samples to NER transition system
* Check neg_key in NER has_gold
* Support negative examples in NER oracle
* Test for negative examples in NER
* Fix name of config variable in NER
* Remove vestiges of old-style partial annotation
* Remove obsolete tests
* Add comment noting lack of support for negative samples in parser
* Additions to "neg examples" PR (#8201)
* add custom error and test for deprecated format
* add test for unlearning an entity
* add break also for Begin's cost
* add negative_samples_key property on Parser
* rename
* extend docs & fix some older docs issues
* add subclass constructors, clean up tests, fix docs
* add flaky test with ValueError if gold parse was not found
* remove ValueError if n_gold == 0
* fix docstring
* Hack in environment variables to try out training
* Remove hack
* Remove NER hack, and support 'negative O' samples
* Fix O oracle
* Fix transition parser
* Remove 'not O' from oracle
* Fix NER oracle
* check for spans in both gold.ents and gold.spans and raise if so, to prevent memory access violation
* use set instead of list in consistency check
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Preserve existing ENT_KB_ID annotation in NER
Preserve `ent_kb_id` annotation on existing entity spans, which is not
preserved by the transition system.
* Simplify kb_id assignment
* Simplify further
* clean up of ner tests
* beam_parser tests
* implement get_beam_parses and scored_parses for the dep parser
* we don't have to add the parse if there are no arcs
* small fixes and formatting
* bring test_issue4313 up-to-date, currently fails
* formatting
* add get_beam_parses method back
* add scored_ents function
* delete tag map
* Get basic beam tests working
* Get basic beam tests working
* Compile _beam_utils
* Remove prints
* Test beam density
* Beam parser seems to train
* Draft beam NER
* Upd beam
* Add hypothesis as dev dependency
* Implement missing is-gold-parse method
* Implement early update
* Fix state hashing
* Fix test
* Fix test
* Default to non-beam in parser constructor
* Improve oracle for beam
* Start refactoring beam
* Update test
* Refactor beam
* Update nn
* Refactor beam and weight by cost
* Update ner beam settings
* Update test
* Add __init__.pxd
* Upd test
* Fix test
* Upd test
* Fix test
* Remove ring buffer history from StateC
* WIP change arc-eager transitions
* Add state tests
* Support ternary sent start values
* Fix arc eager
* Fix NER
* Pass oracle cut size for beam
* Fix ner test
* Fix beam
* Improve StateC.clone
* Improve StateClass.borrow
* Work directly with StateC, not StateClass
* Remove print statements
* Fix state copy
* Improve state class
* Refactor parser oracles
* Fix arc eager oracle
* Fix arc eager oracle
* Use a vector to implement the stack
* Refactor state data structure
* Fix alignment of sent start
* Add get_aligned_sent_starts method
* Add test for ae oracle when bad sentence starts
* Fix sentence segment handling
* Avoid Reduce that inserts illegal sentence
* Update preset SBD test
* Fix test
* Remove prints
* Fix sent starts in Example
* Improve python API of StateClass
* Tweak comments and debug output of arc eager
* Upd test
* Fix state test
* Fix state test
* moving syntax folder to _parser_internals
* moving nn_parser and transition_system
* move nn_parser and transition_system out of internals folder
* moving nn_parser code into transition_system file
* rename transition_system to transition_parser
* moving parser_model and _state to ml
* move _state back to internals
* The Parser now inherits from Pipe!
* small code fixes
* removing unnecessary imports
* remove link_vectors_to_models
* transition_system to internals folder
* little bit more cleanup
* newlines