spaCy/spacy
Matthew Honnibal 8656a08777
Add beam_parser and beam_ner components for v3 (#6369)
* Get basic beam tests working

* Get basic beam tests working

* Compile _beam_utils

* Remove prints

* Test beam density

* Beam parser seems to train

* Draft beam NER

* Upd beam

* Add hypothesis as dev dependency

* Implement missing is-gold-parse method

* Implement early update

* Fix state hashing

* Fix test

* Fix test

* Default to non-beam in parser constructor

* Improve oracle for beam

* Start refactoring beam

* Update test

* Refactor beam

* Update nn

* Refactor beam and weight by cost

* Update ner beam settings

* Update test

* Add __init__.pxd

* Upd test

* Fix test

* Upd test

* Fix test

* Remove ring buffer history from StateC

* WIP change arc-eager transitions

* Add state tests

* Support ternary sent start values

* Fix arc eager

* Fix NER

* Pass oracle cut size for beam

* Fix ner test

* Fix beam

* Improve StateC.clone

* Improve StateClass.borrow

* Work directly with StateC, not StateClass

* Remove print statements

* Fix state copy

* Improve state class

* Refactor parser oracles

* Fix arc eager oracle

* Fix arc eager oracle

* Use a vector to implement the stack

* Refactor state data structure

* Fix alignment of sent start

* Add get_aligned_sent_starts method

* Add test for ae oracle when bad sentence starts

* Fix sentence segment handling

* Avoid Reduce that inserts illegal sentence

* Update preset SBD test

* Fix test

* Remove prints

* Fix sent starts in Example

* Improve python API of StateClass

* Tweak comments and debug output of arc eager

* Upd test

* Fix state test

* Fix state test
2020-12-13 09:08:32 +08:00
cli Include custom code via spacy package command (#6531) 2020-12-10 20:36:46 +08:00
displacy Refactor Docs.is_ flags (#6044) 2020-09-17 00:14:01 +02:00
lang Remove tag map 2020-12-09 11:13:49 +11:00
matcher Add SPACY as a Matcher attribute (#6463) 2020-11-30 09:34:50 +08:00
ml Bugfix multi-label textcat reproducibility (#6481) 2020-12-09 06:29:15 +08:00
pipeline Add beam_parser and beam_ner components for v3 (#6369) 2020-12-13 09:08:32 +08:00
tests Add beam_parser and beam_ner components for v3 (#6369) 2020-12-13 09:08:32 +08:00
tokens Fix retokenizer 2020-12-09 11:29:55 +11:00
training Add beam_parser and beam_ner components for v3 (#6369) 2020-12-13 09:08:32 +08:00
__init__.pxd
__init__.py require_cpu functionality (#6336) 2020-12-08 14:42:40 +08:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py Set version to v2.3.4 2020-11-26 08:48:52 +01:00
attrs.pxd Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
attrs.pyx Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
compat.py Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
default_config_pretraining.cfg pretrain architectures (#6451) 2020-12-08 14:41:03 +08:00
default_config.cfg Add nlp.batch_size setting 2020-12-09 09:13:26 +01:00
errors.py Merge pull request #6503 from adrianeboyd/feature/lemmatizer-rule-warning-pos 2020-12-09 11:34:16 +11:00
glossary.py unicode -> str consistency 2020-05-24 17:20:58 +02:00
kb.pxd Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
kb.pyx Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
language.py Update docstring for Language.evaluate 2020-12-09 10:21:39 +01:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
lookups.py Always serialize lookups and vectors to disk 2020-10-05 09:40:20 +02:00
morphology.pxd Add Lemmatizer and simplify related components (#5848) 2020-08-07 15:27:13 +02:00
morphology.pyx Add _ as a symbol (#6153) 2020-09-27 22:20:14 +02:00
parts_of_speech.pxd
parts_of_speech.pyx
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
schemas.py Include custom code via spacy package command (#6531) 2020-12-10 20:36:46 +08:00
scorer.py Clean up 3rd party license info (#6478) 2020-12-02 10:15:23 +01:00
strings.pxd Remove 'cleanup' of strings (#6007) 2020-09-01 16:12:15 +02:00
strings.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
structs.pxd Clean up MorphAnalysisC struct (#6146) 2020-09-25 15:56:48 +02:00
symbols.pxd Add _ as a symbol (#6153) 2020-09-27 22:20:14 +02:00
symbols.pyx Add _ as a symbol (#6153) 2020-09-27 22:20:14 +02:00
tokenizer.pxd Simplify specials and cache checks (#6012) 2020-09-03 09:42:49 +02:00
tokenizer.pyx Merge branch 'develop' into pr/6444 2020-12-09 11:04:03 +11:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx
util.py Merge branch 'master' into pr/6444 2020-12-09 11:09:40 +11:00
vectors.pyx Update docs links in codebase 2020-09-04 12:58:50 +02:00
vocab.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
vocab.pyx Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00