spaCy/spacy
adrianeboyd 90c52128dc Improve train CLI with base model (#4911)
Improve train CLI with a provided base model so that you can:

* add a new component
* extend an existing component
* replace an existing component

When the final model and best model are saved, re-enable any disabled
components and merge the meta information so that it includes the full
pipeline and accuracy information for all components in the base model,
plus any newly added components.
2020-01-16 01:58:51 +01:00
cli Improve train CLI with base model (#4911) 2020-01-16 01:58:51 +01:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy stop rendering mathjax by default in displacy (#4840) 2020-01-01 13:15:05 +01:00
lang Add CJK to character classes (#4884) 2020-01-08 16:50:19 +01:00
matcher Fix int value handling in Matcher (#4749) 2019-12-06 19:22:57 +01:00
ml Replace function registries with catalogue (#4584) 2019-11-07 11:45:22 +01:00
pipeline Reduce mem usage in training Entity Linker (#4811) 2020-01-06 14:59:50 +01:00
syntax bugfix typo conv_window 2020-01-14 09:02:58 +01:00
tests Add CJK to character classes (#4884) 2020-01-08 16:50:19 +01:00
tokens serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Replace function registries with catalogue (#4584) 2019-11-07 11:45:22 +01:00
__main__.py Use latest wasabi 2019-11-04 02:38:45 +01:00
_align.pyx Fixes typos (#4843) 2019-12-29 14:24:13 +01:00
_ml.py Put Tok2Vec refactor behind feature flag (#4563) 2019-10-31 15:01:15 +01:00
about.py Update version [ci skip] 2019-11-21 18:19:37 +01:00
analysis.py Support span._. in component decorator attrs (#4555) 2019-10-30 17:19:36 +01:00
attrs.pxd serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
attrs.pyx serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
compat.py Replace function registries with catalogue (#4584) 2019-11-07 11:45:22 +01:00
errors.py add warning in debug_data for punctuation in entities (#4853) 2020-01-06 14:59:28 +01:00
glossary.py Update tag maps and docs for English and German (#4501) 2019-10-24 12:56:05 +02:00
gold.pxd Merge changes from master 2019-08-21 14:18:52 +02:00
gold.pyx facilitate larger training files (#4827) 2019-12-21 21:12:19 +01:00
kb.pxd rename entity frequency 2019-07-19 17:40:28 +02:00
kb.pyx More robust set entities method in KB (#4794) 2019-12-13 10:45:29 +01:00
language.py serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
lemmatizer.py Refactor lemmatizer and data table integration (#4353) 2019-10-01 21:36:03 +02:00
lexeme.pxd 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325) 2019-02-24 21:13:51 +01:00
lexeme.pyx Alphanumeric -> alphabetic [ci skip] 2019-10-06 13:30:01 +02:00
lookups.py Refactor lemmatizer and data table integration (#4353) 2019-10-01 21:36:03 +02:00
morphology.pxd annotate kb_id through ents in doc 2019-03-22 11:36:44 +01:00
morphology.pyx Improve Morphology errors (#4314) 2019-09-21 14:37:06 +02:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
scorer.py Auto-format 2019-11-20 13:15:24 +01:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx Merge branch 'master' into feature/lemmatizer 2019-03-16 13:44:22 +01:00
structs.pxd Replace Entity/MatchStruct with SpanC (#4459) 2019-10-18 11:01:47 +02:00
symbols.pxd serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
symbols.pyx serialize ENT_ID (#4852) 2020-01-06 14:57:34 +01:00
tokenizer.pxd Flush tokenizer cache when necessary (#4258) 2019-09-08 20:52:46 +02:00
tokenizer.pyx Detect more empty matches in tokenizer.explain() (#4675) 2019-11-20 16:31:29 +01:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Auto-exclude disabled when calling from_disk during load (#4708) 2019-11-25 16:01:22 +01:00
vectors.pyx Make vectors.find() return keys in correct order (#4691) 2019-11-21 16:58:32 +01:00
vocab.pxd 💫 WIP: Basic lookup class scaffolding and JSON for all lemmati… (#4167) 2019-08-22 14:21:32 +02:00
vocab.pyx Agnostic vocab array fix (#4680) 2019-11-23 14:59:52 +01:00