From dd047abb5f06a3ab9eb60051a83587af847f84f5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marcus=20Bl=C3=A4ttermann?=
Date: Mon, 14 Nov 2022 21:06:38 +0100
Subject: [PATCH] Update comment syntax in MDX

---
 website/docs/api/pipe.mdx                  |  2 +-
 website/docs/models/index.mdx              |  2 +-
 .../docs/usage/embeddings-transformers.mdx | 13 ++++-----
 website/docs/usage/facts-figures.mdx       |  6 ++--
 website/docs/usage/projects.mdx            | 28 +++++++++----------
 website/docs/usage/rule-based-matching.mdx |  2 +-
 website/docs/usage/saving-loading.mdx      |  2 +-
 website/docs/usage/training.mdx            |  6 +---
 website/docs/usage/v3-1.mdx                |  2 +-
 9 files changed, 27 insertions(+), 36 deletions(-)

diff --git a/website/docs/api/pipe.mdx b/website/docs/api/pipe.mdx
index 328542435..c2777edf0 100644
--- a/website/docs/api/pipe.mdx
+++ b/website/docs/api/pipe.mdx
@@ -12,7 +12,7 @@ spaCy pipeline. See the docs on
 [writing trainable components](/usage/processing-pipelines#trainable-components)
 for how to use the `TrainablePipe` base class to implement custom components.
 
-
+{/* TODO: Pipe vs TrainablePipe, check methods below (all renamed to TrainablePipe for now) */}
 
 > #### Why is it implemented in Cython?
 >
diff --git a/website/docs/models/index.mdx b/website/docs/models/index.mdx
index 518aeac34..2727cabd7 100644
--- a/website/docs/models/index.mdx
+++ b/website/docs/models/index.mdx
@@ -7,7 +7,7 @@ menu:
   - ['Pipeline Design', 'design']
 ---
 
-
+{/* TODO: include interactive demo */}
 
 ### Quickstart {hidden="true"}
 
diff --git a/website/docs/usage/embeddings-transformers.mdx b/website/docs/usage/embeddings-transformers.mdx
index 5e189c2e2..0b888ed8a 100644
--- a/website/docs/usage/embeddings-transformers.mdx
+++ b/website/docs/usage/embeddings-transformers.mdx
@@ -170,7 +170,7 @@ factory = "ner"
 @architectures = "spacy.MaxoutWindowEncoder.v2"
 ```
 
-
+{/* TODO: Once rehearsal is tested, mention it here. */}
 
 ## Using transformer models {id="transformers"}
 
@@ -309,14 +309,13 @@ of objects by referring to creation functions, including functions you register
 yourself. For details on how to get started with training your own model, check
 out the [training quickstart](/usage/training#quickstart).
 
-
+{/* */}
 
 The `[components]` section in the [`config.cfg`](/api/data-formats#config)
 describes the pipeline components and the settings used to construct them,
diff --git a/website/docs/usage/facts-figures.mdx b/website/docs/usage/facts-figures.mdx
index e16f19417..4556d5637 100644
--- a/website/docs/usage/facts-figures.mdx
+++ b/website/docs/usage/facts-figures.mdx
@@ -57,7 +57,7 @@ spaCy v3.0 introduces transformer-based pipelines that bring spaCy's accuracy
 right up to **current state-of-the-art**. You can also use a CPU-optimized
 pipeline, which is less accurate but much cheaper to run.
 
-
+{/* TODO: update benchmarks and intro */}
 
 > #### Evaluation details
 >
@@ -117,6 +117,4 @@ comments.
-
+{/* TODO: ## Citing spaCy {id="citation"} */}
diff --git a/website/docs/usage/projects.mdx b/website/docs/usage/projects.mdx
index c844af630..aa4b6a9fc 100644
--- a/website/docs/usage/projects.mdx
+++ b/website/docs/usage/projects.mdx
@@ -392,7 +392,7 @@ For example, a command for training a pipeline may depend on a
 it will export a directory `model-best`, which you can then re-use in other
 commands.
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 commands:
@@ -445,7 +445,7 @@ directory:
 
 > #### project.yml
 >
->
+> {/* prettier-ignore */}
 > ```yaml
 > directories: ['assets', 'configs', 'corpus', 'metas', 'metrics', 'notebooks', 'packages', 'scripts', 'training']
 > ```
@@ -549,7 +549,7 @@ override settings on the command line – for example using `--vars.batch_size`.
 > everything with the same Python (not some other Python installed on your
 > system). It also normalizes references to `python3`, `pip3` and `pip`.
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 vars:
@@ -618,7 +618,7 @@ up to date.
 Note that the contents of an existing file will be **replaced** if no existing
 auto-generated docs are found. If you want spaCy to ignore a file and not update
-it, you can add the comment marker `` anywhere in
+it, you can add the comment marker `{/* SPACY PROJECT: IGNORE */}` anywhere in
 your markup.
@@ -691,9 +691,9 @@ according to a hash of the command string and the command's dependencies.
 Finally, within those directories are files, named according to an MD5 hash of
 their contents.
 
-
+{/* TODO: update with actual real example? */}
 
-
+{/* prettier-ignore */}
 ```yaml
 └── urlencoded_file_path # Path of original file
     ├── some_command_hash # Hash of command you ran
@@ -818,9 +818,7 @@ workflows, but only one can be tracked by DVC.
-
+{/* { TODO: } */}
 
 ---
 
@@ -853,7 +851,7 @@ collected with Prodigy and training a spaCy pipeline:
 > $ python -m spacy project run all
 > ```
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 vars:
@@ -895,7 +893,7 @@ different portions of the data, e.g. 25%, 50%, 75% and 100%. As a rule of thumb,
 if accuracy increases in the last segment, this could indicate that collecting
 more annotations of the same type might improve the model further.
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml (excerpt)
 - name: "train_curve"
@@ -934,7 +932,7 @@ package helps you integrate spaCy visualizations into your Streamlit apps and
 quickly spin up demos to explore your pipelines interactively. It includes a
 full embedded visualizer, as well as individual components.
 
-
+{/* TODO: update once version is stable */}
 
 > #### Installation
 >
@@ -963,7 +961,7 @@ and explore your own custom trained pipelines.
 > $ python -m spacy project run visualize
 > ```
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 commands:
@@ -1008,7 +1006,7 @@ query your API from Python and JavaScript (Vanilla JS and React).
 > $ python -m spacy project run serve
 > ```
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 - name: "serve"
@@ -1114,7 +1112,7 @@ packaged pipeline to the hub. You can either run this as a manual step, or
 automatically as part of a workflow. Make sure to set `--build wheel` when
 running `spacy package` to build a wheel file for your pipeline package.
 
-
+{/* prettier-ignore */}
 ```yaml
 ### project.yml
 - name: "push_to_hub"
diff --git a/website/docs/usage/rule-based-matching.mdx b/website/docs/usage/rule-based-matching.mdx
index 1a8bd8f42..6f8005245 100644
--- a/website/docs/usage/rule-based-matching.mdx
+++ b/website/docs/usage/rule-based-matching.mdx
@@ -1429,7 +1429,7 @@ rules included!
 
 ### Using a large number of phrase patterns {id="entityruler-large-phrase-patterns",version="2.2.4"}
 
-
+{/* TODO: double-check that this still works if the ruler is added to the pipeline on creation, and include suggestion if needed */}
 
 When using a large amount of **phrase patterns** (roughly > 10000) it's useful
 to understand how the `add_patterns` function of the entity ruler works. For
diff --git a/website/docs/usage/saving-loading.mdx b/website/docs/usage/saving-loading.mdx
index 4b57cfaca..b02cf563a 100644
--- a/website/docs/usage/saving-loading.mdx
+++ b/website/docs/usage/saving-loading.mdx
@@ -292,7 +292,7 @@ custom components to spaCy automatically.
-
+{/* ## Initializing components with data {id="initialization",version="3"} */}
 
 ## Using entry points {id="entry-points",version="2.1"}
 
diff --git a/website/docs/usage/training.mdx b/website/docs/usage/training.mdx
index 8f910a932..d009bff69 100644
--- a/website/docs/usage/training.mdx
+++ b/website/docs/usage/training.mdx
@@ -1439,10 +1439,7 @@ def filter_batch(size: int) -> Callable[[Iterable[Example]], Iterator[List[Examp
     return create_filtered_batches
 ```
 
-
+{/* TODO: Custom corpus class, Minibatching */}
 
 ### Data augmentation {id="data-augmentation"}
 
@@ -1483,7 +1480,6 @@ typically loaded from a JSON file. There are two types of orth variant rules:
 `"single"` for single tokens that should be replaced (e.g. hyphens) and
 `"paired"` for pairs of tokens (e.g. quotes).
 
-
 ```json
 ### orth_variants.json
 {
diff --git a/website/docs/usage/v3-1.mdx b/website/docs/usage/v3-1.mdx
index c9b8305f5..db9ebafbc 100644
--- a/website/docs/usage/v3-1.mdx
+++ b/website/docs/usage/v3-1.mdx
@@ -116,7 +116,7 @@ train_doc.spans["incorrect_spans"] = [
 ]
 ```
 
-
+{/* TODO: more details and/or example project? */}
 
 ### New pipeline packages for Catalan and Danish {id="pipeline-packages"}
 
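
The substitution this patch applies by hand across the docs could be sketched as a small script. This is only an illustration, not the tooling used for the commit: it assumes the removed lines were HTML-style `<!-- ... -->` comments, and the function name `html_to_mdx_comments` is hypothetical.

```python
import re

# Assumption: the old syntax was an HTML comment, possibly spanning several lines.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def html_to_mdx_comments(text: str) -> str:
    """Rewrite every HTML comment in `text` as an MDX expression comment.

    The comment body, including its surrounding whitespace, is kept unchanged.
    """
    return HTML_COMMENT.sub(lambda m: "{/*" + m.group(1) + "*/}", text)


if __name__ == "__main__":
    print(html_to_mdx_comments("<!-- prettier-ignore -->"))
```

The sketch is deliberately naive: it also rewrites comments inside fenced code blocks, and a comment body that itself contains `*/` would produce invalid MDX, so any real conversion would still need manual review.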