From c376c2e1223fe35e8635738b97dd10575496f499 Mon Sep 17 00:00:00 2001
From: svlandeg
Date: Fri, 31 Jul 2020 18:19:17 +0200
Subject: [PATCH] add docs & examples for debug_model

---
 website/docs/api/cli.md | 134 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 131 insertions(+), 3 deletions(-)

diff --git a/website/docs/api/cli.md b/website/docs/api/cli.md
index 88b04d759..68aff4c46 100644
--- a/website/docs/api/cli.md
+++ b/website/docs/api/cli.md
@@ -147,8 +147,8 @@ config from being resolved. This means that you may not see all validation
errors at once and some issues are only shown once previous errors have been
fixed.

-Instead of specifying all required settings in the config file, you can rely
-on an auto-fill functionality that uses spaCy's built-in defaults. The resulting
+Instead of specifying all required settings in the config file, you can rely on
+an auto-fill functionality that uses spaCy's built-in defaults. The resulting
full config can be written to file and used in downstream training tasks.

@@ -381,7 +381,135 @@ will not be available.
| `--help`, `-h` | flag | Show help message and available arguments. |
| overrides | | Config parameters to override. Should be options starting with `--` that correspond to the config section and value to override, e.g. `--training.use_gpu 1`. |

### debug model {#debug-model}

Debug a Thinc [`Model`](https://thinc.ai/docs/api-model) by running it on a
sample text and checking how it updates its internal weights and parameters.

```bash
$ python -m spacy debug model [config_path] [component] [--layers] [-DIM] [-PAR] [-GRAD] [-ATTR] [-P0] [-P1] [-P2] [-P3] [--gpu_id]
```

> #### Example 1
>
> ```bash
> $ python -m spacy debug model ./config.cfg tagger -P0
> ```

```
ℹ Using CPU
ℹ Fixing random seed: 0
ℹ Analysing model with ID 62

========================== STEP 0 - before training ==========================
ℹ Layer 0: model ID 62:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list>>with_array-residual>>residual>>residual>>residual>>with_array-softmax'
ℹ Layer 1: model ID 59:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list>>with_array-residual>>residual>>residual>>residual'
ℹ Layer 2: model ID 61: 'with_array-softmax'
ℹ Layer 3: model ID 24:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list'
ℹ Layer 4: model ID 58: 'with_array-residual>>residual>>residual>>residual'
ℹ Layer 5: model ID 60: 'softmax'
ℹ Layer 6: model ID 13: 'extract_features'
ℹ Layer 7: model ID 14: 'list2ragged'
ℹ Layer 8: model ID 16:
'with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed'
ℹ Layer 9: model ID 22: 'with_array-maxout>>layernorm>>dropout'
ℹ Layer 10: model ID 23: 'ragged2list'
ℹ Layer 11: model ID 57: 'residual>>residual>>residual>>residual'
ℹ Layer 12: model ID 15:
'ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed'
ℹ Layer 13: model ID 21: 'maxout>>layernorm>>dropout'
ℹ Layer 14: model ID 32: 'residual'
ℹ Layer 15: model ID 40: 'residual'
ℹ Layer 16: model ID 48: 'residual'
ℹ Layer 17: model ID 56: 'residual'
ℹ Layer 18: model ID 3: 'ints-getitem>>hashembed'
ℹ Layer 19: model ID 6: 'ints-getitem>>hashembed'
ℹ Layer 20: model ID 9: 'ints-getitem>>hashembed'
...
```

In this example log, we just print the name of each layer after creation of the
model ("Step 0"), which helps us understand the internal structure of the
neural network and focus on specific layers that we want to inspect further
(see the next example).

> #### Example 2
>
> ```bash
> $ python -m spacy debug model ./config.cfg tagger -l "5,15" -DIM -PAR -P0 -P1 -P2
> ```

```
ℹ Using CPU
ℹ Fixing random seed: 0
ℹ Analysing model with ID 62

========================= STEP 0 - before training =========================
ℹ Layer 5: model ID 60: 'softmax'
ℹ - dim nO: None
ℹ - dim nI: 96
ℹ - param W: None
ℹ - param b: None
ℹ Layer 15: model ID 40: 'residual'
ℹ - dim nO: None
ℹ - dim nI: None

======================= STEP 1 - after initialization =======================
ℹ Layer 5: model ID 60: 'softmax'
ℹ - dim nO: 4
ℹ - dim nI: 96
ℹ - param W: (4, 96) - sample: [0. 0. 0. 0. 0.]
ℹ - param b: (4,) - sample: [0. 0. 0. 0.]
ℹ Layer 15: model ID 40: 'residual'
ℹ - dim nO: 96
ℹ - dim nI: None

========================== STEP 2 - after training ==========================
ℹ Layer 5: model ID 60: 'softmax'
ℹ - dim nO: 4
ℹ - dim nI: 96
ℹ - param W: (4, 96) - sample: [ 0.00283958 -0.00294119  0.00268396 -0.00296219
-0.00297141]
ℹ - param b: (4,) - sample: [0.00300002 0.00300002 0.00300002 0.00300002]
ℹ Layer 15: model ID 40: 'residual'
ℹ - dim nO: 96
ℹ - dim nI: None
```

In this example log, we see how initialization of the model (Step 1) propagates
the correct values for the `nI` (input) and `nO` (output) dimensions of the
various layers. In the `softmax` layer, this step also defines the `W` matrix
as an all-zero matrix whose shape is determined by the `nO` and `nI`
dimensions. After a first training step (Step 2), this matrix has clearly
updated its values through the training feedback loop.
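The dims and params printed at each step correspond directly to Thinc's
[`Model`](https://thinc.ai/docs/api-model) API: `debug model` is essentially
reading out `get_dim` and `get_param` at three points in time. As a minimal
sketch (not part of this patch, and using dummy data in place of the real
pipeline), you can reproduce the same progression on a standalone `Softmax`
layer like Layer 5 above:

```python
from thinc.api import Adam, Softmax
import numpy

# A standalone Softmax layer, analogous to "Layer 5: 'softmax'" in the log.
# nI is declared up front; nO is left unset so initialization can infer it.
model = Softmax(nI=96)
print(model.get_dim("nI"))         # 96
print(model.has_dim("nO"))         # None - still unset, as in STEP 0

# Dummy data standing in for the real pipeline: 4 output classes, as above.
X = numpy.ones((2, 96), dtype="f")
Y = numpy.zeros((2, 4), dtype="f")
Y[:, 0] = 1.0

# STEP 1: initialization infers nO from Y and allocates all-zero W and b
# (Softmax uses zero init for its weights by default).
model.initialize(X=X, Y=Y)
print(model.get_dim("nO"))         # 4
print(model.get_param("W").shape)  # (4, 96), filled with zeros

# STEP 2: one forward/backward pass plus an optimizer step moves the
# parameters away from zero.
Yh, backprop = model.begin_update(X)
backprop(Yh - Y)
model.finish_update(Adam(0.001))
print(model.get_param("W")[0, :5])  # no longer all zeros
```

The exact values will differ from the log above, which comes from the full
tagger pipeline, but the progression is the same one `debug model` reports:
unset dimensions, then zero-initialized parameters, then weights that have
moved away from zero after an update.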
| Argument                | Type       | Default | Description                                                                                            |
| ----------------------- | ---------- | ------- | ------------------------------------------------------------------------------------------------------ |
| `config_path`           | positional |         | Path to [training config](/api/data-formats#config) file containing all settings and hyperparameters. |
| `component`             | positional |         | Name of the pipeline component whose model should be analysed.                                        |
| `--layers`, `-l`        | option     |         | Comma-separated IDs of the layers to print.                                                           |
| `--dimensions`, `-DIM`  | flag       | `False` | Show dimensions of each layer.                                                                        |
| `--parameters`, `-PAR`  | flag       | `False` | Show parameters of each layer.                                                                        |
| `--gradients`, `-GRAD`  | flag       | `False` | Show gradients of each layer.                                                                         |
| `--attributes`, `-ATTR` | flag       | `False` | Show attributes of each layer.                                                                        |
| `--print-step0`, `-P0`  | flag       | `False` | Print the model before training.                                                                      |
| `--print-step1`, `-P1`  | flag       | `False` | Print the model after initialization.                                                                 |
| `--print-step2`, `-P2`  | flag       | `False` | Print the model after training.                                                                       |
| `--print-step3`, `-P3`  | flag       | `False` | Print the final predictions.                                                                          |
| `--gpu_id`              | option     | `-1`    | GPU ID or `-1` for CPU.                                                                               |
| `--help`, `-h`          | flag       |         | Show help message and available arguments.                                                            |

## Train {#train}