add docs & examples for debug_model

This commit is contained in:
svlandeg 2020-07-31 18:19:17 +02:00
parent a52e1f99ff
commit c376c2e122


@@ -147,8 +147,8 @@ config from being resolved. This means that you may not see all validation
errors at once and some issues are only shown once previous errors have been
fixed.
Instead of specifying all required settings in the config file, you can rely on
an auto-fill functionality that uses spaCy's built-in defaults. The resulting
full config can be written to file and used in downstream training tasks.
```bash
@@ -381,7 +381,135 @@ will not be available.
| `--help`, `-h` | flag | Show help message and available arguments. |
| overrides | | Config parameters to override. Should be options starting with `--` that correspond to the config section and value to override, e.g. `--training.use_gpu 1`. |
<!-- TODO: document debug profile? -->
### debug model {#debug-model}
Debug a Thinc [`Model`](https://thinc.ai/docs/api-model) by running it on a
sample text and checking how it updates its internal weights and parameters.
```bash
$ python -m spacy debug model [config_path] [component] [--layers] [-DIM] [-PAR] [-GRAD] [-ATTR] [-P0] [-P1] [-P2] [-P3] [--gpu_id]
```
> #### Example 1
>
> ```bash
> $ python -m spacy debug model ./config.cfg tagger -P0
> ```
<Accordion title="Example 1 output" spaced>
```
Using CPU
Fixing random seed: 0
Analysing model with ID 62
========================== STEP 0 - before training ==========================
Layer 0: model ID 62:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list>>with_array-residual>>residual>>residual>>residual>>with_array-softmax'
Layer 1: model ID 59:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list>>with_array-residual>>residual>>residual>>residual'
Layer 2: model ID 61: 'with_array-softmax'
Layer 3: model ID 24:
'extract_features>>list2ragged>>with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed>>with_array-maxout>>layernorm>>dropout>>ragged2list'
Layer 4: model ID 58: 'with_array-residual>>residual>>residual>>residual'
Layer 5: model ID 60: 'softmax'
Layer 6: model ID 13: 'extract_features'
Layer 7: model ID 14: 'list2ragged'
Layer 8: model ID 16:
'with_array-ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed'
Layer 9: model ID 22: 'with_array-maxout>>layernorm>>dropout'
Layer 10: model ID 23: 'ragged2list'
Layer 11: model ID 57: 'residual>>residual>>residual>>residual'
Layer 12: model ID 15:
'ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed|ints-getitem>>hashembed'
Layer 13: model ID 21: 'maxout>>layernorm>>dropout'
Layer 14: model ID 32: 'residual'
Layer 15: model ID 40: 'residual'
Layer 16: model ID 48: 'residual'
Layer 17: model ID 56: 'residual'
Layer 18: model ID 3: 'ints-getitem>>hashembed'
Layer 19: model ID 6: 'ints-getitem>>hashembed'
Layer 20: model ID 9: 'ints-getitem>>hashembed'
...
```
</Accordion>
In this example log, we just print the name of each layer after creation of the
model ("Step 0"), which helps us understand the internal structure of the neural
network and focus on specific layers that we want to inspect further (see next
example).
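As an illustration, a listing like the "Step 0" output above can be produced by
walking a composed model breadth-first and printing each sub-layer with its ID
and name. The sketch below is a simplified stand-in, not spaCy's or Thinc's
actual implementation; the `Layer` class and its ID counter are hypothetical.

```python
# Simplified sketch (not spaCy/Thinc internals) of how a "Step 0"-style
# layer listing could be produced from a nested model.
class Layer:
    """Hypothetical stand-in for a Thinc Model: a name plus sub-layers."""
    _counter = 0

    def __init__(self, name, sublayers=()):
        Layer._counter += 1
        self.id = Layer._counter        # unique model ID, like in the log
        self.name = name
        self.layers = list(sublayers)

def dump_layers(model):
    """Breadth-first walk: outermost model first, then its sub-layers."""
    lines, queue = [], [model]
    while queue:
        layer = queue.pop(0)
        lines.append(f"Layer {len(lines)}: model ID {layer.id}: '{layer.name}'")
        queue.extend(layer.layers)
    return lines

inner = Layer("softmax")
outer = Layer("maxout>>softmax", [Layer("maxout"), inner])
for line in dump_layers(outer):
    print(line)
```

Because the walk is breadth-first, the composed model is printed before its
parts, matching how the combined `'...>>softmax'` layers appear before the
individual `'softmax'` layer in the log above.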
> #### Example 2
>
> ```bash
> $ python -m spacy debug model ./config.cfg tagger -l "5,15" -DIM -PAR -P0 -P1 -P2
> ```
<Accordion title="Example 2 output" spaced>
```
Using CPU
Fixing random seed: 0
Analysing model with ID 62
========================= STEP 0 - before training =========================
Layer 5: model ID 60: 'softmax'
- dim nO: None
- dim nI: 96
- param W: None
- param b: None
Layer 15: model ID 40: 'residual'
- dim nO: None
- dim nI: None
======================= STEP 1 - after initialization =======================
Layer 5: model ID 60: 'softmax'
- dim nO: 4
- dim nI: 96
- param W: (4, 96) - sample: [0. 0. 0. 0. 0.]
- param b: (4,) - sample: [0. 0. 0. 0.]
Layer 15: model ID 40: 'residual'
- dim nO: 96
- dim nI: None
========================== STEP 2 - after training ==========================
Layer 5: model ID 60: 'softmax'
- dim nO: 4
- dim nI: 96
- param W: (4, 96) - sample: [ 0.00283958 -0.00294119 0.00268396 -0.00296219
-0.00297141]
- param b: (4,) - sample: [0.00300002 0.00300002 0.00300002 0.00300002]
Layer 15: model ID 40: 'residual'
- dim nO: 96
- dim nI: None
```
</Accordion>
In this example log, we see how initialization of the model (Step 1) propagates
the correct values for the `nI` (input) and `nO` (output) dimensions of the
various layers. In the `softmax` layer, this step also defines the `W` matrix as
an all-zero matrix whose shape is determined by the `nO` and `nI` dimensions.
After a first training step (Step 2), this matrix has clearly updated its values
through the training feedback loop.
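The behaviour shown in Steps 1 and 2 can be reproduced in a minimal NumPy
sketch, not spaCy's or Thinc's actual code: a softmax layer's `(nO, nI)` weight
matrix `W` starts as all zeros after initialization and acquires nonzero values
after a single gradient-descent update. The batch size and learning rate below
are arbitrary choices for illustration.

```python
# Minimal NumPy sketch (not spaCy/Thinc code) of a zero-initialized
# softmax layer updating its weights after one training step.
import numpy as np

nO, nI = 4, 96                       # output classes, input width (as in the log)
rng = np.random.default_rng(0)

W = np.zeros((nO, nI))               # Step 1: all-zero (nO, nI) matrix
b = np.zeros(nO)

X = rng.normal(size=(8, nI))         # a small batch of input vectors
y = rng.integers(0, nO, size=8)      # gold class per example

logits = X @ W.T + b
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
d_logits = probs.copy()
d_logits[np.arange(8), y] -= 1.0     # softmax + cross-entropy gradient

W -= 0.003 * (d_logits.T @ X) / 8    # Step 2: one SGD update
b -= 0.003 * d_logits.mean(axis=0)

print("W sample:", W[0, :5])         # no longer all-zero
```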
| Argument | Type | Default | Description |
| ----------------------- | ---------- | ------- | ---------------------------------------------------------------------------------------------------- |
| `config_path` | positional | | Path to [training config](/api/data-formats#config) file containing all settings and hyperparameters. |
| `component`             | positional |         | Name of the pipeline component whose model should be analysed.                                        |
| `--layers`, `-l`        | option     |         | Comma-separated IDs of the layers to print.                                                           |
| `--dimensions`, `-DIM`  | flag       | `False` | Show dimensions of each layer.                                                                        |
| `--parameters`, `-PAR`  | flag       | `False` | Show parameters of each layer.                                                                        |
| `--gradients`, `-GRAD`  | flag       | `False` | Show gradients of each layer.                                                                         |
| `--attributes`, `-ATTR` | flag       | `False` | Show attributes of each layer.                                                                        |
| `--print-step0`, `-P0`  | flag       | `False` | Print the model before training.                                                                      |
| `--print-step1`, `-P1`  | flag       | `False` | Print the model after initialization.                                                                 |
| `--print-step2`, `-P2`  | flag       | `False` | Print the model after training.                                                                       |
| `--print-step3`, `-P3`  | flag       | `False` | Print the final predictions.                                                                          |
| `--help`, `-h` | flag | | Show help message and available arguments. |
## Train {#train}