Update docs [ci skip]

Ines Montani 2020-08-31 16:39:53 +02:00
parent 97ffb4ed05
commit bca6bf8dda
3 changed files with 38 additions and 30 deletions


@@ -362,8 +362,8 @@ loss and the accuracy scores on the development set.
 There are two built-in logging functions: a logger printing results to the
 console in tabular format (which is the default), and one that also sends the
-results to a [Weights & Biases](https://www.wandb.com/) dashboard.
-Instead of using one of the built-in loggers listed here, you can also
+results to a [Weights & Biases](https://www.wandb.com/) dashboard. Instead of
+using one of the built-in loggers listed here, you can also
 [implement your own](/usage/training#custom-logging).

 > #### Example config
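For reference, loggers are selected via the `[training.logger]` block of the training config. A minimal sketch, assuming the built-in console logger is registered as `spacy.ConsoleLogger.v1`:

```ini
### config.cfg (excerpt)
[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
```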
@@ -394,11 +394,16 @@ memory utilization, network traffic, disk IO, GPU statistics, etc. This will
 also include information such as your hostname and operating system, as well as
 the location of your Python executable.

-Note that by default, the full (interpolated) training config file is sent over
-to the W&B dashboard. If you prefer to exclude certain information such as path
-names, you can list those fields in "dot notation" in the `remove_config_values`
-parameter. These fields will then be removed from the config before uploading,
-but will otherwise remain in the config file stored on your local system.
+<Infobox variant="warning">
+
+Note that by default, the full (interpolated)
+[training config](/usage/training#config) is sent over to the W&B dashboard. If
+you prefer to **exclude certain information** such as path names, you can list
+those fields in "dot notation" in the `remove_config_values` parameter. These
+fields will then be removed from the config before uploading, but will otherwise
+remain in the config file stored on your local system.
+
+</Infobox>

 > #### Example config
 >
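A sketch of how `remove_config_values` might be set, assuming the W&B logger is registered as `spacy.WandbLogger.v1` and that the data paths live under `paths` and `corpora` (the exact field names depend on your config):

```ini
### config.cfg (excerpt)
[training.logger]
@loggers = "spacy.WandbLogger.v1"
project_name = "monitor_spacy_training"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
```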


@@ -914,4 +914,4 @@ mattis pretium.

 ### Weights & Biases {#wandb} <IntegrationLogo name="wandb" width={175} height="auto" align="right" />

-<!-- TODO: decide how we want this to work? Just send results plus config from spacy evaluate in a separate command/script? -->
+<!-- TODO: link to WandB logger, explain that it's built-in but that you can also do other cool stuff with WandB? And then include example project (still need to decide what we want to do here) -->


@@ -607,8 +607,12 @@ $ python -m spacy train config.cfg --output ./output --code ./functions.py

 #### Example: Custom logging function {#custom-logging}

-During training, the results of each step are passed to a logger function in a
-dictionary providing the following information:
+During training, the results of each step are passed to a logger function. By
+default, these results are written to the console with the
+[`ConsoleLogger`](/api/top-level#ConsoleLogger). There is also built-in support
+for writing the log files to [Weights & Biases](https://www.wandb.com/) with the
+[`WandbLogger`](/api/top-level#WandbLogger). The logger function receives a
+**dictionary** with the following keys:

 | Key | Value |
 | -------------- | ---------------------------------------------------------------------------------------------- |
@@ -619,11 +623,17 @@ dictionary providing the following information:
 | `losses` | The accumulated training losses, keyed by component name. ~~Dict[str, float]~~ |
 | `checkpoints` | A list of previous results, where each result is a (score, step, epoch) tuple. ~~List[Tuple]~~ |

-By default, these results are written to the console with the
-[`ConsoleLogger`](/api/top-level#ConsoleLogger). There is also built-in support
-for writing the log files to [Weights & Biases](https://www.wandb.com/) with
-the [`WandbLogger`](/api/top-level#WandbLogger). But you can easily implement
-your own logger as well, for instance to write the tabular results to file:
+You can easily implement and plug in your own logger that records the training
+results in a custom way, or sends them to an experiment management tracker of
+your choice. In this example, the function `my_custom_logger.v1` writes the
+tabular results to a file:
+
+> ```ini
+> ### config.cfg (excerpt)
+> [training.logger]
+> @loggers = "my_custom_logger.v1"
+> file_path = "my_file.tab"
+> ```

 ```python
 ### functions.py
@@ -635,19 +645,19 @@ from pathlib import Path
 def custom_logger(log_path):
     def setup_logger(nlp: "Language") -> Tuple[Callable, Callable]:
         with Path(log_path).open("w") as file_:
-            file_.write("step\t")
-            file_.write("score\t")
+            file_.write("step\\t")
+            file_.write("score\\t")
             for pipe in nlp.pipe_names:
-                file_.write(f"loss_{pipe}\t")
-            file_.write("\n")
+                file_.write(f"loss_{pipe}\\t")
+            file_.write("\\n")

         def log_step(info: Dict[str, Any]):
             with Path(log_path).open("a") as file_:
-                file_.write(f"{info['step']}\t")
-                file_.write(f"{info['score']}\t")
+                file_.write(f"{info['step']}\\t")
+                file_.write(f"{info['score']}\\t")
                 for pipe in nlp.pipe_names:
-                    file_.write(f"{info['losses'][pipe]}\t")
-                file_.write("\n")
+                    file_.write(f"{info['losses'][pipe]}\\t")
+                file_.write("\\n")

         def finalize():
             pass
@@ -657,13 +667,6 @@ def custom_logger(log_path):
     return setup_logger
 ```

-```ini
-### config.cfg (excerpt)
-[training.logger]
-@loggers = "my_custom_logger.v1"
-file_path = "my_file.tab"
-```
-
 #### Example: Custom batch size schedule {#custom-code-schedule}

 For example, let's say you've implemented your own batch size schedule to use
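A registered batch size schedule of the kind this example goes on to describe might look roughly like the following sketch, assuming schedules are registered via `spacy.registry.schedules` (the name `my_custom_schedule.v1` and its parameters are illustrative):

```python
### functions.py
import spacy

@spacy.registry.schedules("my_custom_schedule.v1")
def my_custom_schedule(start: int = 25, factor: float = 1.005):
    # Generator yielding an ever-growing batch size: starts at `start`
    # and multiplies by `factor` after every step.
    while True:
        yield start
        start = start * factor
```

The registered name can then be referenced from a `@schedules` block in the config, analogous to the `@loggers` example above.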