Update docs [ci skip]

Ines Montani 2020-08-31 16:39:53 +02:00
parent 97ffb4ed05
commit bca6bf8dda
3 changed files with 38 additions and 30 deletions


@ -362,8 +362,8 @@ loss and the accuracy scores on the development set.
There are two built-in logging functions: a logger printing results to the
console in tabular format (which is the default), and one that also sends the
results to a [Weights & Biases](https://www.wandb.com/) dashboard. Instead of
using one of the built-in loggers listed here, you can also
[implement your own](/usage/training#custom-logging).
> #### Example config
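>
> A minimal sketch of selecting a logger in your config (assuming the default
> console logger's registered name, `spacy.ConsoleLogger.v1`):
>
> ```ini
> [training.logger]
> @loggers = "spacy.ConsoleLogger.v1"
> ```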
@ -394,11 +394,16 @@ memory utilization, network traffic, disk IO, GPU statistics, etc. This will
also include information such as your hostname and operating system, as well as
the location of your Python executable.
<Infobox variant="warning">
Note that by default, the full (interpolated)
[training config](/usage/training#config) is sent over to the W&B dashboard. If
you prefer to **exclude certain information** such as path names, you can list
those fields in "dot notation" in the `remove_config_values` parameter. These
fields will then be removed from the config before uploading, but will otherwise
remain in the config file stored on your local system.
</Infobox>
> #### Example config
>
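> A sketch of a Weights & Biases setup (assuming the built-in logger's
> registered name, `spacy.WandbLogger.v1`, and its `project_name` setting):
>
> ```ini
> [training.logger]
> @loggers = "spacy.WandbLogger.v1"
> project_name = "monitor_spacy_training"
> remove_config_values = ["paths.train", "paths.dev"]
> ```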


@ -914,4 +914,4 @@ mattis pretium.
### Weights & Biases {#wandb} <IntegrationLogo name="wandb" width={175} height="auto" align="right" />
<!-- TODO: decide how we want this to work? Just send results plus config from spacy evaluate in a separate command/script? -->
<!-- TODO: link to WandB logger, explain that it's built-in but that you can also do other cool stuff with WandB? And then include example project (still need to decide what we want to do here) -->


@ -607,8 +607,12 @@ $ python -m spacy train config.cfg --output ./output --code ./functions.py
#### Example: Custom logging function {#custom-logging}
During training, the results of each step are passed to a logger function. By
default, these results are written to the console with the
[`ConsoleLogger`](/api/top-level#ConsoleLogger). There is also built-in support
for writing the log files to [Weights & Biases](https://www.wandb.com/) with the
[`WandbLogger`](/api/top-level#WandbLogger). The logger function receives a
**dictionary** with the following keys:
| Key | Value |
| -------------- | ---------------------------------------------------------------------------------------------- |
@ -619,11 +623,17 @@ dictionary providing the following information:
| `losses` | The accumulated training losses, keyed by component name. ~~Dict[str, float]~~ |
| `checkpoints` | A list of previous results, where each result is a (score, step, epoch) tuple. ~~List[Tuple]~~ |
You can easily implement and plug in your own logger that records the training
results in a custom way, or sends them to an experiment tracker of
your choice. In this example, the function `my_custom_logger.v1` writes the
tabular results to a file:
> ```ini
> ### config.cfg (excerpt)
> [training.logger]
> @loggers = "my_custom_logger.v1"
> file_path = "my_file.tab"
> ```
```python
### functions.py
@ -635,19 +645,19 @@ from pathlib import Path
# Register the function so it can be referenced from the config
@spacy.registry.loggers("my_custom_logger.v1")
def custom_logger(log_path):
    def setup_logger(nlp: "Language") -> Tuple[Callable, Callable]:
        # Write the header row of the tab-separated log file
        with Path(log_path).open("w") as file_:
            file_.write("step\t")
            file_.write("score\t")
            for pipe in nlp.pipe_names:
                file_.write(f"loss_{pipe}\t")
            file_.write("\n")

        def log_step(info: Dict[str, Any]):
            # Append one row per training step: step, score and per-component losses
            with Path(log_path).open("a") as file_:
                file_.write(f"{info['step']}\t")
                file_.write(f"{info['score']}\t")
                for pipe in nlp.pipe_names:
                    file_.write(f"{info['losses'][pipe]}\t")
                file_.write("\n")

        def finalize():
            pass

        return log_step, finalize
@ -657,13 +667,6 @@ def custom_logger(log_path):
    return setup_logger
```
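To try the logger out, make the code available to spaCy by passing the file to
`spacy train` via the `--code` argument, e.g.
`$ python -m spacy train config.cfg --output ./output --code ./functions.py`,
so the `"my_custom_logger.v1"` function can be resolved when the config is
loaded.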
#### Example: Custom batch size schedule {#custom-code-schedule}
For example, let's say you've implemented your own batch size schedule to use