spaCy/spacy/ml/models
Connor Brinton 6dd56868de
📝 Fix formula for receptive field in docs (#12918)
SpaCy's HashEmbedCNN layer performs convolutions over tokens to produce
contextualized embeddings using a `MaxoutWindowEncoder` layer. These
convolutions are implemented using Thinc's `expand_window` layer, which
concatenates `window_size` neighboring sequence items on either side of
the sequence item being processed. This is repeated across `depth`
convolutional layers.

For example, consider the sequence "ABCDE" and a `MaxoutWindowEncoder`
layer with a context window of 1 and a depth of 2. We'll focus on the
token "C". We can visually represent the contextual embedding produced
for "C" as:
```mermaid
flowchart LR
A0(A<sub>0</sub>)
B0(B<sub>0</sub>)
C0(C<sub>0</sub>)
D0(D<sub>0</sub>)
E0(E<sub>0</sub>)
B1(B<sub>1</sub>)
C1(C<sub>1</sub>)
D1(D<sub>1</sub>)
C2(C<sub>2</sub>)
A0 --> B1
B0 --> B1
C0 --> B1
B0 --> C1
C0 --> C1
D0 --> C1
C0 --> D1
D0 --> D1
E0 --> D1
B1 --> C2
C1 --> C2
D1 --> C2
```

Described in words, this graph shows that before the first layer of the
convolution, the "receptive field" centered at each token consists only
of that same token. That is to say, that we have a receptive field of 1.
The first layer of the convolution adds one neighboring token on either
side to the receptive field. Since this is done on both sides, the
receptive field increases by 2, giving the first layer a receptive field
of 3. The second layer of the convolutions adds an _additional_
neighboring token on either side to the receptive field, giving a final
receptive field of 5.

However, this doesn't match the formula currently given in the docs,
which read:
> The receptive field of the CNN will be
> `depth * (window_size * 2 + 1)`, so a 4-layer network with a window
> size of `2` will be sensitive to 20 words at a time.

Substituting in our depth of 2 and window size of 1, this formula gives
us a receptive field of:
```
depth * (window_size * 2 + 1)
= 2 * (1 * 2 + 1)
= 2 * (2 + 1)
= 2 * 3
= 6
```

This not only doesn't match our computations from above, it's also an
even number! This is suspicious, since the receptive field is supposed
to be centered on a token, and not between tokens. Generally, this
formula results in an even number for any even value of `depth`.

The error in this formula is that the adjustment for the center token
is multiplied by the depth, when it should occur only once. The
corrected formula, `depth * window_size * 2 + 1`, gives the correct
value for our small example from above:
```
depth * window_size * 2 + 1
= 2 * 1 * 2 + 1
= 4 + 1
= 5
```

These changes update the docs to correct the receptive field formula and
the example receptive field size.
2023-08-21 10:52:32 +02:00
..
__init__.py SpanFinder into spaCy from experimental (#12507) 2023-06-07 15:52:28 +02:00
entity_linker.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
multi_task.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
parser.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
span_finder.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
spancat.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tagger.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
textcat.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tok2vec.py 📝 Fix formula for receptive field in docs (#12918) 2023-08-21 10:52:32 +02:00