mirror of
https://github.com/explosion/spaCy.git
synced 2024-11-11 12:18:04 +03:00
20 lines
1.0 KiB
Plaintext
//- 💫 DOCS > API > TEXTCATEGORIZER

include ../_includes/_mixins

p
    | The model supports classification with multiple, non-mutually exclusive
    | labels. You can change the model architecture, but by default the
    | #[code TextCategorizer] class uses a convolutional neural network to
    | assign position-sensitive vectors to each word in the document. This
    | step is similar to the #[+api("tensorizer") #[code Tensorizer]]
    | component, but the #[code TextCategorizer] uses its own CNN model to
    | avoid sharing weights with the other pipeline components. The document
    | tensor is then summarized by concatenating max and mean pooling, and a
    | multilayer perceptron predicts an output vector of length
    | #[code nr_class], to which a logistic activation is applied
    | elementwise. The value of each output neuron is the probability that
    | the corresponding class is present.
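
p
    | The pooling and output steps described above can be sketched in plain
    | Python. This is a minimal illustration under stated assumptions, not
    | spaCy's implementation: the function names, the toy document tensor
    | and the weights are all hypothetical, and a single linear layer
    | stands in for the multilayer perceptron.

```python
import math


def max_mean_pool(token_vectors):
    """Concatenate the per-dimension max and mean over all tokens."""
    columns = list(zip(*token_vectors))
    max_pooled = [max(col) for col in columns]
    mean_pooled = [sum(col) / len(col) for col in columns]
    return max_pooled + mean_pooled


def sigmoid(x):
    """Logistic activation, applied to one score at a time."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_label_probs(token_vectors, weights, biases):
    """Hypothetical stand-in for the MLP: one linear layer, then a sigmoid."""
    pooled = max_mean_pool(token_vectors)
    scores = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    return [sigmoid(s) for s in scores]


# Hypothetical document tensor: 3 tokens, each with a 2-dimensional vector.
doc_tensor = [[1.0, -2.0], [3.0, 0.0], [-1.0, 2.0]]
# Hypothetical parameters for nr_class = 2 (4 pooled inputs, 2 outputs).
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
biases = [-3.0, 0.0]

pooled = max_mean_pool(doc_tensor)
probs = predict_label_probs(doc_tensor, weights, biases)
print(pooled)
print(probs)
```

p
    | Because the logistic activation is applied to each output separately
    | rather than normalizing across outputs, the probabilities do not have
    | to sum to 1, which is what allows several labels to be present at once.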
!=partial("pipe", { subclass: "TextCategorizer", short: "textcat", pipeline_id: "textcat" })