mirror of https://github.com/explosion/spaCy.git

update REL model code

parent 99d0412b6e
commit 124f49feb6

@@ -540,7 +540,8 @@ code to create the ML model and the pipeline component from scratch.
It contains two config files to train the model:
one to run on CPU with a Tok2Vec layer, and one for the GPU using a transformer.
The project applies the relation extraction component to identify biomolecular
interactions, but you can easily swap in your own dataset for your experiments
in any other domain.
</Project>

#### Step 1: Implementing the Model {#component-rel-model}

@@ -558,40 +559,17 @@ matrix** (~~Floats2d~~) of predictions:

```python
### Register the model architecture
@spacy.registry.architectures.register("rel_model.v1")
def create_relation_model(...) -> Model[List[Doc], Floats2d]:
    model = ...  # 👈 model will go here
    return model
```

We will adopt a **modular approach** to the definition of this relation model, and
define it by chaining two layers together: a first layer that generates an
instance tensor from a given set of documents, and a second layer that
transforms this tensor into a final tensor holding the predictions:

> #### config.cfg (excerpt)
>
> ```ini
> [model]
> @architectures = "rel_model.v1"
>
> [model.create_instance_tensor]
> # ...
>
> [model.classification_layer]
> # ...
> ```

```python
### Implement the model architecture
@spacy.registry.architectures.register("rel_model.v1")
def create_relation_model(
    create_instance_tensor: Model[List[Doc], Floats2d],
    classification_layer: Model[Floats2d, Floats2d],
) -> Model[List[Doc], Floats2d]:
    model = chain(create_instance_tensor, classification_layer)
    return model
```

The `classification_layer` could be something simple like a Linear layer
followed by a logistic activation function:

> #### config.cfg (excerpt)
>
> ```ini
> [model.classification_layer]
> @architectures = "rel_classification_layer.v1"
> nI = null
> nO = null
> ```

```python
### Implement the classification layer
@spacy.registry.architectures.register("rel_classification_layer.v1")
def create_classification_layer(
    nO: int = None, nI: int = None
) -> Model[Floats2d, Floats2d]:
    return chain(Linear(nO=nO, nI=nI), Logistic())
```
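
As a quick sanity check, such a layer can be created, initialized and run on a
dummy batch. This is just a sketch with arbitrary dimensions, composing the same
layers as `create_classification_layer` above; it is not part of the project code:

```python
### Trying the classification layer on dummy data (sketch)
import numpy
from thinc.api import Linear, Logistic, chain

# Same composition as create_classification_layer(nO=3, nI=4) above
layer = chain(Linear(nO=3, nI=4), Logistic())
X = numpy.zeros((2, 4), dtype="f")  # two candidate instances, four features each
Y = numpy.zeros((2, 3), dtype="f")  # three relation labels
layer.initialize(X=X, Y=Y)
scores, backprop = layer(X, is_train=True)
# scores has shape (2, 3); the logistic activation squashes each value into (0, 1)
```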

The first layer that **creates the instance tensor** can be defined
by implementing a
[custom forward function](https://thinc.ai/docs/usage-models#weights-layers-forward)
with an appropriate backpropagation callback. We also define an
[initialization method](https://thinc.ai/docs/usage-models#weights-layers-init)
that ensures that the layer is properly set up for training.

```python
### Implement the custom forward function
def instance_forward(
    model: Model[List[Doc], Floats2d],
    docs: List[Doc],
    is_train: bool
) -> Tuple[Floats2d, Callable]:
    ...
    tok2vec = model.get_ref("tok2vec")
    tokvecs, bp_tokvecs = tok2vec(docs, is_train)
    relations = ...

    def backprop(d_relations: Floats2d) -> List[Doc]:
        d_tokvecs = ...
        return bp_tokvecs(d_tokvecs)

    return relations, backprop


### Implement the custom initialization method
def instance_init(
    model: Model,
    X: List[Doc] = None,
    Y: Floats2d = None
) -> Model:
    tok2vec = model.get_ref("tok2vec")
    tok2vec.initialize(X)
    return model


### Implement the layer that creates the instance tensor
@spacy.registry.architectures.register("rel_instance_tensor.v1")
def create_tensors(
    tok2vec: Model[List[Doc], List[Floats2d]],
    pooling: Model[Ragged, Floats2d],
    get_instances: Callable[[Doc], List[Tuple[Span, Span]]],
) -> Model[List[Doc], Floats2d]:

    return Model(
        "instance_tensors",
        instance_forward,
        layers=[tok2vec, pooling],
        refs={"tok2vec": tok2vec, "pooling": pooling},
        attrs={"get_instances": get_instances},
        init=instance_init,
    )
```

> #### config.cfg (excerpt)
>
> ```ini
> [model.create_instance_tensor]
> @architectures = "rel_instance_tensor.v1"
>
> [model.create_instance_tensor.tok2vec]
> @architectures = "spacy.HashEmbedCNN.v1"
> # ...
>
> [model.create_instance_tensor.pooling]
> @layers = "reduce_mean.v1"
>
> [model.create_instance_tensor.get_instances]
> # ...
> ```

This custom layer uses an
**[embedding layer](/usage/embeddings-transformers)** such as a
[`Tok2Vec`](/api/tok2vec) component or a [`Transformer`](/api/transformer). This
layer is assumed to be of type ~~Model[List[Doc], List[Floats2d]]~~ as it
transforms each **document into a list of tokens**, with each token being
represented by its embedding in the vector space.

The **`pooling`** layer summarizes the token vectors into entity vectors, as
named entities (represented by `Span` objects) can consist of one or multiple
tokens. For instance, the pooling layer could calculate the average of all
token vectors in an entity. Thinc provides several
[built-in pooling operators](https://thinc.ai/docs/api-layers#reduction-ops) for
this purpose.
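
As an illustration, here is a toy sketch with made-up vectors (not part of the
project code) showing how Thinc's `reduce_mean` maps a ragged batch of token
vectors to one averaged vector per entity:

```python
### Pooling token vectors into entity vectors (sketch)
import numpy
from thinc.api import reduce_mean
from thinc.types import Ragged

pooling = reduce_mean()
# Five token vectors (dimension 2), belonging to two entities of two and three tokens
tokvecs = numpy.arange(10, dtype="f").reshape(5, 2)
lengths = numpy.asarray([2, 3], dtype="i")
entity_vectors, backprop = pooling(Ragged(tokvecs, lengths), is_train=False)
# entity_vectors has shape (2, 2): the mean token vector for each entity
```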

> #### config.cfg (excerpt)
>
> ```ini
> [model.create_instance_tensor.get_instances]
> @misc = "rel_instance_generator.v1"
> max_length = 100
> ```

Finally, we need a `get_instances` method that **generates pairs of entities**
that we want to classify as being related or not. As these candidate pairs are
typically formed within one document, this function takes a [`Doc`](/api/doc)
as input and outputs a `List` of `Span` tuples. For instance, this
implementation takes any two entities from the same document, as long as they
are within a **maximum distance** (in number of tokens) of each other:

```python
### Simple candidate generation
@spacy.registry.misc.register("rel_instance_generator.v1")
def create_candidate_indices(max_length: int) -> Callable[[Doc], List[Tuple[Span, Span]]]:
    def get_candidates(doc: "Doc") -> List[Tuple[Span, Span]]:
        candidates = []
@@ -621,46 +733,19 @@ def create_candidate_indices(max_length: int) -> Callable[[Doc], List[Tuple[Span
        return candidates
    return get_candidates
```

This function is added to the
[`@misc` registry](/api/top-level#registry) so we can refer to it from the
config, and easily swap it out for any other candidate generation function.
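
To see what the generator produces, you could run it over a small example
document. A sketch, assuming a pipeline with a NER component such as
`en_core_web_sm` is installed and `create_candidate_indices` from the snippet
above is in scope:

```python
### Trying out the instance generator (sketch)
import spacy

nlp = spacy.load("en_core_web_sm")  # any pipeline with a NER component will do
doc = nlp("Acme Corp Inc. hired Alex Smith in New York.")
get_candidates = create_candidate_indices(max_length=100)  # defined above
for ent1, ent2 in get_candidates(doc):
    print(ent1.text, "<->", ent2.text)
```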

When creating this model, we store the custom functions as
[attributes](https://thinc.ai/docs/api-model#properties) and the sublayers as
references, so we can access them easily:

```python
pooling = model.get_ref("pooling")
tok2vec = model.get_ref("tok2vec")
get_instances = model.attrs["get_instances"]
```
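
The `refs` and `attrs` mechanism itself is easy to try in isolation. A minimal
toy sketch (not part of the REL code) of how a Thinc `Model` stores and
retrieves layer references and attributes:

```python
### Refs and attrs on a Thinc model (sketch)
from thinc.api import Linear, Model

inner = Linear(nO=2, nI=2)
demo = Model(
    "demo",
    lambda model, X, is_train: (X, lambda dY: dY),  # identity forward pass
    layers=[inner],
    refs={"tok2vec": inner},
    attrs={"get_instances": lambda doc: []},
)
assert demo.get_ref("tok2vec") is inner
assert demo.attrs["get_instances"]("some doc") == []
```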

#### Step 2: Implementing the pipeline component {#component-rel-pipe}

@@ -935,5 +1020,6 @@ code to create the ML model and the pipeline component from scratch.
It contains two config files to train the model:
one to run on CPU with a Tok2Vec layer, and one for the GPU using a transformer.
The project applies the relation extraction component to identify biomolecular
interactions, but you can easily swap in your own dataset for your experiments
in any other domain.
</Project>