Merge branch 'master' of ssh://github.com/explosion/spaCy

Matthew Honnibal 2016-11-01 12:26:15 +01:00
commit 41b7014c70
2 changed files with 24 additions and 23 deletions

@ -1,14 +1,14 @@
<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a> <a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
# A Decomposable Attention Model for Natural Language Inference # A decomposable attention model for Natural Language Inference
**by Matthew Honnibal, [@honnibal](https://github.com/honnibal)** **by Matthew Honnibal, [@honnibal](https://github.com/honnibal)**
This directory contains an implementation of entailment prediction model described This directory contains an implementation of the entailment prediction model described
by [Parikh et al. (2016)](https://arxiv.org/pdf/1606.01933.pdf). The model is notable by [Parikh et al. (2016)](https://arxiv.org/pdf/1606.01933.pdf). The model is notable
for its competitive performance with very few parameters. for its competitive performance with very few parameters.
The model is implemented using [Keras](https://keras.io/) and [spaCy](https://spacy.io). The model is implemented using [Keras](https://keras.io/) and [spaCy](https://spacy.io).
Keras is used to build and train the network, while spaCy is used to load Keras is used to build and train the network. spaCy is used to load
the [GloVe](http://nlp.stanford.edu/projects/glove/) vectors, perform the the [GloVe](http://nlp.stanford.edu/projects/glove/) vectors, perform the
feature extraction, and help you apply the model at run-time. The following feature extraction, and help you apply the model at run-time. The following
demo code shows how the entailment model can be used at runtime, once the demo code shows how the entailment model can be used at runtime, once the
@ -35,20 +35,16 @@ lots of ways to extend the model.
## What's where ## What's where
* `keras_parikh_entailment/__main__.py`: The script that will be executed. | File | Description |
Defines the CLI, the data reading, etc — all the boring stuff. | --- | --- |
| `__main__.py` | The script that will be executed. Defines the CLI, the data reading, etc — all the boring stuff. |
* `keras_parikh_entailment/spacy_hook.py`: Provides a class `SimilarityShim` | `spacy_hook.py` | Provides a class `SimilarityShim` that lets you use an arbitrary function to customize spaCy's `doc.similarity()` method. Instead of the default average-of-vectors algorithm, when you call `doc1.similarity(doc2)`, you'll get the result of `your_model(doc1, doc2)`. |
that lets you use an arbitrary function to customize spaCy's | `keras_decomposable_attention.py` | Defines the neural network model. |
`doc.similarity()` method. Instead of the default average-of-vectors algorithm,
when you call `doc1.similarity(doc2)`, you'll get the result of `your_model(doc1, doc2)`.
* `keras_parikh_entailment/keras_decomposable_attention.py`: Defines the neural network model.
This part knows nothing of spaCy --- its ordinary Keras usage.
## Setting up ## Setting up
First, install keras, spaCy and the spaCy English models (about 1GB of data): First, install [Keras](https://keras.io/), [spaCy](https://spacy.io) and the spaCy
English models (about 1GB of data):
```bash ```bash
pip install keras spacy pip install keras spacy
@ -56,11 +52,11 @@ python -m spacy.en.download
``` ```
You'll also want to get keras working on your GPU. This will depend on your You'll also want to get keras working on your GPU. This will depend on your
set up, so you're mostly on your own for this step. If you're using AWS, try the NVidia set up, so you're mostly on your own for this step. If you're using AWS, try the
AMI. It made things pretty easy. [NVidia AMI](https://aws.amazon.com/marketplace/pp/B00FYCDDTE). It made things pretty easy.
Once you've installed the dependencies, you can run a small preliminary test of Once you've installed the dependencies, you can run a small preliminary test of
the keras model: the Keras model:
```bash ```bash
py.test keras_parikh_entailment/keras_decomposable_attention.py py.test keras_parikh_entailment/keras_decomposable_attention.py
@ -69,14 +65,12 @@ py.test keras_parikh_entailment/keras_decomposable_attention.py
This compiles the model and fits it with some dummy data. You should see that This compiles the model and fits it with some dummy data. You should see that
both tests passed. both tests passed.
Finally, download the Stanford Natural Language Inference corpus. Finally, download the [Stanford Natural Language Inference corpus](http://nlp.stanford.edu/projects/snli/).
Source: http://nlp.stanford.edu/projects/snli/
## Running the example ## Running the example
You can run the `keras_parikh_entailment/` directory as a script, which executes the file You can run the `keras_parikh_entailment/` directory as a script, which executes the file
`keras_parikh_entailment/__main__.py`. The first thing you'll want to do is train the model: [`keras_parikh_entailment/__main__.py`](__main__.py). The first thing you'll want to do is train the model:
```bash ```bash
python keras_parikh_entailment/ train <your_model_dir> <train_directory> <dev_directory> python keras_parikh_entailment/ train <your_model_dir> <train_directory> <dev_directory>
@ -95,4 +89,5 @@ you how run-time usage will eventually look.
## Getting updates ## Getting updates
We should have the blog post explaining the model ready before the end of the week. To get We should have the blog post explaining the model ready before the end of the week. To get
notified when it's published, you can either the follow me on Twitter, or subscribe to our mailing list. notified when it's published, you can either the follow me on [Twitter](https://twitter.com/honnibal),
or subscribe to our [mailing list](http://eepurl.com/ckUpQ5).

@ -214,6 +214,12 @@
"author": "Matthew Honnibal", "author": "Matthew Honnibal",
"tags": [ "keras", "sentiment" ] "tags": [ "keras", "sentiment" ]
}, },
"A decomposable attention model for Natural Language Inference": {
"url": "https://github.com/explosion/spaCy/tree/master/examples/keras_parikh_entailment",
"author": "Matthew Honnibal",
"tags": [ "keras", "similarity" ]
},
"Using the German model": { "Using the German model": {
"url": "https://explosion.ai/blog/german-model", "url": "https://explosion.ai/blog/german-model",
"author": "Wolfgang Seeker", "author": "Wolfgang Seeker",
@ -263,7 +269,7 @@
"tags": [ "big data" ] "tags": [ "big data" ]
}, },
"Inventory count": { "Inventory count": {
"url": "https://github.com/explosion/spaCy/tree/master/examples/InventoryCount", "url": "https://github.com/explosion/spaCy/tree/master/examples/inventory_count",
"author": "Oleg Zd" "author": "Oleg Zd"
}, },
"Multi-word matches": { "Multi-word matches": {