Mirror of https://github.com/explosion/spaCy.git
Synced 2025-07-10 16:22:29 +03:00

Merge branch 'master' of ssh://github.com/explosion/spaCy
This commit is contained in: commit 18aab4f71e

@ -1,77 +1,98 @@

<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
# A Decomposable Attention Model for Natural Language Inference

**by Matthew Honnibal, [@honnibal](https://github.com/honnibal)**
This directory contains an implementation of the entailment prediction model described
by [Parikh et al. (2016)](https://arxiv.org/pdf/1606.01933.pdf). The model is notable
for its competitive performance with very few parameters.
The model is implemented using [Keras](https://keras.io/) and [spaCy](https://spacy.io).
Keras is used to build and train the network, while spaCy is used to load
the [GloVe](http://nlp.stanford.edu/projects/glove/) vectors, perform the
feature extraction, and help you apply the model at run-time. The following
demo code shows how the entailment model can be used at runtime, once the
hook is installed to customise the `.similarity()` method of spaCy's `Doc`
and `Span` objects:
```python
import spacy

# create_similarity_pipeline is defined by this example, and installs the
# entailment model as the similarity hook.
def demo(model_dir):
    nlp = spacy.load('en', path=model_dir,
                     create_pipeline=create_similarity_pipeline)
    doc1 = nlp(u'Worst fries ever! Greasy and horrible...')
    doc2 = nlp(u'The milkshakes are good. The fries are bad.')
    print(doc1.similarity(doc2))
    sent1a, sent1b = doc1.sents
    print(sent1a.similarity(sent1b))
    print(sent1a.similarity(doc2))
    print(sent1b.similarity(doc2))
```
I'm working on a blog post to explain Parikh et al.'s model in more detail.
I think it is a very interesting example of the attention mechanism, which
I didn't understand very well before working through this paper. There are
lots of ways to extend the model.
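The core of the attention mechanism is a soft alignment between the two sentences. As a rough NumPy sketch of the paper's "attend" step (illustrative only: the identity stands in for the learned feed-forward transform F, and the function names are mine, not the example's):

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(a, b):
    """Soft-alignment ("attend") step of decomposable attention.
    a: (len_a, dim) and b: (len_b, dim) matrices of word vectors.
    The paper applies a small feed-forward network F before the dot
    products; identity stands in for F here."""
    scores = a @ b.T                         # e_ij = F(a_i) . F(b_j)
    # Each word in one sentence gets a convex combination of the other.
    aligned_b = softmax(scores, axis=1) @ b  # (len_a, dim)
    aligned_a = softmax(scores.T, axis=1) @ a  # (len_b, dim)
    return aligned_b, aligned_a
```

The aligned vectors are then compared word-by-word with their originals and aggregated, which is what keeps the parameter count so low.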
## What's where

* `keras_parikh_entailment/__main__.py`: The script that will be executed.
  Defines the CLI, the data reading, etc — all the boring stuff.

* `keras_parikh_entailment/spacy_hook.py`: Provides a class `SimilarityShim`
  that lets you use an arbitrary function to customize spaCy's
  `doc.similarity()` method. Instead of the default average-of-vectors algorithm,
  when you call `doc1.similarity(doc2)`, you'll get the result of `your_model(doc1, doc2)`.

* `keras_parikh_entailment/keras_decomposable_attention.py`: Defines the neural
  network model. This part knows nothing of spaCy — it's ordinary Keras usage.
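The idea behind the hook can be sketched in a few lines of plain Python. This is a toy illustration, not the example's actual class: the overlap function stands in for the trained entailment network, and the real shim plugs into spaCy's pipeline rather than being called directly.

```python
class SimilarityShim(object):
    """Toy sketch: wraps an arbitrary scoring function, so that computing
    similarity(doc1, doc2) returns your_model(doc1, doc2) instead of the
    default average-of-vectors measure."""
    def __init__(self, get_similarity):
        self.get_similarity = get_similarity

    def __call__(self, doc1, doc2):
        return self.get_similarity(doc1, doc2)

# Stand-in for the trained entailment network: token-overlap ratio.
def toy_model(text1, text2):
    a, b = set(text1.lower().split()), set(text2.lower().split())
    return len(a & b) / float(len(a | b))

similarity = SimilarityShim(toy_model)
print(similarity('The fries are bad.', 'The fries are good.'))  # 0.6
```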
## Setting up

First, install Keras, spaCy and the spaCy English models (about 1GB of data):

```bash
pip install keras spacy
python -m spacy.en.download
```
You'll also want to get Keras working on your GPU. This will depend on your
setup, so you're mostly on your own for this step. If you're using AWS, try the
NVidia AMI. It made things pretty easy.
Once you've installed the dependencies, you can run a small preliminary test of
the Keras model:

```bash
py.test keras_parikh_entailment/keras_decomposable_attention.py
```
This compiles the model and fits it with some dummy data. You should see that
both tests pass.

Finally, download the Stanford Natural Language Inference corpus:
http://nlp.stanford.edu/projects/snli/
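The corpus ships as JSON-lines files, one example per line with `sentence1`, `sentence2` and a `gold_label` field (field names per the SNLI release). A minimal reader might look like this — a sketch only; the example's actual loader lives in `__main__.py`:

```python
import json

def read_snli(lines):
    """Yield (premise, hypothesis, label) triples from SNLI .jsonl lines,
    skipping pairs where the annotators reached no consensus ('-')."""
    for line in lines:
        eg = json.loads(line)
        if eg['gold_label'] == '-':
            continue
        yield eg['sentence1'], eg['sentence2'], eg['gold_label']

sample = ('{"gold_label": "contradiction", '
          '"sentence1": "A man inspects his uniform.", '
          '"sentence2": "The man is sleeping."}')
print(list(read_snli([sample])))
```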
## Running the example

You can run the `keras_parikh_entailment/` directory as a script, which executes
the file `keras_parikh_entailment/__main__.py`. The first thing you'll want to do
is train the model:

```bash
python keras_parikh_entailment/ train <your_model_dir> <train_directory> <dev_directory>
```
Training takes about 300 epochs for full accuracy, and I haven't rerun the full
experiment since refactoring things to publish this example — please let me
know if I've broken something. You should get to at least 85% on the development data.
The other two modes demonstrate run-time usage. I never like relying on the
accuracy printed by `.fit()` methods, and I don't really feel confident until
I've run a new process that loads the model and starts making predictions,
without access to the gold labels. I've therefore included an `evaluate` mode.
Finally, there's also a little demo, which mostly exists to show you how
run-time usage will eventually look.
Both modes follow the same pattern as training:

```bash
python keras_parikh_entailment/ evaluate <your_model_dir> <dev_directory>
python keras_parikh_entailment/ demo <your_model_dir>
```

## Getting updates

We should have the blog post explaining the model ready before the end of the week.
To get notified when it's published, you can either follow me on Twitter or
subscribe to our mailing list.
@ -27,7 +27,7 @@ The docs can always use another example or more detail, and they should always b

While all page content lives in the `.jade` files, article meta (page titles, sidebars etc.) is stored as JSON. Each folder contains a `_data.json` with all required meta for its files.

For simplicity, all sites linked in the [tutorials](https://spacy.io/docs/usage/tutorials) and [showcase](https://spacy.io/docs/usage/showcase) are also stored as JSON. So in order to edit those pages, there's no need to dig into the Jade files – simply edit the [`_data.json`](docs/usage/_data.json).

### Markup language and conventions
@ -54,7 +54,7 @@ Note that for external links, `+a("...")` is used instead of `a(href="...")` –

### Mixins

Each file includes a collection of [custom mixins](_includes/_mixins.jade) that make it easier to add content components – no HTML or class names required.

For example:

```pug
@ -89,7 +89,7 @@ Code blocks are implemented using the `+code` or `+aside-code` (to display them

en_doc = en_nlp(u'Hello, world. Here are two sentences.')
```

You can find the documentation for the available mixins in [`_includes/_mixins.jade`](_includes/_mixins.jade).

### Linking to the Github repo
@ -11,7 +11,6 @@

    "COMPANY": "Explosion AI",
    "COMPANY_URL": "https://explosion.ai",
    "DEMOS_URL": "https://demos.explosion.ai",

    "SPACY_VERSION": "1.1",

    "SOCIAL": {
@ -20,15 +19,6 @@

        "reddit": "spacynlp"
    },

    "NAVIGATION": {
        "Home": "/",
        "Docs": "/docs",
@ -55,6 +45,16 @@

        "Blog": "https://explosion.ai/blog",
        "Contact": "mailto:contact@explosion.ai"
        }
    },

    "V_CSS": "1.4",
    "V_JS": "1.0",
    "DEFAULT_SYNTAX" : "python",
    "ANALYTICS": "UA-58931649-1",

    "MAILCHIMP": {
        "user": "spacy.us12",
        "id": "83b0498b1e7fa3c91ce68c3f1",
        "list": "89ad33e698"
    }
}
@ -37,10 +37,10 @@ html(lang="en")

        link(rel="icon" type="image/x-icon" href="/assets/img/favicon.ico")

        if SUBSECTION == "usage"
            link(href="/assets/css/style_red.css?v#{V_CSS}" rel="stylesheet")

        else
            link(href="/assets/css/style.css?v#{V_CSS}" rel="stylesheet")

    body
        include _includes/_navigation
@ -52,8 +52,8 @@ html(lang="en")

        main!=yield
        include _includes/_footer

        script(src="/assets/js/main.js?v#{V_JS}", type="text/javascript")
        script(src="/assets/js/prism.js", type="text/javascript")

        if environment == "deploy"
            script
@ -60,7 +60,7 @@

    background: $color-back
    border-radius: 2px
    border: 1px solid $color-subtle
    padding: 3.5% 2.5%

//- Icons