Merge branch 'master' of ssh://github.com/explosion/spaCy

Matthew Honnibal 2016-11-01 03:05:49 +01:00
commit 18aab4f71e
5 changed files with 89 additions and 68 deletions

View File

<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
# A Decomposable Attention Model for Natural Language Inference
**by Matthew Honnibal, [@honnibal](https://github.com/honnibal)**
This directory contains an implementation of the entailment prediction model described
by [Parikh et al. (2016)](https://arxiv.org/pdf/1606.01933.pdf). The model is notable
for its competitive performance with very few parameters.
The model is implemented using [Keras](https://keras.io/) and [spaCy](https://spacy.io).
Keras is used to build and train the network, while spaCy is used to load
the [GloVe](http://nlp.stanford.edu/projects/glove/) vectors, perform the
feature extraction, and help you apply the model at run-time. The following
demo code shows how the entailment model can be used at runtime, once the
hook is installed to customise the `.similarity()` method of spaCy's `Doc`
and `Span` objects:
```python
def demo(model_dir):
    # Load spaCy with the custom pipeline, which installs the entailment
    # model as the .similarity() hook on Doc and Span objects.
    nlp = spacy.load('en', path=model_dir,
                     create_pipeline=create_similarity_pipeline)
    doc1 = nlp(u'Worst fries ever! Greasy and horrible...')
    doc2 = nlp(u'The milkshakes are good. The fries are bad.')
    # These calls now return the model's prediction, not the default
    # average-of-vectors cosine similarity.
    print(doc1.similarity(doc2))
    sent1a, sent1b = doc1.sents
    print(sent1a.similarity(sent1b))
    print(sent1a.similarity(doc2))
    print(sent1b.similarity(doc2))
```
I'm working on a blog post to explain Parikh et al.'s model in more detail.
I think it is a very interesting example of the attention mechanism, which
I didn't understand very well before working through this paper. There are
lots of ways to extend the model.
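
In the meantime, here's a rough numpy sketch of the model's central "attend" step, to give you the flavour. This is illustrative only, not the code the example runs: the learned projection of the word vectors is omitted, and the shapes are simplified.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(a, b):
    # a: (len_a, dim) and b: (len_b, dim) matrices of word vectors.
    # Score every word pair, then normalise each way, so that each word
    # is softly aligned with a weighted sum of the other sentence.
    scores = a.dot(b.T)                       # e_ij = a_i . b_j
    beta = softmax(scores, axis=1).dot(b)     # subphrase of b aligned to each a_i
    alpha = softmax(scores, axis=0).T.dot(a)  # subphrase of a aligned to each b_j
    return beta, alpha
```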
## What's where
* `keras_parikh_entailment/__main__.py`: The script that will be executed.
Defines the CLI, the data reading, etc — all the boring stuff.
* `keras_parikh_entailment/spacy_hook.py`: Provides a class `SimilarityShim`
that lets you use an arbitrary function to customize spaCy's
`doc.similarity()` method. Instead of the default average-of-vectors algorithm,
when you call `doc1.similarity(doc2)`, you'll get the result of `your_model(doc1, doc2)`
(a sketch of the idea follows this list).
* `keras_parikh_entailment/keras_decomposable_attention.py`: Defines the neural network model.
This part knows nothing of spaCy --- it's ordinary Keras usage.
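
To give you an idea of how the hook works, here's a minimal sketch of the idea behind `SimilarityShim`, using spaCy's `user_hooks` mechanism. The real implementation is in `spacy_hook.py`; the `get_features` callback below is a stand-in for the actual feature extraction:

```python
class SimilarityShim(object):
    """Sketch: install an arbitrary model as spaCy's similarity hook."""
    def __init__(self, model, get_features):
        self.model = model
        self.get_features = get_features  # stand-in for the real extraction

    def __call__(self, doc):
        # Run as a pipeline component: register the hook on the Doc, so
        # that Doc and Span .similarity() calls route to self.predict.
        doc.user_hooks['similarity'] = self.predict
        doc.user_span_hooks['similarity'] = self.predict

    def predict(self, doc1, doc2):
        # Replaces the default average-of-vectors similarity with
        # whatever the wrapped model says about the pair.
        return self.model.predict(self.get_features(doc1, doc2))
```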
## Setting up
First, install keras, spaCy and the spaCy English models (about 1GB of data):
```bash
pip install keras spacy
python -m spacy.en.download
```
This will give you spaCy's tokenization, tagging, NER and parsing models, as well
as the GloVe word vectors. You'll also want to get Keras working on your GPU.
This will depend on your setup, so you're mostly on your own for this step.
If you're using AWS, try the NVidia AMI. It made things pretty easy.
Once you've installed the dependencies, you can run a small preliminary test of
the keras model:
```bash
py.test keras_parikh_entailment/keras_decomposable_attention.py
```
This compiles the model and fits it with some dummy data. You should see that
both tests pass.
Finally, download the Stanford Natural Language Inference corpus.
Source: http://nlp.stanford.edu/projects/snli/
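
At the time of writing, the direct download from that page looks like this (check the page above if the link has moved):

```bash
wget http://nlp.stanford.edu/projects/snli/snli_1.0.zip
unzip snli_1.0.zip
```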
## Running the example
You can run the `keras_parikh_entailment/` directory as a script, which executes the file
`keras_parikh_entailment/__main__.py`. The first thing you'll want to do is train the model:
```bash
python keras_parikh_entailment/ train <your_model_dir> <train_directory> <dev_directory>
```
Training takes about 300 epochs for full accuracy, and I haven't rerun the full
experiment since refactoring things to publish this example --- please let me
know if I've broken something. You should get to at least 85% on the development data.
The other two modes demonstrate run-time usage. I never like relying on the accuracy printed
by `.fit()` methods. I never really feel confident until I've run a new process that loads
the model and starts making predictions, without access to the gold labels. I've therefore
included an `evaluate` mode. Finally, there's also a little demo, which mostly exists to show
you how run-time usage will eventually look.
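
Both of these run from the same entry point as training:

```bash
python keras_parikh_entailment/ evaluate <your_model_dir> <dev_directory>
python keras_parikh_entailment/ demo <your_model_dir>
```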
## Getting updates
We should have the blog post explaining the model ready before the end of the week. To get
notified when it's published, you can either follow me on Twitter, or subscribe to our mailing list.

View File

While all page content lives in the `.jade` files, article meta (page titles, sidebars etc.) is stored as JSON. Each folder contains a `_data.json` with all required meta for its files.
For simplicity, all sites linked in the [tutorials](https://spacy.io/docs/usage/tutorials) and [showcase](https://spacy.io/docs/usage/showcase) are also stored as JSON. So in order to edit those pages, there's no need to dig into the Jade files; simply edit the [`_data.json`](docs/usage/_data.json).
### Markup language and conventions
Note that for external links, `+a("...")` is used instead of `a(href="...")`.
### Mixins
Each file includes a collection of [custom mixins](_includes/_mixins.jade) that make it easier to add content components, no HTML or class names required.
You can find the documentation for the available mixins in [`_includes/_mixins.jade`](_includes/_mixins.jade).
### Linking to the Github repo

View File

"COMPANY": "Explosion AI",
"COMPANY_URL": "https://explosion.ai",
"DEMOS_URL": "https://demos.explosion.ai",
"SPACY_VERSION": "1.1",
"SOCIAL": {
"reddit": "spacynlp"
},
"SCRIPTS" : [ "main", "prism" ],
"DEFAULT_SYNTAX" : "python",
"ANALYTICS": "UA-58931649-1",
"MAILCHIMP": {
"user": "spacy.us12",
"id": "83b0498b1e7fa3c91ce68c3f1",
"list": "89ad33e698"
},
"NAVIGATION": {
"Home": "/",
"Docs": "/docs",
"Blog": "https://explosion.ai/blog",
"Contact": "mailto:contact@explosion.ai"
}
},
"V_CSS": "1.4",
"V_JS": "1.0",
"DEFAULT_SYNTAX" : "python",
"ANALYTICS": "UA-58931649-1",
"MAILCHIMP": {
"user": "spacy.us12",
"id": "83b0498b1e7fa3c91ce68c3f1",
"list": "89ad33e698"
}
}

View File

link(rel="icon" type="image/x-icon" href="/assets/img/favicon.ico")
if SUBSECTION == "usage"
    link(href="/assets/css/style_red.css?v#{V_CSS}" rel="stylesheet")
else
    link(href="/assets/css/style.css?v#{V_CSS}" rel="stylesheet")
body
include _includes/_navigation
main!=yield
include _includes/_footer
script(src="/assets/js/main.js?v#{V_JS}", type="text/javascript")
script(src="/assets/js/prism.js", type="text/javascript")
if environment == "deploy"
script

View File

background: $color-back
border-radius: 2px
border: 1px solid $color-subtle
padding: 3.5% 2.5%
//- Icons