A decomposable attention model for Natural Language Inference
by Matthew Honnibal, @honnibal
Updated for spaCy 2.0+ and Keras 2.2.2+ by John Stewart, @free-variation
This directory contains an implementation of the entailment prediction model described by Parikh et al. (2016). The model is notable for its competitive performance with very few parameters.
The model is implemented using Keras and spaCy.
Keras is used to build and train the network. spaCy is used to load the GloVe vectors, perform the feature extraction, and help you apply the model at run-time. The following demo code shows how the entailment model can be used at run-time, once the hook is installed to customise the .similarity() method of spaCy's Doc and Span objects:
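import spacy
from spacy_hook import KerasSimilarityShim  # defined in this directory's spacy_hook.py

def demo(shape):
    # Load the vectors package and install the trained model as the similarity hook.
    nlp = spacy.load('en_vectors_web_lg')
    nlp.add_pipe(KerasSimilarityShim.load(nlp.path / 'similarity', nlp, shape[0]))
    doc1 = nlp(u'The king of France is bald.')
    doc2 = nlp(u'France has no king.')
    print("Sentence 1:", doc1)
    print("Sentence 2:", doc2)
    # With the hook installed, similarity() returns a (label, confidence) pair.
    entailment_type, confidence = doc1.similarity(doc2)
    print("Entailment type:", entailment_type, "(Confidence:", confidence, ")")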
This gives the output Entailment type: contradiction (Confidence: 0.60604566), showing that the system has definite opinions about Bertrand Russell's famous conundrum!
I'm working on a blog post to explain Parikh et al.'s model in more detail. A notebook is available that briefly explains this implementation. I think it is a very interesting example of the attention mechanism, which I didn't understand very well before working through this paper. There are lots of ways to extend the model.
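To make the attention step a little more concrete, here is a minimal pure-Python sketch of the soft alignment at the heart of Parikh et al.'s model: each word vector in one sentence is paired with a softmax-weighted average of the word vectors in the other. It is a simplification under stated assumptions, not the Keras code in this directory. The learned feed-forward projection that the paper applies before the dot product is replaced by the identity, and the names `softmax` and `soft_align` are illustrative.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_align(a, b):
    """For each vector in a, return (weights over b, weighted average of b)."""
    aligned = []
    for a_i in a:
        # Unnormalised attention score: dot product with every vector in b.
        # Assumption: the paper's learned projection is the identity here.
        scores = [sum(x * y for x, y in zip(a_i, b_j)) for b_j in b]
        weights = softmax(scores)
        # Soft-aligned summary of b from the point of view of a_i.
        summary = [
            sum(w * b_j[k] for w, b_j in zip(weights, b))
            for k in range(len(b[0]))
        ]
        aligned.append((weights, summary))
    return aligned


# Toy 2-dimensional "word vectors" for a two-word and a three-word sentence.
premise = [[1.0, 0.0], [0.0, 1.0]]
hypothesis = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
alignment = soft_align(premise, hypothesis)
```

In the paper, the same alignment is also computed in the other direction, each word is then compared with its aligned summary, and the comparison vectors are summed before classification.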
What's where
File | Description |
---|---|
__main__.py | The script that will be executed. Defines the CLI, the data reading, etc. (all the boring stuff). |
spacy_hook.py | Provides a class KerasSimilarityShim that lets you use an arbitrary function to customize spaCy's doc.similarity() method. Instead of the default average-of-vectors algorithm, when you call doc1.similarity(doc2), you'll get the result of your_model(doc1, doc2). |
keras_decomposable_attention.py | Defines the neural network model. |
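
The hook idea can be sketched without spaCy or Keras installed. The snippet below is a minimal stand-in, not the code in spacy_hook.py: `ToyDoc`, `install_similarity_hook` and `toy_model` are hypothetical names. The only spaCy behaviour it imitates is that `Doc.similarity()` consults a `user_hooks['similarity']` entry before falling back to its default vector comparison.

```python
import math


class ToyDoc:
    """Hypothetical stand-in for a spaCy Doc: a vector plus a hook registry."""

    def __init__(self, text, vector):
        self.text = text
        self.vector = vector
        self.user_hooks = {}

    def similarity(self, other):
        # If a hook is installed, delegate to it instead of the default.
        hook = self.user_hooks.get("similarity")
        if hook is not None:
            return hook(self, other)
        # Default: cosine similarity of the two document vectors.
        dot = sum(x * y for x, y in zip(self.vector, other.vector))
        norm_self = math.sqrt(sum(x * x for x in self.vector))
        norm_other = math.sqrt(sum(y * y for y in other.vector))
        norm = norm_self * norm_other
        return dot / norm if norm else 0.0


def install_similarity_hook(doc, model):
    """Route doc.similarity(other) to model(doc, other)."""
    doc.user_hooks["similarity"] = model
    return doc


def toy_model(doc1, doc2):
    # Hypothetical "model": returns a (label, confidence) pair like the demo.
    return ("contradiction", 0.6)


doc1 = ToyDoc("The king of France is bald.", [1.0, 0.0])
doc2 = ToyDoc("France has no king.", [1.0, 0.0])
default_score = doc1.similarity(doc2)  # cosine of two identical vectors
install_similarity_hook(doc1, toy_model)
hooked_result = doc1.similarity(doc2)  # now whatever the model returns
```

Because the hook can return anything, the caller has to know what the installed model produces: here a (label, confidence) pair rather than a single float.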
Setting up
First, install Keras, spaCy and the spaCy English models (about 1GB of data):
pip install keras
pip install spacy
python -m spacy download en_vectors_web_lg
You'll also want to get Keras working on your GPU, and you will need a backend, such as TensorFlow or Theano. This will depend on your setup, so you're mostly on your own for this step. If you're using AWS, try the NVIDIA AMI. It made things pretty easy.
Once you've installed the dependencies, you can run a small preliminary test of the Keras model:
py.test keras_parikh_entailment/keras_decomposable_attention.py
This compiles the model and fits it with some dummy data. You should see that both tests passed.
Finally, download the Stanford Natural Language Inference corpus.
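For orientation, the SNLI distribution includes JSON-lines files with one example per line. The sketch below shows one way such a file could be read; it is not taken from __main__.py. It assumes the `sentence1`, `sentence2` and `gold_label` fields of the published corpus. The helper name `read_snli_lines`, the label numbering, and the choice to skip examples labelled `-` (no annotator consensus) are illustrative.

```python
import json

# Assumed label set; the integer ids are an arbitrary illustrative choice.
LABELS = {"entailment": 0, "contradiction": 1, "neutral": 2}


def read_snli_lines(lines):
    """Parse an iterable of JSON lines into premises, hypotheses and label ids."""
    premises, hypotheses, labels = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)
        label = example["gold_label"]
        if label not in LABELS:
            # Skips "-" and anything else outside the three classes.
            continue
        premises.append(example["sentence1"])
        hypotheses.append(example["sentence2"])
        labels.append(LABELS[label])
    return premises, hypotheses, labels


# Two made-up lines in the assumed format; the second has no consensus label.
sample = [
    '{"gold_label": "contradiction", "sentence1": "A man sleeps.", '
    '"sentence2": "A man is running."}',
    '{"gold_label": "-", "sentence1": "A dog barks.", '
    '"sentence2": "An animal makes noise."}',
]
premises, hypotheses, labels = read_snli_lines(sample)
```

With a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.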
Running the example
You can run the keras_parikh_entailment/ directory as a script, which executes the file keras_parikh_entailment/__main__.py. If you run the script without arguments, the usage is shown. Running it with -h explains the command line arguments.
The first thing you'll want to do is train the model:
python keras_parikh_entailment/ train -t <path to SNLI train JSON> -s <path to SNLI dev JSON>
Training takes about 300 epochs for full accuracy, and I haven't rerun the full experiment since refactoring things to publish this example — please let me know if I've broken something. You should get to at least 85% on the development data even after 10-15 epochs.
The other two modes demonstrate run-time usage. I never like relying on the accuracy printed by .fit() methods. I never really feel confident until I've run a new process that loads the model and starts making predictions, without access to the gold labels. I've therefore included an evaluate mode.
python keras_parikh_entailment/ evaluate -s <path to SNLI train JSON>
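The point of a separate evaluation pass is that the predictor only ever sees the two texts, and the gold label is compared afterwards. A minimal sketch of that scoring loop, assuming examples arrive as (text1, text2, gold) triples; `accuracy`, `toy_predict` and the sample data are hypothetical and stand in for the loaded model and the SNLI file:

```python
def accuracy(predict, examples):
    """Fraction of examples where predict(text1, text2) matches the gold label."""
    correct = 0
    total = 0
    for text1, text2, gold in examples:
        # The predictor is given the two texts only, never the gold label.
        guess = predict(text1, text2)
        correct += int(guess == gold)
        total += 1
    return correct / total if total else 0.0


def toy_predict(text1, text2):
    # Hypothetical stand-in for the loaded model: always answers "neutral".
    return "neutral"


held_out = [
    ("A man sleeps.", "A man is running.", "contradiction"),
    ("A dog barks.", "An animal makes noise.", "entailment"),
    ("A woman reads.", "A woman is at home.", "neutral"),
    ("Two kids play.", "Children are outside.", "neutral"),
]
score = accuracy(toy_predict, held_out)
```

A constant predictor like this one is also a useful sanity baseline: a trained model that scores no better than it has probably not loaded correctly.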
Finally, there's also a little demo, which mostly exists to show you how run-time usage will eventually look.
python keras_parikh_entailment/ demo
Getting updates
We should have the blog post explaining the model ready before the end of the week. To get notified when it's published, you can either follow me on Twitter or subscribe to our mailing list.