diff --git a/examples/information_extraction/entity_relations.py b/examples/information_extraction/entity_relations.py
index b73dcbf3b..47b20057c 100644
--- a/examples/information_extraction/entity_relations.py
+++ b/examples/information_extraction/entity_relations.py
@@ -1,7 +1,6 @@
 #!/usr/bin/env python
 # coding: utf8
-"""
-A simple example of extracting relations between phrases and entities using
+"""A simple example of extracting relations between phrases and entities using
 spaCy's named entity recognizer and the dependency parse. Here, we extract
 money and currency values (entities labelled as MONEY) and then check the
 dependency tree to find the noun phrase they are referring to – for example:
diff --git a/examples/information_extraction/parse_subtrees.py b/examples/information_extraction/parse_subtrees.py
index 5963d014c..2a258b31d 100644
--- a/examples/information_extraction/parse_subtrees.py
+++ b/examples/information_extraction/parse_subtrees.py
@@ -1,8 +1,7 @@
 #!/usr/bin/env python
 # coding: utf8
-"""
-This example shows how to navigate the parse tree including subtrees attached
-to a word.
+"""This example shows how to navigate the parse tree including subtrees
+attached to a word.
 
 Based on issue #252:
 "In the documents and tutorials the main thing I haven't found is
diff --git a/examples/information_extraction/phrase_matcher.py b/examples/information_extraction/phrase_matcher.py
index 2dd2691b9..0b5bcdc7f 100644
--- a/examples/information_extraction/phrase_matcher.py
+++ b/examples/information_extraction/phrase_matcher.py
@@ -1,9 +1,10 @@
+#!/usr/bin/env python
+# coding: utf8
 """Match a large set of multi-word expressions in O(1) time.
 
 The idea is to associate each word in the vocabulary with a tag, noting whether
 they begin, end, or are inside at least one pattern. An additional tag is used
 for single-word patterns. Complete patterns are also stored in a hash set.
-
 When we process a document, we look up the words in the vocabulary, to
 associate the words with the tags. We then search for tag-sequences that
 correspond to valid candidates. Finally, we look up the candidates in the hash
diff --git a/examples/pipeline/multi_processing.py b/examples/pipeline/multi_processing.py
index 19b1c462a..99bb9c53f 100644
--- a/examples/pipeline/multi_processing.py
+++ b/examples/pipeline/multi_processing.py
@@ -1,5 +1,6 @@
-"""
-Example of multi-processing with Joblib. Here, we're exporting
+#!/usr/bin/env python
+# coding: utf8
+"""Example of multi-processing with Joblib. Here, we're exporting
 part-of-speech-tagged, true-cased, (very roughly) sentence-separated text,
 with each "sentence" on a newline, and spaces between tokens. Data is loaded
 from the IMDB movie reviews dataset and will be loaded automatically via Thinc's
diff --git a/examples/training/train_ner.py b/examples/training/train_ner.py
index 499807d23..e95cce4c9 100644
--- a/examples/training/train_ner.py
+++ b/examples/training/train_ner.py
@@ -1,7 +1,6 @@
 #!/usr/bin/env python
 # coding: utf8
-"""
-Example of training spaCy's named entity recognizer, starting off with an
+"""Example of training spaCy's named entity recognizer, starting off with an
 existing model or a blank model.
 
 For more details, see the documentation:
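Note on the two information_extraction parsing examples above: both hinge on the same pattern of finding a token or entity, then walking the dependency tree around it. A minimal hedged sketch of that pattern, assuming the v2 'en_core_web_sm' model is installed (the sample sentence is illustrative; the scripts themselves do more filtering):

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'Net income was $9.4 million compared to the prior year.')

# entity_relations.py: find MONEY entities and the word they attach to
for ent in doc.ents:
    if ent.label_ == 'MONEY':
        print(ent.text, '<--', ent.root.head.text)

# parse_subtrees.py: gather the whole subtree attached to a word
for token in doc:
    if token.dep_ == 'nsubj':
        print(' '.join(t.text for t in token.subtree))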
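The phrase_matcher.py docstring above describes its scheme only in prose. A pure-Python sketch of the same idea, illustrative rather than the script's actual implementation (tags: B = begins, I = inside, E = ends a pattern, U = single-word pattern):

patterns = {('new', 'york'), ('new', 'york', 'city'), ('london',)}

tags = {}  # word -> set of position tags
for pattern in patterns:
    if len(pattern) == 1:
        tags.setdefault(pattern[0], set()).add('U')
    else:
        tags.setdefault(pattern[0], set()).add('B')
        tags.setdefault(pattern[-1], set()).add('E')
        for word in pattern[1:-1]:
            tags.setdefault(word, set()).add('I')

def match(words):
    matches = []
    for i, word in enumerate(words):
        word_tags = tags.get(word, set())
        if 'U' in word_tags:
            matches.append((i, i + 1))
        if 'B' in word_tags:
            for j in range(i + 1, len(words)):
                next_tags = tags.get(words[j], set())
                if 'E' in next_tags and tuple(words[i:j + 1]) in patterns:
                    matches.append((i, j + 1))  # candidate confirmed in the hash set
                if not next_tags & {'I', 'E'}:
                    break  # tag sequence can no longer extend to a pattern
    return matches

print(match('i moved from london to new york city'.split()))
# [(3, 4), (5, 7), (5, 8)]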
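multi_processing.py builds on a simple Joblib pattern: split the texts into batches and hand each batch to a worker process. A hedged sketch with toy texts and a bare pipeline (the real script loads the IMDB data, a full model, and writes results to disk):

import spacy
from joblib import Parallel, delayed

def process_batch(texts):
    nlp = spacy.blank('en')  # each worker builds its own pipeline
    return [' '.join(tok.text for tok in nlp(text)) for text in texts]

texts = ['One document.', 'Another one.', 'A third text.', 'And a fourth.']
batches = [texts[i:i + 2] for i in range(0, len(texts), 2)]
results = Parallel(n_jobs=2)(delayed(process_batch)(batch) for batch in batches)
print(results)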
diff --git a/examples/training/train_new_entity_type.py b/examples/training/train_new_entity_type.py
index ec1e562c6..1c70f7c03 100644
--- a/examples/training/train_new_entity_type.py
+++ b/examples/training/train_new_entity_type.py
@@ -1,7 +1,6 @@
 #!/usr/bin/env python
 # coding: utf8
-"""
-Example of training an additional entity type
+"""Example of training an additional entity type
 
 This script shows how to add a new entity type to an existing pre-trained NER
 model. To keep the example short and simple, only four sentences are provided
diff --git a/examples/training/train_parser.py b/examples/training/train_parser.py
index a23d73ec7..e321fdb1e 100644
--- a/examples/training/train_parser.py
+++ b/examples/training/train_parser.py
@@ -1,10 +1,7 @@
 #!/usr/bin/env python
 # coding: utf8
-"""
-Example of training spaCy dependency parser, starting off with an existing model
-or a blank model.
-
-For more details, see the documentation:
+"""Example of training spaCy dependency parser, starting off with an existing
+model or a blank model. For more details, see the documentation:
 * Training: https://alpha.spacy.io/usage/training
 * Dependency Parse: https://alpha.spacy.io/usage/linguistic-features#dependency-parse
diff --git a/examples/training/train_tagger.py b/examples/training/train_tagger.py
index c6fc1de88..7508c2e66 100644
--- a/examples/training/train_tagger.py
+++ b/examples/training/train_tagger.py
@@ -3,9 +3,8 @@
 """
 A simple example for training a part-of-speech tagger with a custom tag map.
 To allow us to update the tag map with our custom one, this example starts off
-with a blank Language class and modifies its defaults.
-
-For more details, see the documentation:
+with a blank Language class and modifies its defaults. For more details, see
+the documentation:
 * Training: https://alpha.spacy.io/usage/training
 * POS Tagging: https://alpha.spacy.io/usage/linguistic-features#pos-tagging
diff --git a/examples/training/train_textcat.py b/examples/training/train_textcat.py
index 1f9cd29aa..fc9610a66 100644
--- a/examples/training/train_textcat.py
+++ b/examples/training/train_textcat.py
@@ -3,9 +3,8 @@
 """Train a multi-label convolutional neural network text classifier on the
 IMDB dataset, using the TextCategorizer component. The dataset will be loaded
 automatically via Thinc's built-in dataset loader. The model is added to
-spacy.pipeline, and predictions are available via `doc.cats`.
-
-For more details, see the documentation:
+spacy.pipeline, and predictions are available via `doc.cats`. For more details,
+see the documentation:
 * Training: https://alpha.spacy.io/usage/training
 * Text classification: https://alpha.spacy.io/usage/text-classification
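The five train_*.py examples above all share the same v2-style update loop. A minimal hedged sketch with toy NER data; the real scripts add plac command-line arguments, support for starting from an existing model, and saving to an output directory:

import random
import spacy

TRAIN_DATA = [
    (u'Who is Shaka Khan?', {'entities': [(7, 17, 'PERSON')]}),
    (u'I like London and Berlin.', {'entities': [(7, 13, 'LOC'), (18, 24, 'LOC')]}),
]

nlp = spacy.blank('en')       # blank model; spacy.load(...) for an existing one
ner = nlp.create_pipe('ner')
nlp.add_pipe(ner)
for _, annotations in TRAIN_DATA:
    for start, end, label in annotations['entities']:
        ner.add_label(label)  # every label must be known before training

optimizer = nlp.begin_training()
for itn in range(20):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for text, annotations in TRAIN_DATA:
        nlp.update([text], [annotations], sgd=optimizer, losses=losses)
    print(itn, losses)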
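For the parser, tagger and textcat variants, only the annotation dicts passed to nlp.update() change. Toy examples of the three formats; the values are illustrative, and the TAG_MAP mapping custom tags to Universal Dependencies parts of speech is the mechanism the train_tagger docstring refers to:

TAG_MAP = {'N': {'pos': 'NOUN'}, 'V': {'pos': 'VERB'}, 'J': {'pos': 'ADJ'}}

parser_example = (u'I like cats', {
    'heads': [1, 1, 1],                 # index of each token's head
    'deps': ['nsubj', 'ROOT', 'dobj'],  # dependency labels
})
tagger_example = (u'I like green eggs', {
    'tags': ['N', 'V', 'J', 'N'],       # custom coarse-grained tags
})
textcat_example = (u'This movie was terrible', {
    'cats': {'POSITIVE': 0.0},          # label -> score between 0 and 1
})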
diff --git a/examples/vectors_fast_text.py b/examples/vectors_fast_text.py
index 159250098..d14f6724f 100644
--- a/examples/vectors_fast_text.py
+++ b/examples/vectors_fast_text.py
@@ -13,8 +13,7 @@ import
 from spacy.language import Language
 
 @plac.annotations(
     vectors_loc=("Path to vectors", "positional", None, str))
 def main(vectors_loc):
-    nlp = Language()
-
+    nlp = Language()  # start off with a blank Language class
     with open(vectors_loc, 'rb') as file_:
         header = file_.readline()
         nr_row, nr_dim = header.split()
@@ -24,9 +23,11 @@ def main(vectors_loc):
             pieces = line.split()
             word = pieces[0]
             vector = numpy.asarray([float(v) for v in pieces[1:]], dtype='f')
-            nlp.vocab.set_vector(word, vector)
-    doc = nlp(u'class colspan')
-    print(doc[0].similarity(doc[1]))
+            nlp.vocab.set_vector(word, vector)  # add the vectors to the vocab
+    # test the vectors and similarity
+    text = 'class colspan'
+    doc = nlp(text)
+    print(text, doc[0].similarity(doc[1]))
 
 
 if __name__ == '__main__':
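A self-contained toy version of the modified example above, with hypothetical 3-dimensional vectors instead of a real fastText file; it assumes spaCy v2's Vocab.set_vector can grow the vectors table on the fly:

import numpy
from spacy.language import Language

nlp = Language()  # blank Language class, as in the example
nlp.vocab.set_vector(u'class', numpy.asarray([1.0, 0.0, 1.0], dtype='f'))
nlp.vocab.set_vector(u'colspan', numpy.asarray([1.0, 0.5, 1.0], dtype='f'))

doc = nlp(u'class colspan')
print(doc[0].similarity(doc[1]))  # cosine similarity of the two token vectors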